Lyra, a new, high-quality and very low-bitrate speech codec from Google

Connecting to others online via voice and video calls is something that is increasingly a part of everyday life.

The real-time communication frameworks, like WebRTC, that make this possible depend on efficient compression techniques, codecs, to encode (or decode) signals for transmission or storage.

A vital part of media applications for decades, codecs allow bandwidth-hungry applications to efficiently transmit data, and have led to an expectation of high-quality communication anywhere at any time.

As such, a continuing challenge in developing codecs, both for video and audio, is to provide increasing quality, using less data, and to minimize latency for real-time communication.

Even though video might seem much more bandwidth hungry than audio, modern video codecs can reach lower bitrates than some high-quality speech codecs used today.

Combining low-bitrate video and speech codecs can deliver a high-quality video call experience even in low-bandwidth networks. Yet historically, the lower the bitrate for an audio codec, the less intelligible and more robotic the voice signal becomes.

Furthermore, while some people have access to a consistent high-quality, high-speed network, this level of connectivity isn’t universal, and even those in well connected areas at times experience poor quality, low bandwidth, and congested network connections.

To solve this problem, we have created Lyra, a high-quality, very low-bitrate speech codec that makes voice communication available even on the slowest networks.

To do this, we’ve applied traditional codec techniques while leveraging advances in machine learning (ML) with models trained on thousands of hours of data to create a novel method for compressing and transmitting voice signals.