Google Duo Uses Machine Learning To Improve Call Quality

People around the world are relying more on audio and video calls during the coronavirus pandemic. Network problems, however, often degrade the audio while people try to stay connected. To address this, Google recently announced WaveNetEQ, a new packet loss concealment (PLC) system intended to improve the audio quality of Google Duo calls.
As described on the Google AI Blog, WaveNetEQ is a generative model based on DeepMind’s WaveRNN technology and trained on a large amount of speech data. That training lets the model realistically continue short speech segments.
During an online call, the audio is split into short chunks called packets. On the way from sender to receiver, these packets often arrive out of order or at irregular intervals, which causes jitter, and some arrive too late or not at all. All of this degrades call quality. “99% of Google Duo calls need to deal with packet losses, excessive jitter or network delays. Of those calls, 20% lose more than 3% of the total audio duration due to network issues, and 10% of calls lose more than 8%,” reads the blog.
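To make the packet problem concrete, here is a minimal Python sketch of what a receiver has to do: put packets back in order and work out which ones never arrived. The `Packet` class, the function names, and the assumption that every packet carries the same amount of audio are illustrative only; this is not Duo's or WebRTC's actual code.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int        # sequence number assigned by the sender (hypothetical field)
    payload: bytes  # encoded audio for one frame


def reorder_and_find_gaps(received, first_seq, last_seq):
    """Return packets sorted by sequence number plus the sequence numbers that are missing."""
    by_seq = {p.seq: p for p in received}
    expected = range(first_seq, last_seq + 1)
    ordered = [by_seq[s] for s in expected if s in by_seq]
    missing = [s for s in expected if s not in by_seq]
    return ordered, missing


def lost_audio_fraction(missing, first_seq, last_seq):
    """Fraction of audio lost, assuming every packet holds the same duration of audio."""
    total = last_seq - first_seq + 1
    return len(missing) / total


# Packets 0..5 were sent; they arrive out of order and packet 3 never shows up.
received = [Packet(0, b"a"), Packet(2, b"c"), Packet(1, b"b"),
            Packet(5, b"f"), Packet(4, b"e")]
ordered, missing = reorder_and_find_gaps(received, 0, 5)
fraction = lost_audio_fraction(missing, 0, 5)
```

In this toy run one packet in six is missing, so roughly 17% of the audio would have to be concealed, far more than the 3% and 8% figures quoted above.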
WaveNetEQ addresses these issues by generating audio to stand in for packets that go missing during transmission, drawing on what the model learned from its speech training data. The model is fast enough to run on a phone while still delivering high audio quality. Duo is based on the WebRTC open source project. To hide the effects of packet loss, WebRTC’s NetEQ component uses signal processing methods that analyze the speech and produce a smooth continuation, an approach that works well for small packet losses.
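For a rough sense of what a signal-processing style of concealment looks like, the sketch below fills a gap by repeating the last good frame at decreasing volume. This is a deliberately simple stand-in chosen for illustration, not NetEQ's actual algorithm; the function name and the `decay` parameter are assumptions.

```python
def conceal_by_repetition(last_frame, n_missing, decay=0.5):
    """Fill n_missing lost frames by repeating the last good frame, quieter each time."""
    concealed = []
    gain = 1.0
    for _ in range(n_missing):
        gain *= decay  # attenuate further with every consecutive lost frame
        concealed.append([sample * gain for sample in last_frame])
    return concealed


# The last frame that arrived intact, as a few audio samples.
last_good = [0.2, -0.4, 0.8]
filled = conceal_by_repetition(last_good, n_missing=2)
```

A scheme like this can mask a very short gap, but over a longer one it only replays the same sound as it fades out, which is one way to see why simple methods are best suited to small packet losses.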

“To better manage packet loss, we replace the NetEQ PLC component with a modified version of WaveRNN, a recurrent neural network model for speech synthesis consisting of two parts, an autoregressive network and a conditioning network,” states the blog. The autoregressive network keeps the signal continuous by making each generated sample depend on the network’s previous outputs, while the conditioning network steers the autoregressive network so that the generated audio stays consistent with the speech that came before.
Google has been experimenting with WaveNetEQ for a while, and it is already available on Pixel 4 phones. It is now being rolled out to other phone models.
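The division of labor between the two parts can be sketched with a toy example. Below, a "conditioning" function summarizes recent audio into a single number, and an "autoregressive" loop produces each new sample from its own previous output plus that summary. The real WaveRNN uses neural networks for both roles; the linear formulas, function names, and weights `a` and `b` here are assumptions made purely for illustration.

```python
def conditioning_value(past_frames):
    """Toy stand-in for the conditioning network: the average level of recent audio."""
    samples = [s for frame in past_frames for s in frame]
    return sum(samples) / len(samples)


def generate(history, cond, n_samples, a=0.6, b=0.4):
    """Toy stand-in for the autoregressive network.

    Each new sample is computed from the previous output (continuity)
    and from the conditioning value (consistency with recent speech).
    """
    out = list(history)
    for _ in range(n_samples):
        out.append(a * out[-1] + b * cond)
    return out[len(history):]


past_frames = [[1.0, 1.0], [1.0, 1.0]]   # recently received audio
cond = conditioning_value(past_frames)   # slowly varying context
new_samples = generate(history=[0.0], cond=cond, n_samples=3)
```

Because each output feeds into the next, the generated samples move smoothly from the last known value toward the level suggested by the conditioning value instead of jumping abruptly.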

