How audio affects the QoE of audiovisual content

When a user watches streamed video, picture quality is not the only thing that matters. The sound, and its synchronization with the images, can also make a big difference between two players. It is always annoying when the images you are watching do not match the audio you are hearing. This can happen when the sound arrives either before or after the images.

A related problem is known as buffering. It occurs when a viewer experiences a delay that equals or exceeds 1% of the duration of the video played. It is also worth noting that viewers start to abandon a video if it takes more than 2 seconds to start up, with each additional 1-second delay resulting in a 5.8% increase in the abandonment rate, according to a study.
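These two figures can be turned into a quick back-of-the-envelope check. The sketch below is only illustrative: the function names are made up for this example, and the abandonment estimate assumes that the 5.8% increase scales linearly with every second beyond the 2-second mark, which simplifies the study's finding.

```python
def is_buffering_issue(delay_s: float, played_s: float) -> bool:
    """True when the delay equals or exceeds 1% of the video duration played."""
    # delay >= 1% of played duration, written without a fractional constant
    return delay_s * 100 >= played_s


def extra_abandonment_pct(startup_s: float) -> float:
    """Estimated increase in abandonment rate, in percent.

    Assumption: 5.8% per second of start-up delay beyond 2 seconds,
    scaled linearly (a simplification for illustration).
    """
    return max(0.0, startup_s - 2.0) * 5.8


# A 6-second delay during 10 minutes (600 s) of playback is exactly 1%.
print(is_buffering_issue(6, 600))            # True
# A 5-second start-up is 3 seconds over the 2-second mark.
print(round(extra_abandonment_pct(5.0), 1))  # 17.4
```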

This problem is closely related to the advent of digital television (DTV), a technology that has significantly changed how video is processed. A rebuffering event is the moment when video playback stops, without any user interaction, because too few video bytes have been downloaded into the buffer.
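To make the definition concrete, here is a minimal simulation of a playback buffer. It assumes a constant video bitrate, one throughput measurement per second, and a toy player that stalls whenever less than one second of video is buffered. The function name, parameters, and numbers are all hypothetical and not taken from any real player.

```python
def count_rebuffer_events(download_kbps, video_kbps, startup_buffer_s=2.0):
    """Count stalls in a toy player, stepping one second at a time.

    download_kbps: throughput measured for each second of wall-clock time.
    video_kbps: constant bitrate of the video being played.
    startup_buffer_s: seconds of video required before playback (re)starts.
    """
    buffered_s = 0.0   # seconds of video currently held in the buffer
    playing = False
    stalls = 0

    for rate in download_kbps:
        # One second of download adds rate / video_kbps seconds of video.
        buffered_s += rate / video_kbps

        if playing:
            if buffered_s >= 1.0:
                buffered_s -= 1.0   # play one second of video
            else:
                playing = False     # buffer ran dry: a rebuffering event
                stalls += 1
        elif buffered_s >= startup_buffer_s:
            playing = True          # enough data to start or resume

    return stalls


# Throughput drops to a quarter of the video bitrate for six seconds.
trace = [4000, 4000, 500, 500, 500, 500, 500, 500, 4000, 4000]
print(count_rebuffer_events(trace, video_kbps=2000))  # 1
```

The initial wait before playback starts is not counted as a stall here; only interruptions after playback has begun are.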

Nowadays, almost all video is compressed, scaled, broadcast or delivered over IP, and then decompressed, a chain that can increase the probability of buffering issues. The cause is not always the platform; it may be a problem with the user’s Internet connection. Even so, providers should be aware of when audio quality is degrading so that they can fix it and offer the best quality of service.

Sound is data

Just like images vary in quality and clarity, types of audio files differ in how large they are, how much information they contain, and what role they fill. While there are some exceptions, uncompressed files will contain the most information and therefore have the highest bitrate. Compressed lossy files generally have the least amount of information and therefore a lower bitrate.

Audio streaming works in a simple way: the streaming service sends data to the listener in small chunks, so that the player holds pre-buffered music a few seconds or minutes ahead of the point being played. When the user has a good internet connection, streaming offers an uninterrupted audio experience without saving files on the user’s own device.

Research published in the Journal of the Audio Engineering Society found that low-quality audio made people perceive the quality of a video overall as much lower.


Video bitrate is the amount of data transferred for your live stream each second. When you live stream, you usually do not stream only video; you also stream audio. Most of the time, 44.1 kHz is a good sample rate for digital audio. Changing the sample rate is less straightforward than changing the bitrate.

Video bitrate affects video quality in several ways. First, it largely determines the size of a video file. Second, a higher video bitrate generally results in higher video quality, and a low bitrate results in poor video quality. However, using an extremely high bitrate is just a waste of bandwidth.

In general, a higher bitrate allows higher image quality in the video output, but only when comparing the same video at the same resolution. Bitrates should be expected to go up whenever the resolution goes up, as more data is being processed. A high video bitrate may provide excellent quality, but it can also place a major strain on your hardware, which can result in stutters.

Audio bitrate defines the amount of data stored for each second of the sound file you are listening to. Every audio file has a “bitrate” associated with it, because every second of an audio recording contains a certain amount of data, or bits. For sound files this is measured in kilobits per second. For example, a 128 kbps (kilobits per second) file will have 128 kilobits stored for every second of audio.
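The arithmetic is easy to check in a few lines. This sketch assumes a constant bitrate and the decimal convention (1 kilobit = 1000 bits, 8 bits per byte); the function name is made up for illustration.

```python
def audio_size_bytes(bitrate_kbps: int, duration_s: int) -> int:
    """Size in bytes of constant-bitrate audio (1 kbit = 1000 bits)."""
    total_bits = bitrate_kbps * 1000 * duration_s
    return total_bits // 8  # 8 bits per byte


# A 3-minute (180 s) song at 128 kbps:
size = audio_size_bytes(128, 180)
print(size)              # 2880000 bytes
print(size / 1_000_000)  # 2.88 megabytes
```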

The more kilobits that are stored per second, the higher the sound quality of the file. For the average listener, the quality will be defined by the strength and depth of the low frequencies and by the crispness and clarity of the high frequencies. More kilobits means more data stored across the full frequency range.

A number of international bodies are currently working in the field of audio-video synchronization. The reason is that the problem, although decades old, has not been solved and is in fact getting worse. The introduction of digital broadcasting, high-definition broadcasting, set-top boxes that send audio and video to separate audio and video transducers, and displays with significant signal processing inside have all contributed to making it easy for huge errors to arise.

In conclusion

We already know that image quality, and avoiding its main distortions, is very important to guarantee the best Quality of Experience. But streamers must also make sure that audio quality is giving the best experience possible.

In the battle for bit rate on digital broadcast networks, audio constantly has to defend itself against video. Audio data reduction ratios are typically between 5:1 and 10:1, whilst video could be 100:1.
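To put those audio ratios into numbers, the sketch below starts from uncompressed stereo audio at the 44.1 kHz sample rate mentioned earlier and applies the 5:1 and 10:1 reduction ratios. The 16 bits per sample and 2 channels are assumptions (the usual CD-style values), and the function names are illustrative.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def reduced_bitrate_bps(source_bps: int, ratio: int) -> float:
    """Bitrate after applying a data reduction ratio of ratio:1."""
    return source_bps / ratio


# Assumed CD-style source: 44.1 kHz, 16 bits per sample, 2 channels.
pcm = pcm_bitrate_bps(44_100, 16, 2)
print(pcm)                           # 1411200
print(reduced_bitrate_bps(pcm, 5))   # 282240.0
print(reduced_bitrate_bps(pcm, 10))  # 141120.0
```

Under these assumptions, a 10:1 reduction brings the audio down to roughly 141 kbps, in the same range as the 128 kbps example given earlier.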

As the quality of the video experience improves, with higher spatial resolutions, higher frame rates, and the inclusion of the third dimension, the visual experience will become much more like “being there”. It will approach the illusion of reality that audio can already achieve. As a result, it seems inescapable that our expectations of the accuracy of audio-video synchronization can only increase, along with the technical complexity of maintaining it.
