Latency, low latency, and ultra-low latency are increasingly hot topics in the streaming industry. As online video becomes a tool for more use cases, industry professionals must understand the latency requirements for different use cases and the technology that makes different latencies possible.
In this article, we will discuss the ins and outs of latency for video streaming. We will define latency before we dive into what causes it and why it matters in different streaming use cases. We’ll also shine a spotlight on the connection between various protocols and latency.
What is latency?
Latency is a metric in streaming that accounts for the time it takes for the stream’s data to transfer from the source to the end user. Transmitting the data causes a brief delay, impacting the user experience. Latency is not necessarily bad, but it has different implications in different use cases, which we will discuss shortly.
The most common type is glass-to-glass latency, also called end-to-end latency: the time between the moment an action happens in front of the first glass (the camera lens) and the moment a viewer sees that action on their screen (the other glass). This definition of latency is handy for streaming live and interactive events.
A second type is protocol latency: the delay between the output of the encoder and the actual playback. This latency is interesting for low-latency applications where we do not want to compromise the encoder’s quality.
We also have startup and channel change latency. These describe the time it takes to start a video or to change the channel once the command has been given. This is an essential parameter for streaming video applications that want to provide a lean-back TV experience, where viewers are used to changing channels instantly with a simple button push.
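To make startup and channel change latency measurable, here is a minimal sketch for a plain HTML5 `<video>` element; the stream URL is a placeholder, and the standard 'playing' event signals that frames are actually rendering:

```ts
// Minimal sketch: timing startup (or channel change) latency for a plain
// HTML5 <video> element. 'playing' fires once frames are actually rendering.
const video = document.querySelector('video')!;

function loadAndMeasure(src: string): void {
  const requested = performance.now();
  video.addEventListener('playing', () => {
    const ms = performance.now() - requested;
    console.log(`startup/zap latency: ${ms.toFixed(0)} ms`);
  }, { once: true });
  video.src = src;
  void video.play();
}

// A channel change is just another loadAndMeasure() call with a new source.
loadAndMeasure('https://example.com/channel-1/stream.mp4'); // hypothetical URL
```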
Relevant definitions of latency in different use cases
Not every latency is equally relevant for every use case. Here’s a breakdown of the relevance of different types of latency to different kinds of streaming.
| USE CASE | STARTUP TIME | CHANNEL CHANGE TIME | PROTOCOL LATENCY | GLASS-TO-GLASS LATENCY | DESCRIPTION |
| --- | --- | --- | --- | --- | --- |
| Broadcast | +++ | +++ | +++ | | Protocol latency is important to ensure simultaneous arrival on the main screen and OTT devices. Startup and channel change times are crucial to ensure a lean-back TV experience and that people stick to the service. |
| VOD | +++ | | | | Playback needs to start rapidly. VOD user interfaces are designed not to need fast channel changes. Protocol and glass-to-glass latency are not significant. |
| Live Events | ++ | (++) | Implicit | +++ | Glass-to-glass latency is crucial. Startup latency is essential for every video service. Channel change is vital for large events with multiple stages or multiple cameras. |
| Video Call | ++ | (++) | Implicit | +++ | Glass-to-glass latency is the primary criterion for video calls (and even more so for audio). |
| Interactive Events | ++ | (++) | Implicit | +++ | Glass-to-glass latency is crucial for interactive events (though mostly slightly lower than video calls). Channel change is important for setups with multiple concurrent interactive events. |
Table 1 – The Importance of Latency in Different Use Cases
Live broadcast
Broadcast traditionally focuses on the quality of the experience for a large audience. This calls for longer encoding times to ensure the best possible visual quality for a given bandwidth budget, combined with fast startup and channel change times and good scalability.
However, latency is becoming increasingly important, driven by the desire to have the playback on online devices coincide with the existing broadcast distribution.
The industry is gradually moving to shorter segment sizes, LL-DASH, and LL-HLS. However, this still does not deliver an excellent solution for fast channel changes, as a trade-off must be made between latency and startup/channel change times.
Live event streaming
Live event streaming critically depends on low glass-to-glass latency, so traditional HLS and DASH protocols are not great options. Therefore, live event organizers are using WebRTC. This works well for small audiences, but the cost of scaling for WebRTC is high.
Consequently, live event organizers targeting mass audiences turn to LL-DASH and LL-HLS when they can afford the increased latency. HESP can also be helpful for this use case: it yields slightly higher latencies than WebRTC, but it has the same scaling characteristics as any HTTP-based approach, outperforming WebRTC on that front.
Bi-directional video conferencing, on the other hand, uses WebRTC, since real-time latency is non-negotiable for that use case, regardless of cost.
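For context, the real-time path WebRTC sets up looks roughly like the sketch below. The `signal` callback is a stand-in for your own signaling channel (for example, a WebSocket), which WebRTC deliberately leaves to the application:

```ts
// Sketch of the real-time path WebRTC provides for conferencing. The `signal`
// callback stands in for an application-provided signaling channel.
async function startCall(signal: (msg: object) => void): Promise<RTCPeerConnection> {
  const pc = new RTCPeerConnection({
    iceServers: [{ urls: 'stun:stun.l.google.com:19302' }],
  });

  // Capture camera and microphone and send them to the remote peer.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  stream.getTracks().forEach((track) => pc.addTrack(track, stream));

  // Render the remote peer's media as soon as tracks arrive.
  pc.ontrack = (event) => {
    const remote = document.querySelector('video#remote') as HTMLVideoElement;
    remote.srcObject = event.streams[0];
  };

  // Hand ICE candidates and the offer to the signaling channel.
  pc.onicecandidate = (event) => {
    if (event.candidate) signal({ candidate: event.candidate });
  };
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signal({ offer });

  return pc;
}
```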
Video-on-demand (VOD)
VOD traditionally focuses on the highest possible quality for the lowest number of bits. Fast startup is the only latency metric that impacts the on-demand user experience.
In addition to adopting the right streaming video approach, user interfaces use latency-hiding techniques to give viewers the impression of instantaneous startup times. This includes prefetching the video or starting at lower qualities so that the video is transferred faster and starts earlier.
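To illustrate, here is a minimal sketch using hls.js, one concrete open-source web player; the option names are hls.js’s own, and the stream URL is a placeholder:

```ts
import Hls from 'hls.js';

// Sketch of startup-latency hiding: start on the lowest-bitrate rendition so
// fewer bytes are needed before the first frame, then let ABR ramp up.
const video = document.querySelector('video')!;
if (Hls.isSupported()) {
  const hls = new Hls({
    startLevel: 0,       // begin on the smallest rendition for a faster first frame
    maxBufferLength: 30, // keep buffering ahead once playback is running
  });
  hls.loadSource('https://example.com/vod/master.m3u8'); // hypothetical URL
  hls.attachMedia(video);
  video.addEventListener('canplay', () => void video.play(), { once: true });
}
```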
Where is latency created in the video distribution chain?
Latency is introduced at many different steps in the video distribution chain. Let’s examine the technical aspects of video distribution that cause latency.
Encoding
Encoding is the first factor contributing to latency in the streaming process. It takes time and directly impacts glass-to-glass latency.
Encoding requires a use-case-dependent trade-off between latency, quality, and bitrate. Higher quality and a lower bitrate are typically preferred unless the glass-to-glass latency is crucial.
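To make that trade-off tangible, here is a hedged sketch that drives ffmpeg from Node.js. The input and output addresses are placeholders; the x264 flags shown are the usual knobs that lower encoding delay at some cost in quality per bit:

```ts
import { spawn } from 'node:child_process';

// Hedged sketch: input/output addresses are placeholders; the flags are
// standard ffmpeg/x264 options that trade quality per bit for lower delay.
const args = [
  '-i', 'camera_feed.sdp',         // hypothetical live input
  '-c:v', 'libx264',
  '-preset', 'veryfast',           // less analysis: faster, slightly worse quality per bit
  '-tune', 'zerolatency',          // disables B-frames and internal lookahead buffering
  '-g', '30',                      // short GOP: frequent keyframes, faster join/zap, more bits
  '-b:v', '3000k',                 // the bitrate budget the trade-off is made against
  '-f', 'mpegts', 'udp://127.0.0.1:9000', // hypothetical output
];
spawn('ffmpeg', args, { stdio: 'inherit' });
```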
Content distribution
The second factor is the distribution network between the source and the playback device, which adds to the glass-to-glass, protocol, startup, and channel change latencies. CDNs mitigate this impact: they benefit from dedicated networks and reduce the overall load on the distribution network by caching as much as possible.
Video player
The player buffer adds to the latency as well. Players use buffers to cope with network variations and to avoid stalls.
As with encoding, the trade-off depends on the required glass-to-glass latency and the network conditions. The same applies to startup latency and channel change time: you must identify the minimum amount of video that needs to be buffered before playback starts.
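As one concrete example, hls.js surfaces that trade-off through configuration options like these (the values are illustrative, not recommendations):

```ts
import Hls from 'hls.js';

// Illustrative values only; these hls.js options expose the buffer/latency trade-off.
const hls = new Hls({
  lowLatencyMode: true,      // enable LL-HLS part loading
  liveSyncDuration: 2,       // target distance from the live edge, in seconds
  liveMaxLatencyDuration: 6, // if playback drifts past this, the player catches up
});
```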
Streaming protocol
The streaming protocol you use also significantly impacts the different types of latency. It defines how the video is divided into transferred packets, directly impacting the buffer depth.
The effects of different streaming protocols on latency
Each streaming protocol comes with its own glass-to-glass and protocol latency. Here’s a breakdown of the latencies associated with various protocols.
Using the traditional DASH and HLS protocols leads to large latencies. These latencies can be reduced by shortening the segments. However, the latency remains high because a segment is handled as an atomic piece of information. Segments are created, stored, and distributed as a whole.
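A quick back-of-the-envelope calculation shows why. The numbers below are typical defaults rather than measurements: classic HLS used roughly six-second segments, and players traditionally start about three segments behind the live edge:

```ts
// Back-of-the-envelope protocol latency for classic segment-based streaming.
// The numbers are typical defaults, not measurements.
const segmentDuration = 6;  // seconds per segment (a common HLS default)
const bufferedSegments = 3; // players traditionally start ~3 segments behind live
const protocolLatency = segmentDuration * bufferedSegments;
console.log(`~${protocolLatency} s protocol latency, before encoding and CDN delays`);
```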
LL-DASH and LL-HLS overcome this problem by allowing a segment to be transferred piece-wise. A segment does not need to be entirely available before the first chunks can be transferred to the client for playback, significantly improving the latency.
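To see what piece-wise transfer means on the player side, here is a minimal sketch that appends a segment to Media Source Extensions chunk by chunk as it arrives over the network; the URL and codec string are assumptions:

```ts
// Sketch: appending a CMAF segment to Media Source Extensions chunk by chunk
// as it arrives, instead of waiting for the whole file.
async function playChunked(video: HTMLVideoElement, url: string): Promise<void> {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);
  await new Promise((resolve) =>
    mediaSource.addEventListener('sourceopen', resolve, { once: true }),
  );

  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');
  const reader = (await fetch(url)).body!.getReader();

  for (;;) {
    const { done, value } = await reader.read();
    if (done || !value) break;
    sourceBuffer.appendBuffer(value); // append each chunk as soon as it is received
    await new Promise((resolve) =>
      sourceBuffer.addEventListener('updateend', resolve, { once: true }),
    );
  }
  mediaSource.endOfStream();
}
```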
For ultra-low latency, approaches are needed that allow for a continuous flow of images that are transferred as soon as they are available (rather than grouping them in chunks or segments). This can be done using WebRTC and HESP.
HESP uses Chunked Transfer Encoding over HTTP, making the images available to the player on a per-image basis. That ensures images are available at the client for playback almost immediately after they are generated.
Of course, this only gives one aspect of these protocols. The zapping time is also relevant. For DASH and HLS, this is a trade-off with the latency since playback can only start at segment boundaries. That implies that a player needs to choose between waiting for the most recent segment to start or starting playback of an already available segment. If latency is not critical, this allows for a speedy startup.
WebRTC allows for a shorter zapping time but is still bound to GOP size boundaries. HESP allows for ultra-low startup and channel change times without compromising on latency: it does not rely on segments as the basic unit of playback, so it can start at any image position. Plus, HESP uses range requests to tap into the stream of images as soon as they are created. Low latency and fast zapping are only part of the picture, though; scalability is essential, especially for video services reaching tens or hundreds of thousands of concurrent viewers.
HTTP-based protocols (DASH, LL-DASH, HLS, LL-HLS, HESP) have an edge over WebRTC here: HTTP-based approaches ensure the highest possible network reach, can tap into a wide range of efficient CDN solutions, and achieve scalability with ordinary file servers. WebRTC, on the other hand, relies on active video streaming between server and client, so an edge server supports fewer viewers than a regular CDN edge cache.
Real-life examples for understanding latency
To make this more concrete, let’s zoom in on a few examples of streaming setups to help you better understand the implications of different tech on latency.
Latency on OptiView Player (formerly THEOplayer)
OptiView Player reaches between one and three seconds of latency with LL-DASH and LL-HLS, depending on the player and stream configuration. A few years ago, THEO engaged with several customers worldwide, both for proofs of concept (PoCs) and real deployments, reaching latencies of around two seconds in real-life conditions.
Synamedia, Fastly, and THEO set up an end-to-end demonstrator with HESP, reaching a virtually unlimited number of viewers with sub-second protocol latency and zapping times well below 500 ms.
Since THEO’s acquisition by Dolby, the player is now capable of real-time streaming with WebRTC, opening doors to new use cases that require real-time latency.
Other major streaming players
Dolby is just one of many companies that use innovative technology to stream at different latencies. Here’s how some other major players in the streaming space are using different protocols to achieve various levels of latency:
- WebRTC is being used for video conference tools such as Google Meet.
- YouTube Live brings live content with a delay of several seconds.
- Wowza has a hybrid system for live events, using WebRTC for the limited number of participants for whom ultra-low latency is critical, and LL-HLS for the rest.
Final thoughts
Latency expectations vary from use case to use case, but luckily, different technologies are designed to accommodate a range of latency needs.
Since Dolby is compatible with various streaming protocols, our technology supports users with a wide variety of use cases. If you’re looking for a powerful video player to help you create stellar viewer experiences, we’ve got you covered. Contact us to learn how we can propel you toward your streaming goals.