
How to solve the contradiction between delay and stutter in one-to-one live broadcast technology?


Shulou(Shulou.com)06/02 Report--

The LVB push (upstream) side operates in many different network environments: wired, wireless, 3G, 4G, satellite links, and so on. How do we achieve a smooth live broadcast under all of these conditions? This is where two strategies come in: variable bit rate and frame dropping, which together keep the push real-time and the data usable. We have previously introduced how bit-rate switching and selective frame dropping are handled during a live broadcast; below is some background knowledge and the principles behind the solution.

Basic knowledge: I frame, B frame, P frame

An I frame is a keyframe. You can think of it as a complete, self-contained picture: decoding it requires only the data in this frame, because it contains the full image.

A P frame encodes the difference between this frame and the previous keyframe (or P frame). To decode it, the previously cached picture is combined with the differences carried in this frame to produce the final picture. (In other words, it is a differential frame: a P frame does not carry complete picture data, only the data that differs from the previous frame.)

A B frame is a bi-directional differential frame. It records the differences between this frame and both the preceding and following frames (this is more complex; there are four cases). In other words, to decode a B frame, the decoder needs not only the previously cached picture but also the following picture, and it produces the final picture by combining the preceding and following pictures with the data in this frame.

B frames achieve a high compression ratio, but they cost more CPU to encode and decode and can increase live-broadcast latency, so B frames are generally not used on mobile devices.
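To make the decoding dependencies concrete, here is a minimal sketch in Python. It is purely illustrative; the frame and function names are assumptions, not any codec's real API.

```python
# Illustrative sketch of which frames each frame type needs before it can be decoded.
from dataclasses import dataclass
from enum import Enum


class FrameType(Enum):
    I = "I"  # keyframe: self-contained full picture
    P = "P"  # forward-predicted: needs the previous I or P frame
    B = "B"  # bi-directional: needs both the previous and the next reference frame


@dataclass
class Frame:
    frame_type: FrameType
    pts: int  # presentation timestamp


def decode_dependencies(seq: list[Frame], idx: int) -> list[int]:
    """Return indices of the frames that frame `idx` depends on for decoding.
    (Assumes the sequence actually contains the needed reference frames.)"""
    frame = seq[idx]
    if frame.frame_type is FrameType.I:
        return []  # decodes on its own
    if frame.frame_type is FrameType.P:
        # the last I or P frame before it
        for j in range(idx - 1, -1, -1):
            if seq[j].frame_type in (FrameType.I, FrameType.P):
                return [j]
        return []
    # B frame: previous reference plus the next reference frame
    prev_ref = next(j for j in range(idx - 1, -1, -1)
                    if seq[j].frame_type in (FrameType.I, FrameType.P))
    next_ref = next(j for j in range(idx + 1, len(seq))
                    if seq[j].frame_type in (FrameType.I, FrameType.P))
    return [prev_ref, next_ref]
```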

Keyframe caching strategy

A typical video frame sequence is IBBPBBPBBP.

For live broadcasting, B frames are usually omitted from encoding in order to reduce latency. P frames and B frames depend, directly or indirectly, on an I frame, so to decode and play a video frame sequence the player must first decode the I frame before any subsequent B and P frames can be decoded. How the server caches keyframes therefore has a significant impact on live-broadcast latency and other aspects of the experience.

A better strategy is for the server to detect the keyframe interval automatically, cache the frame sequence according to business requirements, and ensure that at least two keyframes are kept in the cache, so that low latency, stutter resistance, intelligent packet dropping, and other needs can all be met.
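As an illustration of this caching strategy, the sketch below keeps frames grouped by keyframe (GOP) and never evicts below a minimum of two cached groups. The class and method names are hypothetical, not the actual server implementation.

```python
# Minimal sketch of a keyframe-aware cache that always retains at least two GOPs.
from collections import deque


class GopCache:
    def __init__(self, min_gops: int = 2, max_gops: int = 4):
        self.min_gops = min_gops
        self.max_gops = max_gops
        self.gops = deque()  # each element is a list of frames starting with an I frame

    def push(self, frame):
        if getattr(frame, "is_keyframe", False) or not self.gops:
            self.gops.append([frame])  # start a new GOP on every keyframe (or when empty)
        else:
            self.gops[-1].append(frame)
        # evict the oldest GOPs, but never drop below the configured minimum
        while len(self.gops) > self.max_gops and len(self.gops) > self.min_gops:
            self.gops.popleft()

    def frames_for_new_viewer(self):
        """Send everything from the oldest cached keyframe onward."""
        return [f for gop in self.gops for f in gop]
```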

Compromise between delay and stutter

Live-broadcast delay and stutter are the two indicators watched most closely when analyzing the quality of an LVB service. Interactive live-broadcast (ILVB) scenarios are very sensitive to delay, while news and sports broadcasts care more about playback smoothness.

In theory, however, these two metrics pull against each other: lower latency requires shorter buffers on both the server side and the player side, so abnormal network jitter easily causes stutters. If the business can accept higher latency, both the server and the player can keep longer buffers to absorb network jitter and provide a smoother live-broadcast experience.

Of course, for users with very good network conditions both can be guaranteed at the same time; the discussion here is mainly about how to balance delay and stutter for users whose network conditions are not so good.

There are usually two techniques to balance and optimize these two indicators.

First, the server provides a flexible configuration strategy.

For connections that are more sensitive to delay, the server keeps a small buffer queue per connection while still guaranteeing keyframes; for broadcasts where smoothness matters more, it increases the buffer queue length appropriately to ensure smooth playback.

Second, the server intelligently monitors the network condition of every connection.

When the network is in good condition, the server shrinks the connection's buffer queue to reduce delay; when the network condition is poor, and especially when jitter is obvious, the server lengthens the buffer queue for that connection and prioritizes playback smoothness.
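A minimal sketch of how such an adaptive policy might map measured jitter to a per-connection buffer length is shown below. The thresholds, buffer bounds, and function name are illustrative assumptions, not values from the article.

```python
# Illustrative mapping from observed jitter to a target buffer length:
# small buffer (low delay) on a stable network, larger buffer (smooth playback) when it jitters.
def target_buffer_ms(jitter_ms: float,
                     low_jitter_ms: float = 20.0,
                     high_jitter_ms: float = 100.0,
                     min_buffer_ms: int = 500,
                     max_buffer_ms: int = 3000) -> int:
    if jitter_ms <= low_jitter_ms:
        return min_buffer_ms
    if jitter_ms >= high_jitter_ms:
        return max_buffer_ms
    # interpolate linearly between the two extremes
    ratio = (jitter_ms - low_jitter_ms) / (high_jitter_ms - low_jitter_ms)
    return int(min_buffer_ms + ratio * (max_buffer_ms - min_buffer_ms))
```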

Packet loss strategy

When do we need to drop packets?

For a connection with a good network and relatively small delay, the packet-dropping strategy never needs to kick in. For users with a poor connection, however, slower download speeds or larger jitter cause the user's delay to grow higher and higher.

Another situation: if the keyframe interval of the live stream is relatively long, and the first packet sent is guaranteed to be a keyframe, the viewer's delay may reach the length of an entire keyframe sequence. In both cases the packet-dropping policy needs to be enabled to adjust playback delay.

With regard to packet loss, there are two issues that need to be addressed:

The first is to correctly judge when packets need to be dropped.

The second is how to drop packets so that the impact on the viewer's playback experience is minimized. A good approach is for the back-end to periodically monitor the buffer queue length of every connection, so that queue length forms a discrete function of time; the back-end then analyzes this discrete function with a self-developed algorithm to decide whether packets need to be dropped.
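The sketch below illustrates this monitoring idea with a simple queue-growth test standing in for the self-developed algorithm mentioned above; the class name, sampling window, and thresholds are assumptions for illustration only.

```python
# Sample each connection's buffer queue length periodically and trigger dropping
# when the queue keeps growing (i.e. the viewer's delay is steadily increasing).
from collections import deque


class QueueMonitor:
    def __init__(self, window: int = 10, max_queue_ms: int = 3000):
        self.samples = deque(maxlen=window)  # (timestamp_s, queue_length_ms)
        self.max_queue_ms = max_queue_ms

    def record(self, timestamp_s: float, queue_length_ms: int):
        self.samples.append((timestamp_s, queue_length_ms))

    def should_drop(self) -> bool:
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history yet
        first_t, first_q = self.samples[0]
        last_t, last_q = self.samples[-1]
        growing = (last_q - first_q) / max(last_t - first_t, 1e-6) > 0  # queue trend
        too_long = last_q > self.max_queue_ms
        return too_long or (growing and last_q > self.max_queue_ms // 2)
```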

The naive frame-dropping strategy is to discard an entire video frame sequence at once. That sounds simple, but it has a large impact on the viewing experience. Instead, the back-end should drop frames gradually, discarding the last one or two frames of each video frame sequence, so that the change is barely perceptible to the user and the delay is reduced smoothly.
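A minimal sketch of this gradual strategy follows, assuming frames carry an is_keyframe flag and are already grouped by GOP (both assumptions made for illustration).

```python
# Trim the last one or two non-key frames from each cached GOP instead of
# discarding a whole GOP, so the delay shrinks smoothly and viewers barely notice.
def trim_gops(gops, frames_per_gop: int = 2):
    """Drop up to `frames_per_gop` trailing non-keyframes from every GOP."""
    trimmed = []
    for gop in gops:
        keep = list(gop)
        dropped = 0
        # walk backwards, never touching the leading keyframe
        while dropped < frames_per_gop and len(keep) > 1 and not keep[-1].is_keyframe:
            keep.pop()
            dropped += 1
        trimmed.append(keep)
    return trimmed
```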

These are the detailed principles of content caching and transmission strategy optimization.

SUNING Video Cloud, a subsidiary of Suning, has served more than 2,000 customers. Drawing on ten years of PPTV media technology and service experience, SUNING Video Cloud is a one-stop SaaS platform focused on video, built by combining streaming media technology, P2P, CDN distribution, mass storage, security strategy, and more. It integrates video LVB, cloud VOD, cloud upload, cloud transcoding, cloud storage, cloud statistics, and other functions, and supports customers' business needs for various video scenarios across multiple platforms.
