Translation of H.264/MPEG-4 Part 10 White Paper (1) Overview

2025-02-24 Update From: SLTechnology News&Howtos


H.264 Overview

1. Introduction

The emergence of digital television and DVD-Video has revolutionized broadcasting and home entertainment. More and more of these applications have become possible through the standardization of video compression technology. The next standard in the MPEG series, MPEG-4, is enabling a new generation of Internet-based video applications, while the ITU-T H.263 video compression standard is now widely used in video conferencing systems.

MPEG-4 and H.263 are standards based on video compression (video coding) technology from around 1995. The Moving Picture Experts Group and the Video Coding Experts Group (MPEG and VCEG) have been committed to developing a new standard that outperforms MPEG-4 and H.263, offering a better compression method that delivers high image quality at low bit rates. The history of the new standard, Advanced Video Coding (AVC), dates back about seven years.

In 1995, after the finalization of the H.263 standard for transmitting video signals over telephone lines, the ITU-T Video Coding Experts Group (VCEG) began work in two further development areas: a "short-term" effort to add features to H.263 (producing version 2 of that standard), and a "long-term" effort to develop a new standard for low-rate visual communications. The long-term effort produced the draft standard known as "H.26L". In 2001, the ISO Moving Picture Experts Group (MPEG) recognized the potential advantages of H.26L, and the Joint Video Team (JVT) was formed, including experts from both MPEG and VCEG. The main task of the JVT was to develop the draft H.26L "model" into a complete international standard. In fact, the result is two standards: ISO MPEG-4 Part 10 and ITU-T H.264. The official name of the new standard is Advanced Video Coding (AVC); however, it is better known by its old working name H.26L and by its ITU document number, H.264 [1].

2. H.264 codec

Like previous standards (such as MPEG-1, MPEG-2 and MPEG-4), the draft H.264 standard does not explicitly define a codec. Instead, the standard defines the syntax of an encoded video bitstream together with the method of decoding that bitstream. In practice, however, a compliant encoder and decoder generally include the functional modules shown in Figure 2-1 and Figure 2-2. While the functions shown in these figures are usually necessary, there is still considerable scope for variation between codecs. The basic functional modules (prediction, transform, quantization, entropy coding) are similar to those of previous standards (MPEG-1, MPEG-2, MPEG-4, H.261, H.263); the most important changes in H.264 are in the implementation details of each functional module.

The encoder includes two data-flow paths: a forward path (left to right, blue) and a reconstruction path (right to left, magenta). The data-flow path of the decoder is drawn from right to left to illustrate the similarities between encoder and decoder.

2.1 Encoder (forward path)

An input frame Fn is presented for encoding. The frame is processed in units of a macroblock (corresponding to 16×16 pixels in the original image). Each macroblock is encoded in intra mode or inter mode. In either case, a prediction macroblock P is formed based on a reconstructed frame. In intra mode, P is formed from samples in the current frame n that have previously been encoded, decoded and reconstructed (uF'n in the figure; note that unfiltered samples are used to form P). In inter mode, P is formed by motion-compensated prediction from one or more reference frames. In the figure, the reference frame is shown as the previously encoded frame F'n−1; however, the prediction for each macroblock may be formed from one or more past or future frames (in time order) that have already been encoded and reconstructed.
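The macroblock partitioning described above can be sketched as follows. This is a minimal illustration in pure Python, not code from the standard: a real codec operates on separate luma and chroma planes, and the function name and frame layout here are assumptions for the example.

```python
# Sketch: splitting a frame into 16x16 macroblocks.
# Assumes the frame dimensions are multiples of 16 (real codecs pad otherwise).
MB_SIZE = 16

def split_into_macroblocks(frame, width, height):
    """Return (x, y, block) tuples, where block is a 16x16 list of rows."""
    blocks = []
    for y in range(0, height, MB_SIZE):
        for x in range(0, width, MB_SIZE):
            block = [frame[y + r][x:x + MB_SIZE] for r in range(MB_SIZE)]
            blocks.append((x, y, block))
    return blocks

# A 32x32 dummy "frame": each sample equals its row index.
frame = [[row for _ in range(32)] for row in range(32)]
mbs = split_into_macroblocks(frame, 32, 32)
print(len(mbs))  # 4 macroblocks for a 32x32 frame
```

Each macroblock is then encoded independently in intra or inter mode, which is what makes block-based prediction and parallel processing practical.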

The prediction P is subtracted from the current macroblock to produce a residual or difference macroblock Dn. Dn is transformed (using a block transform) and quantized to give X, a set of quantized transform coefficients. These coefficients are reordered and entropy encoded. The entropy-encoded coefficients, together with the side information required to decode the macroblock (such as the macroblock prediction mode, quantizer step size, and motion vectors describing how the macroblock was motion-compensated), form the compressed bitstream, which is passed to the Network Abstraction Layer (NAL) for transmission or storage.
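The forward path for one macroblock can be sketched as below. This is a deliberately simplified model: the block transform is omitted and the residual is quantized directly with a plain scalar quantizer, whereas H.264 uses an integer transform plus CAVLC/CABAC entropy coding. All names (`forward_path`, `qstep`) are illustrative, not from the standard.

```python
# Sketch of the forward path: residual formation Dn = current - P,
# followed by scalar quantization to produce the coefficient set X.
# The block transform and entropy-coding stages are intentionally omitted.
def forward_path(current_mb, prediction_mb, qstep):
    # Residual (difference) macroblock Dn
    residual = [c - p for c, p in zip(current_mb, prediction_mb)]
    # Quantized "coefficients" X (quantizing the residual directly for brevity)
    return [round(d / qstep) for d in residual]

current = [120, 121, 119, 118]       # a few samples of the current macroblock
prediction = [118, 118, 118, 118]    # the prediction P
X = forward_path(current, prediction, qstep=2)
print(X)  # [1, 2, 0, 0] -- small residuals quantize toward zero
```

The point of predicting first is visible even in this toy: the residual values are small, so after quantization many become zero and entropy-code cheaply.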

2.2 Encoder (rebuild path)

In order to encode further macroblocks, a frame needs to be reconstructed, so the quantized macroblock coefficients X are decoded. The coefficients X are rescaled (inverse quantized) and inverse transformed to produce a difference macroblock Dn'. This is not identical to the original difference macroblock Dn: the quantization process introduces losses, so Dn' is a distorted version of Dn.

The prediction macroblock P is added to Dn' to create a reconstructed macroblock uF'n (a distorted version of the original macroblock). To reduce the effects of blocking distortion, a filter is applied, and the reconstructed reference frame F'n is created from a series of macroblocks.
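The reconstruction path can be sketched by inverting the toy forward path above: rescale the coefficients and add the prediction back. The deblocking filter is omitted, and the names are again illustrative assumptions. Note that the reconstructed samples are close to, but not equal to, the original ones, which is exactly the quantization loss the text describes.

```python
# Sketch of the reconstruction path: rescale (inverse quantize) X to get
# Dn' (a distorted version of Dn), then form uF'n = P + Dn'.
# The deblocking filter stage is omitted.
def reconstruct(coeffs, prediction_mb, qstep):
    residual_rec = [x * qstep for x in coeffs]            # Dn'
    return [p + d for p, d in zip(prediction_mb, residual_rec)]  # uF'n

prediction = [118, 118, 118, 118]
coeffs = [1, 2, 0, 0]            # X, quantized with qstep = 2
rec = reconstruct(coeffs, prediction, qstep=2)
print(rec)  # [120, 122, 118, 118] -- near the input [120, 121, 119, 118], not equal
```

The encoder keeps this reconstructed (distorted) frame, rather than the original, as the reference for predicting later macroblocks.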

2.3 Decoder

The decoder receives a compressed bitstream from the NAL (Network Abstraction Layer). The data elements are entropy decoded and reordered to produce a set of quantized coefficients X. These are rescaled and inverse transformed to give Dn' (identical to the Dn' shown in the encoder). Using the header information decoded from the bitstream, the decoder creates a prediction macroblock P, identical to the original prediction P formed in the encoder. P is added to Dn' to produce uF'n, which is filtered to create the decoded macroblock F'n.

It should be clear from the figures and from the discussion above that the purpose of the reconstruction path in the encoder is to ensure that both encoder and decoder use identical reference frames to create the prediction P. If this were not the case, the predictions P in the encoder and decoder would not be identical, leading to an increasing error or "drift" between the encoder and the decoder.
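This mismatch effect can be demonstrated with a toy one-sample-per-frame model, assuming the decoder's reference starts off by a single unit. The numbers and function names are invented for illustration; the point is that once the references disagree, the error propagates through every subsequent inter-predicted frame instead of dying out, because the transmitted residuals only correct differences relative to the *encoder's* reconstruction.

```python
# Toy model of encoder/decoder reference mismatch: one sample per frame,
# inter prediction from the previous reconstructed value.
def encode_frame(value, reference, qstep):
    return round((value - reference) / qstep)    # quantized residual

def decode_frame(coeff, reference, qstep):
    return reference + coeff * qstep             # reconstructed value

qstep = 4
values = [100, 107, 113, 121]    # "true" sample per frame

enc_ref = 0          # encoder's reconstructed reference
bad_ref = 1          # decoder reference, off by 1 (the mismatch)
errors = []
for v in values:
    c = encode_frame(v, enc_ref, qstep)
    enc_ref = decode_frame(c, enc_ref, qstep)    # encoder reconstructs too
    bad_ref = decode_frame(c, bad_ref, qstep)
    errors.append(bad_ref - enc_ref)
print(errors)  # [1, 1, 1, 1] -- the mismatch never decays
```

In a real codec, mode decisions, clipping and filtering react to the mismatched samples as well, so the drift typically grows rather than staying constant; the reconstruction path in the encoder exists precisely to prevent any mismatch from arising.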

3. Reference materials

[1] ITU-T Rec. H.264 / ISO/IEC 14496-10, "Advanced Video Coding", Final Committee Draft, Document JVT-E022, September 2002

Original article: http://hi.baidu.com/huybin_wang/blog/item/0b9a97fa636d3dd8b48f31b4.html
