[FFmpeg in Practice] GOP Structure

When using the HM encoder (the HEVC reference software), we usually start from one of its predefined configuration files. Each configuration file defines the GOP structure and its related parameters, and this GOP structure is repeated throughout the video sequence. The definition contains GOPSize lines, one frame per line in decoding order, so Frame1 is the first frame decoded, Frame2 is the second frame decoded, and so on. Each line also defines the reference images kept for that frame, including both the images that the current frame references and the images that subsequent frames will reference. The encoder does not work out on its own which frames will be needed as references in the future, so they must be specified explicitly. Note that for the images that follow the IDR frame in the first GOP, some of the specified reference images may not be available; the encoder handles this automatically.
The following is the relevant definition of the GOP structure in the LDP configuration file:

[Figure: GOP structure definition in the LDP configuration file]

The meaning of each column is explained below:

Type : Indicates slice type, which can be I, P, or B.

POC : Picture order count, i.e. the display (playback) order of the frame within the GOP. The value ranges from 1 to GOPSize.

QPOffset : The final QP of the current frame is equal to QPOffset+QP.

QPOffsetModelOff : The offset of the linear model that adjusts the final QP based on QPOffset+QP.

QPOffsetModelScale : The scale of the linear model that adjusts the final QP based on QPOffset+QP.

CbQPOffset : QP offset of Cb.

CrQPOffset : QP offset of Cr.

QPFactor : The weight used in rate-distortion optimization (RDO). A larger value means a lower bitrate and lower quality. Typical values are between 0.3 and 1.

tcOffsetDiv2 : In deblocking filtering, the final tc_offset_div2 = LoopFilterTcOffset_div2 + tcOffsetDiv2. The final value of tc_offset_div2 is an integer between -6 and 6.

betaOffsetDiv2 : In deblocking filtering, the final beta_offset_div2 = LoopFilterBetaOffset_div2 + betaOffsetDiv2. The final value of beta_offset_div2 is an integer between -6 and 6.

temporal_id : Indicates the temporal layer of the frame. A frame in a lower temporal layer cannot reference a frame in a higher temporal layer. If a reference image listed for a frame belongs to a higher temporal layer than the frame itself, it is not used as a reference by that frame, but it is retained and may be used by future frames.

num_ref_pics_active : The size of the reference lists L0 and L1, indicating how many reference images are used in each direction during encoding.

num_ref_pics : The number of reference images kept for the current frame. This includes the reference images used by the current frame and those that will be used by future frames.

reference_pictures : Contains num_ref_pics integers, separated by spaces. Each value is the POC of a reference image relative to the POC of the current image. The list is ordered, starting with the negative numbers from largest to smallest, followed by the positive numbers from smallest to largest, for example (-1 -3 -5 1 3). Note that images not included in this list are discarded, i.e. future frames cannot reference images that are not in this list.

predict : Defines the value of the syntax element inter_ref_pic_set_prediction_flag. 0 indicates that the RPS is encoded without inter-frame RPS prediction; the deltaRIdx-1, deltaRPS, num_ref_idcs and reference_idcs parameters are ignored and do not need to be given. 1 indicates that the RPS is encoded with inter-frame RPS prediction, using the deltaRIdx-1, deltaRPS, num_ref_idcs and reference_idcs parameters. 2 indicates that the RPS is encoded with inter-frame RPS prediction and only the deltaRIdx-1 parameter is needed; the encoder derives deltaRPS, num_ref_idcs and reference_idcs automatically from the POC and reference image values of the current line and of the RPS that deltaRIdx-1 points to.

deltaRIdx-1 : The difference between the index of the current RPS and the index of the predictor RPS, minus 1.

deltaRPS : The difference between the POC of the predictor RPS and the POC of the current RPS.

num_ref_idcs : The number of ref_idcs to encode for the current RPS. Its value is equal to the num_ref_pics of the predictor RPS plus 1.

reference_idcs : A space-separated list of num_ref_idcs integers, giving the ref_idcs used for inter-frame RPS prediction. Each value can be 0, 1 or 2: 1 indicates that the image is a reference image of the current image, 2 indicates that it is kept as a reference image for future images only, and 0 indicates that it is no longer used as a reference. The first num_ref_pics values (num_ref_pics of the predictor RPS) correspond to the reference images in the predictor RPS, and the last value corresponds to the predictor image itself.
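The ordering rule for reference_pictures can be checked with a short sketch. The helper names `is_valid_reference_list` and `absolute_pocs` are made up for this illustration and are not part of HM; the sketch only encodes the rule as stated above (negative values from largest to smallest, then positive values from smallest to largest, no zero and no duplicates).

```python
def is_valid_reference_list(deltas):
    """Check the ordering rule for a reference_pictures list.

    deltas: POC values relative to the current image, e.g. [-1, -3, -5, 1, 3].
    """
    if 0 in deltas or len(set(deltas)) != len(deltas):
        return False  # an image cannot reference itself or be listed twice
    negatives = [d for d in deltas if d < 0]
    positives = [d for d in deltas if d > 0]
    return (
        deltas == negatives + positives                   # negatives come first
        and negatives == sorted(negatives, reverse=True)  # -1, -3, -5
        and positives == sorted(positives)                # 1, 3
    )


def absolute_pocs(current_poc, deltas):
    """Convert relative POC values back to absolute POC values."""
    return [current_poc + d for d in deltas]


print(is_valid_reference_list([-1, -3, -5, 1, 3]))  # True
print(is_valid_reference_list([-3, -1, 1]))         # False
print(absolute_pocs(8, [-1, -3, -5, 1, 3]))         # [7, 5, 3, 9, 11]
```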
[Figure: example GOP structure with 4 frames]
Taking the figure above as an example, each GOP contains 4 frames, listed in decoding order. Frame1 corresponds to the image with POC=4, which references image 0, so its reference image is -4. Frame2 corresponds to the image with POC=2, which references images 0 and 4, so its reference images are -2, 2. Frame3 corresponds to the image with POC=1 and is a special case: although it only references images 0 and 2, it must also keep image 4 so that future images can reference it, so its reference images are -1, 1, 3. Frame4 corresponds to the image with POC=3, which references images 2 and 4, so its reference images are -1, 1.
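The relative values in this example can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `relative_refs` and the `gop` table layout are assumptions chosen for this illustration, and the absolute POC values are the ones given in the paragraph above.

```python
def relative_refs(poc, kept_pocs):
    """Return the reference_pictures list for one frame.

    poc: POC of the current image.
    kept_pocs: absolute POCs of all images kept for it (used now or later).
    """
    deltas = [k - poc for k in kept_pocs]
    negatives = sorted((d for d in deltas if d < 0), reverse=True)
    positives = sorted(d for d in deltas if d > 0)
    return negatives + positives


# (name, POC, absolute POCs kept), in decoding order
gop = [
    ("Frame1", 4, [0]),
    ("Frame2", 2, [0, 4]),
    ("Frame3", 1, [0, 2, 4]),  # image 4 is kept only for later frames
    ("Frame4", 3, [2, 4]),
]

for name, poc, kept in gop:
    print(name, relative_refs(poc, kept))
# Frame1 [-4]
# Frame2 [-2, 2]
# Frame3 [-1, 1, 3]
# Frame4 [-1, 1]
```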

Frame2, Frame3, and Frame4 can use inter-frame RPS prediction, so the predict parameter of these frames is set to 1. Frame2 uses the RPS of Frame1 as its predictor, so deltaRIdx-1=0. In the same way, Frame3 and Frame4 use Frame2 and Frame3 as predictors respectively, so deltaRIdx-1=0 for both. deltaRPS is equal to the POC of the predictor frame minus the POC of the current frame, so the deltaRPS of Frame2 is 4-2=2, the deltaRPS of Frame3 is 2-1=1, and the deltaRPS of Frame4 is 1-3=-2.
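The same arithmetic, written as a small sketch. The function `inter_rps_params` and its 0-based indices are assumptions for this illustration; it simply applies the two definitions above (index difference minus 1, and predictor POC minus current POC).

```python
def inter_rps_params(pocs, k, m):
    """Compute deltaRIdx-1 and deltaRPS for frame k predicted from frame m.

    pocs: POCs of the GOP frames in decoding order.
    k, m: 0-based positions in that list, with m decoded before k.
    """
    if not 0 <= m < k < len(pocs):
        raise ValueError("the predictor must be decoded before the current frame")
    return {
        "deltaRIdx_minus1": (k - m) - 1,
        "deltaRPS": pocs[m] - pocs[k],
    }


pocs = [4, 2, 1, 3]  # Frame1..Frame4 in decoding order
for k in range(1, len(pocs)):
    print(f"Frame{k + 1}", inter_rps_params(pocs, k, k - 1))
# Frame2 {'deltaRIdx_minus1': 0, 'deltaRPS': 2}
# Frame3 {'deltaRIdx_minus1': 0, 'deltaRPS': 1}
# Frame4 {'deltaRIdx_minus1': 0, 'deltaRPS': -2}
```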

The reference images of Frame2 have POC 0 and 4, so the corresponding reference_idcs is 1, 1. The first 1 indicates that the -4 in the reference RPS (the RPS of Frame1) is still used as a reference, and the second 1 indicates that Frame1 itself is also used as a reference.

The reference_idcs of Frame3 is 1, 1, 1. The first and second 1 indicate that the -2 and 2 in the reference RPS (the RPS of Frame2) are still used as references, and the last 1 indicates that Frame2 itself is also used as a reference.

The reference_idcs of Frame4 is 0, 1, 1, 0. The first 0 indicates that the -1 in the reference RPS (the RPS of Frame3) is not used as a reference in Frame4, the next two 1s indicate that the 1 and 3 in the reference RPS are still used as references, and the last 0 indicates that Frame3 itself is not used as a reference.
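Read in the other direction, the reference_idcs list together with deltaRPS is enough to rebuild the current frame's reference_pictures from the reference RPS. The sketch below illustrates this; `apply_reference_idcs` is a hypothetical helper, not an HM function, and it treats any non-zero idc as "kept".

```python
def apply_reference_idcs(pred_refs, delta_rps, idcs):
    """Rebuild the current reference_pictures list from the reference RPS.

    pred_refs: reference_pictures of the reference (predictor) frame.
    delta_rps: POC of the predictor frame minus POC of the current frame.
    idcs: one value per entry of pred_refs, plus a final value for the
          predictor frame itself.
    """
    if len(idcs) != len(pred_refs) + 1:
        raise ValueError("num_ref_idcs must equal num_ref_pics of the predictor + 1")
    candidates = list(pred_refs) + [0]  # the last candidate is the predictor itself
    kept = [c + delta_rps for c, idc in zip(candidates, idcs) if idc != 0]
    negatives = sorted((d for d in kept if d < 0), reverse=True)
    positives = sorted(d for d in kept if d > 0)
    return negatives + positives


print(apply_reference_idcs([-4], 2, [1, 1]))               # Frame2: [-2, 2]
print(apply_reference_idcs([-2, 2], 1, [1, 1, 1]))         # Frame3: [-1, 1, 3]
print(apply_reference_idcs([-1, 1, 3], -2, [0, 1, 1, 0]))  # Frame4: [-1, 1]
```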

The parameters corresponding to this GOP are as follows:
[Figure: parameter table for the example GOP]
Frames that are used as references are given a smaller QPOffset to improve their quality. At the same time, non-reference frames are assigned a higher temporal layer.

The deltaRIdx-1, deltaRPS, num_ref_idcs and reference_idcs parameters of FrameK can be generated from the POC, num_ref_pics and reference_pictures values of FrameK and FrameM. K is the index of the RPS to be encoded, and M is the index of the reference RPS. The generation process is as follows:
[Figure: generation process for the inter-frame RPS prediction parameters]
The generation process is built into the encoder and is enabled by setting predict=2.
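As a rough illustration of what such a derivation involves, the following sketch computes deltaRPS, num_ref_idcs and reference_idcs for FrameK from the two frames' POC and reference_pictures values. It is a simplified reconstruction based on the definitions in this article, not the encoder's actual code: the name `derive_inter_rps` is hypothetical, and it only emits 0 and 1 (it does not distinguish images kept purely for future frames).

```python
def derive_inter_rps(poc_k, refs_k, poc_m, refs_m):
    """Derive (deltaRPS, num_ref_idcs, reference_idcs) for FrameK.

    poc_k, refs_k: POC and reference_pictures of the frame being encoded.
    poc_m, refs_m: POC and reference_pictures of the reference (predictor) frame.
    """
    delta_rps = poc_m - poc_k
    # Candidates: every entry of the reference RPS, then the predictor itself
    # (relative POC 0 as seen from FrameM).
    candidates = list(refs_m) + [0]
    # A candidate survives if, shifted by deltaRPS, it appears in FrameK's list.
    idcs = [1 if c + delta_rps in refs_k else 0 for c in candidates]
    return delta_rps, len(idcs), idcs


# The example GOP: (POC, reference_pictures) in decoding order
frames = [(4, [-4]), (2, [-2, 2]), (1, [-1, 1, 3]), (3, [-1, 1])]
for k in range(1, len(frames)):
    (poc_k, refs_k), (poc_m, refs_m) = frames[k], frames[k - 1]
    print(f"Frame{k + 1}", derive_inter_rps(poc_k, refs_k, poc_m, refs_m))
# Frame2 (2, 2, [1, 1])
# Frame3 (1, 3, [1, 1, 1])
# Frame4 (-2, 4, [0, 1, 1, 0])
```

The results match the hand-derived values in the example above.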

Origin blog.csdn.net/weixin_52622200/article/details/131484342