FFmpeg muxing, demuxing and decoding: process description

Disclaimer: This is an original article by the blogger and may not be reproduced without permission. https://blog.csdn.net/myvest/article/details/89254452

Background

Because the player in our project is based on FFmpeg, my work often involves using, extending and tuning FFmpeg. However, I have never written up a summary of this module, so here is a simple description of the basic FFmpeg API call flows for reference.
This article describes the demux -> decode flow and the format-conversion flow (i.e. demux -> remux), with the goal of a basic understanding of the relevant FFmpeg API calls.
The FFmpeg source tree ships many examples that are worth studying.

Process Description

Modules used:
1. libavformat: parses and generates the various audio/video container formats; this includes obtaining the information needed for decoding, generating the decoding context structure, reading audio/video packets, and so on. It contains the demuxer and muxer libraries.
2. libavcodec: encoders and decoders for the various audio/video codecs.
3. libavutil: common utility functions.
Whether preparing to demux, mux or decode, av_register_all() must be called first to register the demuxer/muxer and codec interfaces. (In FFmpeg 4.0 and later this registration is deprecated and no longer required.)
If network operations are involved, avformat_network_init() must also be called to register the network protocols.

1. Demuxing

The steps are as follows:
1. Register the relevant modules (av_register_all; avformat_network_init).
2. Open the file and obtain the container context AVFormatContext (avformat_open_input).
3. Obtain the audio/video information of the media file; this step fills in the internal fields of the AVFormatContext (avformat_find_stream_info).
4. Obtain the audio/video stream indices. There are two common methods: 1) traverse all streams inside the AVFormatContext and record the index of each stream whose codec_type is audio/video; 2) use the av_find_best_stream interface FFmpeg provides, which directly returns the stream index of the requested type (audio or video).
5. Read the stream packet by packet (av_read_frame).
6. Close the file.

2. Decoding

Decoding builds on demuxing: each packet that is read gets decoded. The steps are as follows:
1. Allocate the decoder context AVCodecContext (avcodec_alloc_context3).
2. Initialize the AVCodecContext parameters; the codec parameters of the stream obtained while demuxing can be copied in (avcodec_parameters_to_context).
3. Open the decoder (avcodec_open2).
4. Decode packet by packet. There are two methods: 1) the old FFmpeg interface, where audio and video use different decoding calls, e.g. avcodec_decode_audio4 for audio and avcodec_decode_video2 for video; 2) the new FFmpeg interface, where the packet obtained from demuxing is simply sent to the decoder (avcodec_send_packet) and decoded frames are then received from it (avcodec_receive_frame).
5. Close the file and the decoder.

3. Example 1: demuxing + decoding

The example operates only on audio; the flow for video is the same. I found an MP3 file whose left and right channels carry different music, demuxed and decoded it to PCM, and saved the left- and right-channel data to two separate files, so the two different pieces of music can be heard clearly.
The code simply runs through the demux + decode flow above in a rough way; it lacks the necessary error handling and NULL checks, and is intended only as a flow reference.

/**
 * Minimal FFmpeg-based decoder test code
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <android/log.h> 
#include <unistd.h>

#define LOGI(...)  __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)
#define LOGE(...)  __android_log_print(ANDROID_LOG_ERROR, LOG_TAG, __VA_ARGS__)
#define LOG_TAG "FFmpeg-test"


#ifdef __cplusplus
extern "C"
{
#endif
#define __STDC_CONSTANT_MACROS
#ifdef _STDINT_H
#undef _STDINT_H
#endif
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
#ifdef __cplusplus
};
#endif


int main(int argc, char* argv[]){
    //init
    av_register_all();
    avformat_network_init();
    //open
    AVFormatContext *mFormatContext = NULL;
    avformat_open_input(&mFormatContext, argv[1],NULL,NULL);
    //get info
    avformat_find_stream_info(mFormatContext,NULL);
    int audio_idx = av_find_best_stream(mFormatContext,AVMEDIA_TYPE_AUDIO,-1,-1,NULL,0);
    LOGI("[OPEN] file =%s [AUDIO:%d]\n",argv[1],audio_idx);
    FILE *audio_dst_file1 = fopen("/data/1.pcm", "wb");
    FILE *audio_dst_file2 = fopen("/data/2.pcm", "wb");
    //get decode
    AVCodec *pAudioCodec = avcodec_find_decoder(mFormatContext->streams[audio_idx]->codecpar->codec_id);
    AVCodecContext *pACodecCxt = avcodec_alloc_context3(pAudioCodec);
    avcodec_parameters_to_context(pACodecCxt, mFormatContext->streams[audio_idx]->codecpar);
    avcodec_open2(pACodecCxt,pAudioCodec,NULL);

#if 0    //use old mode
    //demux+decode
    AVPacket *mPacket = av_packet_alloc();
    AVFrame *mFrame = av_frame_alloc();
    int got = -1;
    size_t size = 0;
    int i = 0;
    while(av_read_frame(mFormatContext,mPacket) == 0){//demux
         if(mPacket->stream_index == audio_idx){
            LOGI("[AUDIO] size =%d pts=%lld flag=%d\n",mPacket->size,mPacket->pts,mPacket->flags);
            got = 0;
            avcodec_decode_audio4(pACodecCxt,mFrame,&got,mPacket);//decode
            if(got > 0){
                enum AVSampleFormat sf = (enum AVSampleFormat)mFrame->format;
                size = mFrame->nb_samples * av_get_bytes_per_sample(sf);
                LOGI("[AUDIO] got one frame size[%zu]  format[%d] \n",size, mFrame->format);
                if(av_sample_fmt_is_planar(sf)){
                    if(mFrame->extended_data[0] != NULL){
                        fwrite(mFrame->extended_data[0], 1, size, audio_dst_file1);
                    }
                    if(mFrame->extended_data[1] != NULL){
                        fwrite(mFrame->extended_data[1], 1, size, audio_dst_file2);
                    }
                }else{
                    //packed format: samples are interleaved L R L R ..., split per sample
                    int bps = av_get_bytes_per_sample(sf);
                    for(i = 0; i < mFrame->nb_samples * mFrame->channels; i++){
                        if(i%2 == 0) fwrite(mFrame->data[0] + i*bps, 1, bps, audio_dst_file1);
                        else fwrite(mFrame->data[0] + i*bps, 1, bps, audio_dst_file2);
                    }
                }
            }
            av_frame_unref(mFrame);
        }
        av_packet_unref(mPacket);
    }
#else //use new mode
    //demux+decode
    AVPacket *mPacket = av_packet_alloc();
    AVFrame *mFrame = av_frame_alloc();
    int ret = -1;
    size_t size = 0;
    int i = 0;
    while(av_read_frame(mFormatContext,mPacket) == 0){//demux
         if(mPacket->stream_index == audio_idx){
            LOGI("[AUDIO] size =%d pts=%lld flag=%d\n",mPacket->size,mPacket->pts,mPacket->flags);
            ret = avcodec_send_packet(pACodecCxt, mPacket);
            if(ret == 0){
                
                while(avcodec_receive_frame(pACodecCxt,mFrame) == 0){//decode
                    enum AVSampleFormat sf = (enum AVSampleFormat)mFrame->format;
                    size = mFrame->nb_samples * av_get_bytes_per_sample(sf);
                    LOGI("[AUDIO] got one frame size[%zu]  format[%d] \n",size, mFrame->format);
                    if(av_sample_fmt_is_planar(sf)){
                        if(mFrame->extended_data[0] != NULL){
                            fwrite(mFrame->extended_data[0], 1, size, audio_dst_file1);
                        }
                        if(mFrame->extended_data[1] != NULL){
                            fwrite(mFrame->extended_data[1], 1, size, audio_dst_file2);
                        }
                    }else{
                        //packed format: samples are interleaved L R L R ..., split per sample
                        int bps = av_get_bytes_per_sample(sf);
                        for(i = 0; i < mFrame->nb_samples * mFrame->channels; i++){
                            if(i%2 == 0) fwrite(mFrame->data[0] + i*bps, 1, bps, audio_dst_file1);
                            else fwrite(mFrame->data[0] + i*bps, 1, bps, audio_dst_file2);
                        }
                    }
                    
                    av_frame_unref(mFrame);
                }
                
            }
        }
        av_packet_unref(mPacket);
    }

#endif
    //close
    av_packet_free(&mPacket);
    av_frame_free(&mFrame);
    avcodec_free_context(&pACodecCxt);
    avformat_close_input(&mFormatContext);
    fclose(audio_dst_file1);
    fclose(audio_dst_file2);
    
    return 0;
}

4. Muxing

The steps are as follows:
1. Register the relevant modules (av_register_all; avformat_network_init).
2. Obtain the output container context AVFormatContext for the output file name (avformat_alloc_output_context2).
3. Open the output file IO (avio_open).
4. Add the audio and video streams (avformat_new_stream).
5. Write the container file header (avformat_write_header).
6. Write packets to the file; if the packets belong to several streams (video, audio, etc.), write them interleaved according to their timestamps (av_interleaved_write_frame).
7. Write the container trailer (av_write_trailer).
8. Close everything.

5. Example 2: format conversion (remuxing)

Format conversion simply combines the demuxing and muxing flows; of course, when re-muxing, the timestamps and related fields need to be converted.
The remuxing code follows the steps above; again it lacks the necessary error handling and NULL checks, and is intended only as a flow reference.

/**
 * Minimal FFmpeg-based format conversion (remuxing)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifdef __cplusplus
extern "C"
{
#endif
#define __STDC_CONSTANT_MACROS
#ifdef _STDINT_H
#undef _STDINT_H
#endif
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#ifdef __cplusplus
};
#endif

int main(int argc, char **argv){
    if(argc < 3){
        return -1;
    }
    const char* in_file = argv[1];
    const char* out_file = argv[2];
    AVFormatContext* in_ctx=NULL;
    AVFormatContext* out_ctx=NULL;

    av_register_all();
    
    avformat_open_input(&in_ctx,in_file,NULL,NULL);//input
    avformat_find_stream_info(in_ctx,NULL);
    avformat_alloc_output_context2(&out_ctx,NULL,NULL,out_file);//output
    av_dump_format(in_ctx, 0, in_file, 0);
    
    int aidx = -1,vidx = -1;
    int i = 0;
    int out_aidx = -1,out_vidx = -1;

    //get audio stream
    aidx = av_find_best_stream(in_ctx,AVMEDIA_TYPE_AUDIO,-1,-1,NULL,0);
    if(aidx >= 0){
        AVStream *st = avformat_new_stream(out_ctx,NULL);//add stream
        avcodec_parameters_copy(st->codecpar,in_ctx->streams[aidx]->codecpar);
        st->codecpar->codec_tag = 0;
        out_aidx = i;
        i++;
    }
    //get video stream
    vidx = av_find_best_stream(in_ctx,AVMEDIA_TYPE_VIDEO,-1,-1,NULL,0);
    if(vidx >= 0){
        AVStream *st = avformat_new_stream(out_ctx,NULL);//add stream
        avcodec_parameters_copy(st->codecpar,in_ctx->streams[vidx]->codecpar);
        st->codecpar->codec_tag = 0;
        out_vidx = i;
    }
    
    printf("in_aidx[%d] out_aidx[%d]; in_vidx[%d] out_vidx[%d]\n",aidx,out_aidx,vidx,out_vidx);
    av_dump_format(out_ctx, 0, out_file, 1);
    
    avio_open(&out_ctx->pb, out_file, AVIO_FLAG_WRITE);
    
    AVPacket* pkt =av_packet_alloc();
    AVStream *in_stream, *out_stream;
    
    avformat_write_header(out_ctx,NULL);
    while(av_read_frame(in_ctx,pkt) == 0){//DEMUX
        if(pkt->stream_index == aidx){
            in_stream  = in_ctx->streams[aidx];
            out_stream = out_ctx->streams[out_aidx];
        }
        else if(pkt->stream_index == vidx){
            in_stream  = in_ctx->streams[vidx];
            out_stream = out_ctx->streams[out_vidx];
        }else{
            printf("not aidx nor vidx!!!\n");
            av_packet_unref(pkt);
            continue;
        }
        
         //rescale packet timestamps from the input stream timebase to the output stream timebase
        pkt->pts = av_rescale_q_rnd(pkt->pts, in_stream->time_base, out_stream->time_base, (AVRounding)(AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
        pkt->dts = av_rescale_q_rnd(pkt->dts, in_stream->time_base, out_stream->time_base, (AVRounding)(AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX));
        pkt->duration = av_rescale_q(pkt->duration, in_stream->time_base, out_stream->time_base);
        pkt->pos = -1;

        if(av_interleaved_write_frame(out_ctx, pkt) < 0){
            printf("av_interleaved_write_frame fail\n");
            //break;
        }
        av_packet_unref(pkt);
    }
    av_write_trailer(out_ctx);

    //close
    av_packet_free(&pkt);
    avformat_close_input(&in_ctx);
    if(out_ctx && !(out_ctx->oformat->flags & AVFMT_NOFILE))
        avio_closep(&out_ctx->pb);//close the output IO opened by avio_open
    avformat_free_context(out_ctx);

    return 0;
}
