This mini-post was spurred by this bug.
AVFrame and AVCodecContext
In Libav there are a number of patterns shared across most of the components.
It does not matter whether it models a codec, a demuxer or a resampler: you interact with it using a
Context and you get data in or out of the module using some kind of
Abstraction that wraps the data together with useful information such as the timestamp. Today’s post is about AVFrame and AVCodecContext.
The most used abstraction in Libav by far is the
AVFrame. It wraps some kind of
raw data that can be produced by decoders and fed to encoders, passed through filters, scalers and resamplers.
It is quite flexible and contains the data and all the information needed to understand it, e.g.:
format: Describes either the pixel format for video or the sample format for audio.
width, height: The dimensions of a video frame.
channel_layout, sample_rate: For audio frames.
The AVCodecContext contains all the information useful to describe a codec and to configure an encoder or a decoder (the generic, common features; there are private options for codec-specific ones).
Being shared among encoders, decoders and (until Anton’s plan to avoid it is deployed) container streams, this context is fairly large. A good deal of its fields are a little confusing, either because they seem to replicate what is present in the AVFrame or because they aren’t marked as write-only, since they might be read in a few situations.
In the bug mentioned,
channel_layout was the confusing one, but
width and height have also caused problems for people who thought the values of those fields in the
AVCodecContext would represent what is in the
AVFrame (then you’d wonder why you should have them in two different places…).
As a rule of thumb, everything that is set in a context is either the starting configuration or bound to change in the future.
Video decoders can reconfigure themselves and output video frames with completely different geometries; audio decoders can report a completely different number of channels or variations in their layout, and so on.
Some encoders are able to reconfigure on the fly as well, but usually with stricter constraints.
Why their information is not the same
The fields in the AVCodecContext are used internally and updated as needed by the decoder. The decoder can be multithreaded, so the AVFrame you are getting from one of the
avcodec_decode_something() functions is not necessarily the last frame decoded, and the context may already describe a later one.
Do not expect any of the fields with names similar to the ones provided by AVFrame to stay immutable or to match the values provided by the AVFrame.
Allocating video surfaces
A quite common mistake is to use the AVCodecContext
coded_width and coded_height to allocate the surfaces used to present the decoded frames.
As said, the frame geometry can change mid-stream, so if you do that, in the best case you get some lovely green surrounding your picture and in the worst case you get a bad crash.
I suggest always checking that the
AVFrame dimensions fit and being ready to reconfigure your video output when they do not.
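The check suggested above can be sketched as follows. This is a minimal, self-contained illustration and not Libav code: FrameInfo is a hypothetical stand-in that mirrors only the width, height and format fields of an AVFrame, while VideoOut and video_out_update() are names invented for this example. In a real player you would read the same fields from the decoded AVFrame and re-create your surfaces where the comment says so.

```c
#include <stdbool.h>

/* Hypothetical stand-in mirroring the AVFrame fields used in this sketch. */
typedef struct FrameInfo {
    int width;
    int height;
    int format; /* pixel format of the decoded picture */
} FrameInfo;

/* Hypothetical output state: the geometry the surfaces were allocated for. */
typedef struct VideoOut {
    int  width;
    int  height;
    int  format;
    bool configured; /* false until the first frame has been seen */
} VideoOut;

/*
 * Compare the frame that was actually decoded with the current output
 * configuration. Returns true when the output had to be (re)configured,
 * false when the frame fits the existing surfaces.
 */
bool video_out_update(VideoOut *out, const FrameInfo *frame)
{
    if (out->configured &&
        out->width  == frame->width  &&
        out->height == frame->height &&
        out->format == frame->format)
        return false; /* the frame fits, nothing to do */

    /* A real player would release and reallocate its surfaces here. */
    out->width      = frame->width;
    out->height     = frame->height;
    out->format     = frame->format;
    out->configured = true;
    return true;
}
```

Calling video_out_update() on every decoded frame, rather than once at start-up with values taken from the context, is the point of the sketch: the geometry is always taken from the frame about to be displayed.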
The same applies to audio resampling. If you are using a current version of Libav you have
avresample_convert_frame() doing most of the work for you; if you are not, you need to check that
format, channel_layout and sample_rate do not change and manually reconfigure when they do.
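For the manual path, the bookkeeping might look like the sketch below. It is self-contained on purpose: AudioParams is a hypothetical stand-in for the format, channel_layout and sample_rate fields of an AVFrame, and audio_params_changed() is a helper invented for this example. How the resampler gets reconfigured once a change is reported is left out, since it depends on the Libav version and on your application.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the audio fields of an AVFrame. */
typedef struct AudioParams {
    int      format;         /* sample format */
    uint64_t channel_layout; /* bitmask describing the channels */
    int      sample_rate;    /* in Hz */
} AudioParams;

/*
 * Compare the parameters of the frame just decoded with the ones the
 * resampler was configured for. On a mismatch, remember the new
 * parameters and return true so the caller can reconfigure.
 */
bool audio_params_changed(AudioParams *current, const AudioParams *frame)
{
    if (current->format         == frame->format         &&
        current->channel_layout == frame->channel_layout &&
        current->sample_rate    == frame->sample_rate)
        return false;

    *current = *frame; /* track what the decoder is producing now */
    return true;
}
```

The comparison is made against the frame, never against the codec context, for the reasons given in the previous sections.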
Similarly, you can misconfigure swscale, so you should manually check that
the width and height have not changed and reconfigure as well. The AVScale draft API on provides an
Be extra careful, think twice and beware of the examples you might find on the internet: they might work until they won’t.