In multimedia, video is basically crunching numbers and get pixels or crunching pixels and getting numbers. Most of the operation are quite time consuming on a general purpose CPU and orders of magnitude faster if done using DSP or hardware designed for that purpose.
Most of the commonly used system have video decoding and encoding capabilities either embedded in the GPU or in separated hardware. Leveraging it spares lots of cpu cycles and lots of battery if we are thinking about mobile.
The usually specialized hardware has the issue of being inflexible and that does clash with the fact most codec evolve quite quickly with additional profiles to extend its capabilities, support different color spaces, use additional encoding strategies and such. Software decoders and encoders are still needed and need badly.
Hardware acceleration support in Libav
The hardware acceleration support in Libav grew (like other eldritch-horror tentacular code we have lurking from our dark past) without much direction addressing short term problems and not really documenting how to use it.
As result all the people that dared to use it had to guess, usually used internal symbols that they wouldn’t have to use and all in all had to spend lots of time and
had enough grief when such internals changed.
Every backend required a quite large deal of boilerplate code to initialize the backend-specific context and to render the hardware surface wrapped in the AVFrame.
The Libav backend interface was quite vague in itself, requiring to override
get_buffer in some ways.
Overall to get the whole thing working the library user was supposed to do about 75% of the work. Not really nice considering people uses libraries to abstract complexity and avoid repetition…
As that support was written with just slice-based decoder in mind, it expects that all the backend would require the software decoder to parse the bitstream, prepare slices of the frame and feed the backend with them.
Sadly new backends appeared and they take directly either bitstream or full frames, the approach had been just to take the slice, add back the bitstream markers the backend library expects and be done with that.
Initial HWAccel 2 discussion
Last year since the number of backends I wanted to support were all bitstream-oriented and not fitting the mode at all I started thinking about it and the topic got discussed a bit during VDD 13. Some people that spent their dear time getting hwaccel1 working with their software were quite wary of radical changes so a path of incremental improvements got more or less put down.
- default functions to allocate and free the backend context and make the struct to interface between Libav and the backend extensible without causing breakage.
- avconv now can use some hwaccel, providing at least an example on how to use them and a mean to test without having to gut VLC or mpv to experiment.
- document better the old-style hwaccels so at least some mistakes could be avoided (and some code that happen to work by sheer look won’t break once the faulty assuptions cease to exist)
The new VDA backend and the update VDPAU backend are examples of it.
- extend the callback system to fit decently bitstream oriented backends.
- provide an example of backend directly providing normal AVFrames.
The Intel QSV backend is used as a testbed for hwaccel 1.3.
The future of HWAccel2
Another year, another meeting. We sat down again to figure out how to get further closer to the end result of not having the casual users write boilerplate code to use hwaccel to get at least some performance boost and yet let the power users have the full access to the underpinnings so they can get most of it without having to write everything from scratch.
Simplified usage, hopefully really simple
The user just needs to use AVOption to set specific keys such as hwaccel and optionally hwaccel-device and the library will take care of everything. The frames returned by
avcodec_decode_video2 will contain normal system memory and commonly used pixel formats. No further special code will be needed.
Advanced usage, now properly abstracted
All the default initialization, memory/surface allocation and such will remain overridable, with the difference that an additional callback called
get_hw_surface will be introduced to separate completely the hwaccel path from the software path and specific functions to hand over the ownership of backend contexts and surfaces will be provided.
The software fallback won’t be anymore automagic in this case, but a specific
AVERROR_INPUT_CHANGED will be returned so would be cleaner for the user reset the decoder without losing the display that maybe was sharing the same context. This leads the way to a simpler mean to support multiple hwaccel backends and fall back from one to the other to eventually the software decoding.
We try our best to help people move to the new APIs.
Moving from HWAccel1 to HWAccel2 in general would result in less lines of code in the application, the people wanting to keep their callback need to just set them after
avcodec_open2 and move the pixel specific
get_hw_surface. The presence of
av_hwaccel_hand_over_context will make much simpler managing the backend specific resources.
Expected Time of Arrival
Right now the review is on the HWaccel1.3, I hope to complete this step and add few new backends to test how good/bad that API is before adding the other steps. Probably HWAccel2 will take at least other 6 months.
Help in form of code or just moral support is always welcome!