Demotivation, FUD and why I still contribute to Libav

Libav has been a controversial project since its start, mainly because of the drama and the huge amount of manure thrown at it; even about four years later, people are still spewing this kind of vitriolic comment.

What is Libav

How it started

Libav started when the FFmpeg trademark was given by its owner, Fabrice Bellard, to the former FFmpeg leader, Michael Niedermayer.

Michael Niedermayer managed to get himself demoted from his leader position by the 18 people most involved in FFmpeg at the time, due to his tendency not to follow even the most basic project rules. That happened just weeks after he had been voted to stay on as leader by 15 of them, 5 explicitly stating that their vote was conditional on his behavior, and 1 voting squarely against him.

His demotion was due to acts in full disregard of the policies in place, including those enforced automatically by the svn hooks.

The fact that he bullied and belittled volunteers and contributors can be verified by digging through the mailing list archives from the months and years before the management change; it also factored into the decision.

What it aims to do

Having been burnt by the experience of unreliable leadership, the Libav organization focused on rules, both for development and for management.

Ideally, having a clear set of rules and making every member and contributor abide by those same rules would prevent abuse.

Rules in summary

  • All the code must use the same coding style
  • All the patches require a review.
    • Nothing hits the tree before a second pair of eyes reads the code.
    • No part of the code is a private garden that only selected people modify on a whim.
  • Everybody must abide by a quite simple code of conduct.
    • rude behavior is not welcome.
    • flames are not welcome.
    • constructive criticism is needed.

Since FFmpeg was famous for its horrid code quality and its sketchy, irregular API, and since a good chunk of the needed changes had been vetoed by the now-demoted leader, Libav focused mainly on cleaning up the code and making the API easy to use. This led to deep overhauls such as reference-counted frames, which make multi-threading much simpler, and reference-counted packets, which make extended workflows easy, along with a good chunk of bugs and ancient issues fixed.

Demotivation and FUD

The people involved in Libav focused mainly on the code, ignoring most if not all of the wild statements that fans of Michael Niedermayer spread around the internet. The idea was that the code should speak for itself.

It did not work as expected. At all.

Apparently news outlets prefer to repeat whatever they find in a blog post instead of checking the facts by reading the mailing list and the git history. Many misconceptions, to use a euphemism, have been spun around, and apparently they won’t die quickly if left unchallenged.

The thief theme

“The people behind Libav stole the FFmpeg infrastructure”: some of the admins even got questioned at their workplace about whether they had really stolen something. A pity that most of the people involved in keeping the infrastructure functional were doing so on their own hardware and co-location; but obviously checking facts is hard, better to go help shovel manure on people.

Needless to say, it gets hard to keep spending time and resources while receiving this kind of treatment. At VDD 2014 there was some agreement to at least clear the air with a strong statement on the matter. So far that has not happened.

The daily merge

“Libav is a non-functional fork of FFmpeg” is repeated quite often all around. Yet Michael has been merging everything Libav does, daily, more or less since the beginning, making FFmpeg effectively a strict derivative of Libav.

Initially he even used a quite misleading moniker in his merges, naming the Libav tree qatar. Now things are a little fairer, and at least FFmpeg more or less states on its download page that it is a derivative of Libav with additional features.

A few people ended up deciding to leave the Libav project, since they did not appreciate that the hard work done in their spare time would be used by somebody else to disparage them and, on top of that, to ask for donations with wild claims such as “80% of all that is done in FFmpeg is done by Michael”.

You are helping the evil

New people contributing to Libav quite often get contacted by some fans eager to enlighten them about how evil Libav and the people working on it are, how much better FFmpeg is, and how much better it would be to contribute to the real project.

This is a form of harassment. So far the people involved have kept rather quiet, but it would probably help to be more outspoken, since most people around do not like to check the facts. Private emails should probably stay private, but blog post comments can easily be found and linked. Such “jokes” have been spun since the very beginning; I let the reader imagine how much patience somebody must muster to stay quiet and just write code (this post is me getting fed up enough after 3+ years).

Motivation and Demotivation

I like to write code that is functional and doesn’t poke your eyes when you read it a while later, and, among other things, I like to write multimedia code.

I used to consider the split and the competition that followed a blessing, since the original FFmpeg project was pretty much dying due to the kind of environment it had. Having a competitor tempers the bad tendencies quite well.

On the other hand, I’m not at all happy with an unfair competition, with one side piggy-backing on the other as is happening now: everything in Libav gets merged into FFmpeg, which enjoys the fact that the code has been polished and cleaned up first. Libav has the above-mentioned rules, so any patch has to be discussed, and that usually leads to a fair amount of changes when something cherry-picked from FFmpeg hits the mailing list for review; a good deal of the time it is faster and simpler to just write a fix covering more issues, or to rewrite from scratch the feature some user deemed interesting to have.

I probably won’t stop contributing to Libav, since I very much like working with the other people involved, and the few people thanking me now and then make me think that giving up would just cause more harm than good in the big picture. I can understand those who decide to stop, though; sometimes the amount of nonsense thrown at you by rabid fans who know nothing is appalling.

Hopefully writing more about it will help defuse this situation.

PS: You might also want to read the often-ignored initial remark from Kostya.

Hardware acceleration in Libav

Multimedia formats require lots of computational power since they use fairly complex mathematical transformations. Most of them can usually be implemented efficiently in silicon, requiring orders of magnitude less power while remaining quite fast to execute.

Hardware acceleration

Most current platforms, be they desktop, mobile or server, sport some kind of hardware unit to offload the decoding and encoding of multimedia formats.

These units are usually accessed through platform-specific APIs; sometimes the API is even codec-specific, making the whole implementation experience quite painful and very time consuming.

Depending on the specific hwaccel implementation, it could be bound to the GPU and use GPU memory, thus requiring non-system memory to be managed in specific ways. That adds a burden for those who just want some quick gain, while opening up a world of interesting optimization possibilities such as zero-copy processing pipelines for transcoding, or in-GPU pixel-format conversion, scaling and blending.

There are some generic wrappers such as vdpau, vaapi, dxva2 and vt that abstract away some of the complexity and provide a more uniform interface, but usually a proper (and possibly near-transparent) fallback is needed for the situations in which the hardware cannot manage an advanced codec profile. Leveraging the generic abstractions therefore solves only part of the problem; still, as far as decoding goes, it provides a large performance boost while requiring some effort in managing the non-system memory.
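
As a rough illustration of how a libavcodec user opts into one of these wrappers: the decoder offers the available pixel formats through the get_format callback, and the application picks the hardware format it can drive, or a software one to fall back to. The snippet below is a minimal sketch using the VDPAU format as an example; a real application also has to provide a backend-specific hwaccel context, as sketched in the next section.

    #include <libavcodec/avcodec.h>

    /* Minimal sketch: pick the hardware pixel format if the decoder
     * offers it, otherwise take the first (software) entry, which is
     * the transparent fallback mentioned above. */
    static enum AVPixelFormat get_hw_format(AVCodecContext *avctx,
                                            const enum AVPixelFormat *fmt)
    {
        int i;
        for (i = 0; fmt[i] != AV_PIX_FMT_NONE; i++) {
            if (fmt[i] == AV_PIX_FMT_VDPAU)
                return fmt[i];          /* hardware path */
        }
        return fmt[0];                  /* software fallback */
    }

    /* wired up before avcodec_open2():  avctx->get_format = get_hw_format; */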

For most users the learning curve is too steep for it to be really useful.

Libav and hwaccel

Hardware acceleration support happened to be implemented around a number of specific implementations, with a quite non-uniform approach: vaapi had some hooks in the codecs, while vdpau had full decoders until it was ported to the same interface and made much nicer to use thanks to Rémi. Each backend required quite a lot of specific boilerplate code to set up the implementation-specific context and then manage the opaque buffers the decoder outputs.
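
To give an idea of that boilerplate, here is a hedged sketch of the old-style vaapi setup (setup_vaapi is a made-up helper name): the application creates the display, config and context through libva on its own (elided here) and hands them to the decoder via the opaque hwaccel_context field.

    #include <libavcodec/avcodec.h>
    #include <libavcodec/vaapi.h>
    #include <libavutil/mem.h>

    /* The vaapi hwaccel expects a struct vaapi_context describing
     * objects the application already created through libva itself. */
    static int setup_vaapi(AVCodecContext *avctx, void *va_display,
                           uint32_t va_config, uint32_t va_context)
    {
        struct vaapi_context *vactx = av_mallocz(sizeof(*vactx));
        if (!vactx)
            return AVERROR(ENOMEM);
        vactx->display    = va_display;    /* VADisplay   */
        vactx->config_id  = va_config;     /* VAConfigID  */
        vactx->context_id = va_context;    /* VAContextID */
        avctx->hwaccel_context = vactx;    /* opaque, backend-specific */
        return 0;
    }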

High level functionality

The hwaccel infrastructure is currently focused on the following items:
– the fallback from hardware to software should be as seamless as possible
– basic hardware decoders must be taken into account (e.g. for h264, some accept only single NAL units and can’t parse the bitstream on their own)
– the user must have a means to control the context setup and the full memory management

In order to do that, the normal software decoder is used to parse the bitstream and, depending on whether the hwaccel is enabled or not, the parsed data is routed to the software or the hardware decoder; the output frames are then managed by the decoder’s frame-reordering functionality, if present.

This way falling back, even from one specific hwaccel to another, is fairly simple, at least conceptually: every time new extradata appears it is parsed and fed to the first hwaccel’s setup code; if it is not supported, optional fallback hwaccels can try, and eventually the software decoder is picked.

The decoded frames, whether opaque hw-specific GPU memory or normal system memory, go through the same codepath, and the user has a means to set up the video rendering pipeline to take this into account.
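
On the user side that check can be as small as the sketch below; handle_frame is an illustrative name, and hw_pix_fmt is assumed to be whatever format the get_format callback selected earlier. How the opaque surface is then presented or mapped is entirely backend-specific.

    #include <libavcodec/avcodec.h>

    /* Hardware and software frames come out of the same decode call
     * and are told apart by their pixel format. */
    static void handle_frame(const AVFrame *frame,
                             enum AVPixelFormat hw_pix_fmt)
    {
        if (frame->format == hw_pix_fmt) {
            /* the data pointers hold an opaque, backend-specific
             * surface handle: pass it to the matching rendering API */
        } else {
            /* plain system-memory planes (e.g. yuv420p), directly usable */
        }
    }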

Limits of the infrastructure

All software evolves. Since the minimalistic approach to hardware acceleration requires a huge amount of boilerplate code and deep knowledge of the bitstream formats in use, the newer APIs for newer hardware tried to improve on this and be easier to manage for less savvy users.

APIs such as mfx try to abstract everything away: they just need to be fed the input bitstream, so the hardware input buffers can be filled, and they produce the frames, in presentation order, once the (parallel) decoding process yields them. They work in pull mode instead of push mode: when more data is needed it gets requested, and when a frame is ready the user gets notified.
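
That contract can be sketched roughly as below, just to contrast it with the push-style flow of the classic avcodec decode calls; every function and constant in this skeleton is hypothetical and does not correspond to the actual mfx entry points.

    /* Hypothetical pull-mode skeleton, NOT the real mfx API: the
     * decoder requests bitstream when its hardware buffers run dry
     * and hands back frames already in presentation order. */
    for (;;) {
        switch (hw_decode_poll(dec, &frame)) {      /* hypothetical */
        case HW_NEED_DATA:
            if (read_chunk(input, &chunk) < 0)      /* hypothetical */
                goto done;                          /* end of stream */
            hw_decode_feed(dec, chunk);             /* hypothetical */
            break;
        case HW_FRAME_READY:
            present(frame);     /* no reordering needed by the caller */
            break;
        }
    }
    done: ;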

For a user, most of the headaches related to frame management and elementary stream parsing are gone, or almost gone, since some formats have multiple elementary stream representations and only a single one is supported…

For me (and later Anton), an API like that poses multiple problems:
– having to feed a bitstream requires reconstructing it from the software parsing; this is not terrible, it had already been done for vda.
– the decoder wants to receive data only when its hardware buffers need more, which would require at least a queue.
– the output frames are already in presentation order, which requires bypassing the frame-reordering logic.

Fitting a new-style hardware acceleration API in Libav

I tried the following approaches:
– Consider libmfx a normal third-party decoder
  – Anton complained that it completely removes the ability to use hardware memory
– Implement additional hooks to keep the hwaccel interface but avoid the bitstream parsing and frame reordering
  – Anton and others complained that it prevents the transparent fallback

Then I let Anton try for himself, and in the end we agreed that the best option would be to provide an interim solution that lets the user manage the memory, and to try to complete hwaccel2 at a later time.

More on the topic later 🙂

Bridging Markdown to sphinx

One of my annoying itches is documentation.

I like sphinx a lot as a toolchain, but the underlying rst has a quite steep learning curve and is outright ugly to write in many common situations.

I like kramdown a lot as a syntax, but sadly it is ruby-only, and overall the Markdown implementations for python have a good number of shortcomings, including the quite annoying fact that the extensions do not get a full AST, making proper translations quite a pain (e.g. moin-2 markdown can’t use the extensions supported by the original since they get mangled badly during the node-matching process)…

Enter CommonMark

CommonMark is a cooperative effort to build an actual, proper specification of the ubiquitous markdown syntax. The implementations usually provide a full AST, and the python one (derived from the javascript one) is quite easy to understand, fast enough and easy to extend.

I know that somebody else had already tried to bridge docutils and markdown; sadly parsley is a tad slow for the purpose. I gutted away the original markdown parser and wired in commonmark-py; the result is a decently fast implementation that maps most of the core syntax to the docutils AST, and thus makes it possible to write in markdown and get it converted using the docutils output formats.

What’s left

ReStructuredText is much richer than the CommonMark core; at the very least I should complete my work to support attributes, so that the manpage output works mostly as intended.

The directive system is quite different from the one currently being discussed, and that will cause a good deal of headaches in mapping the sphinx extensions used to document function parameters.

As usual, help is welcome!