libav

Security fun – what’s security?

Since I eventually had access to a batch of broken samples from Google, I spent the past months volunteering time to fix in Libav the issues uncovered (the whole set is over 3000 samples), you probably noticed by the number of releases.

You can consider “security” issues pretty much any kind of bug:

A segfault is a security issue.

A read/write from not allocated memory is a security issue.

An assert triggered IS a security issue and not a way to fix them.

A memory leak is a security issue and in most cases the worst kind.

Your security concern is not the same as mine!

Libav has a large surface to attack since you have decoders for every kind of multimedia format, it is a library used in many different situations, what’s a security concern for somebody is a nuisance for somebody else.

If VLC breaks on you when you are trying to decode some incomplete movie you got from bittorrent because one 0 or 1 got misinterpreted is not such an issue. If your transcoding pipeline gets stalled due the same movie being uploaded on Youtube, somebody might be screaming at the idiot that forgot to bound-check that array deep into the code.

If some buffer overflow could lead to code execution, most of the people using avconv to mass transcode won’t care that much, the process is fully sandboxed and they expect it, the people making players are mostly afraid of some buffer overflow being exploitable, their users would feel the pain.

So for us, Libav developers, there isn’t a bug more important or least important. We have to fix all of them and possibly fix them correctly once (so if you move from a buffer overflow to an assert, you just trade a possible code execution to a deny of service). That takes time and resources.

The source of all pain

Most of the bugs are naive assumptions and overlooks piling up over the years, the most common are the following

Off by one: You loop over something and you read one element too many
Corner cases: What happens when your frame has dimension 0? What if it is as large as the maximum representable value?
Faulty assumption: If you think that a malloc cannot fail, think again, if you think realloc won’t ever return NULL so you
can forget about the old pointer and just overwrite it, please DO think again. It can happen, even on Linux
Sloppy coding practices: Some bad practices tend to stick and bad patterns such as not forwarding return values will lead to problems later, usually making the process of tracking back to the root issue HARD.

Even if you are writing something non critical such a fire and forget commandline app you should be a little careful, if you plan to write something more involving such a library that could be used in MANY ways by LOTS of people, you MUST be careful.

Tools of the trade

Tracking bugs is usually annoying and time consuming, if they are crash they are at least apparent, memory leaks and faulty read/write may not trigger an apparent crash, making the whole thing more daunting. Luckily there are good tools help you.

Valgrind

The whole toolset is really valuable, massif and memcheck are the best to figure out where the memory went and who’s the fault.

AddressSanitizer

Asan is a boon since it is much faster than memcheck but also a pain since you have to instrument your code by using a certain compiler (clang or gcc-4.8 and later) and certain flags (-fsanitize=address). You can leverage it in gdb so you can inspect memory while debugging. That had been an huge timesaver most of the time. You can in theory do that also on memcheck adding some lines of code, probably I’ll provide snippets later.

drmemory

If your problem is on non-linux and non-mac you cannot use Asan and Valgrind, the new and coming tool to save you is drmemory. It is the youngest of the set and you can see how green it is by the lack of best practices… So no source releases, naive build system and bad version control system. If you try to build it is better to use the latest svn and hope.

Yet if you have to figure out what’s wrong on windows it is a huge boon already. People with time and will could try to help them on fixing their build system and convince them to move to git.

Automation

Never, ever, ever start hunting this kind of bugs w/out automating the most. Currently I have written a consistent number of lines of bash to automatically triage and check the samples, get the code to build in at least 2-3 flavours (clang and gcc with asan, vanilla gcc for valgrind) and eventually generate additional fate targets so I can run make fate-sec -C .gcc-asan and see if something that was fixed broke when we hadn’t look.

In closing

I still have 200 samples to fix and hopefully I’ll rally more people in helping, if you aren’t running routine tests and make sure your projects are at least valgrind clean (the easiest check to do), you should.

If you are writing code that is a little more critical, better if you use all the tools I briefly described and fix what you overlooked.

The case of defaults (Libav vs FFmpeg)

I tried not to get into this discussion, mostly because it will degenerate to a mud sliding contest.

Alexis did not take well the fact that Tomáš changed the default provider for libavcodec and related libraries.

Before we start, one point:

I am as biased as Alexis, as we are both involved on the projects themselves. The same goes for Diego, but does not apply to Tomáš, he is just a downstream by transition (libreoffice uses gstreamer that uses *only* Libav).

Now the question at hand: which should be the default? FFmpeg or Libav?

How to decide?

– Libav has a strict review policy every patch goes through a review and has to be polished enough before landing the tree.

– FFmpeg merges daily what had been done in Libav and has a more lax approach on what goes in the tree and how.

– Libav has fate running on most architectures, many of those are running Gentoo, usually real hardware.

– FFmpeg has an old fate with less architectures, many qemu emulations.

– Libav defines the API

– FFmpeg follows adding bits here and there to “diversify”

– Libav has a major release per season, minor releases when needed

– FFmpeg releases a lot touting a lot of *Security*Fixes* (usually old code from the ancient times eventually fixed)

– Libav does care about crashes and fixes them, but does not claim every crash is a Security issue.

– FFmpeg goes by leaps to add MORE features, no matter what (including picking wip branches from my personal github and merging them before they are ready…)

– Libav is more careful, thus having less fringe features and focusing more polishing before landing new stuff.

So if you are a downstream you can pick what you want, but if you want something working everywhere you should target Libav.

If you are missing a feature from Libav that is in FFmpeg, feel free to point me to it and I’ll try my best to get it to you.

Probably you already know that my side of FFmpeg got forced to rename itself to Libav. Some people is still wondering why we did that, you might read some short and longer summaries, have a look at our git or our mailing lists to see how we are faring and where we are heading.

So far I’m quite happy with what we are achieving little by little and day by day: a shared and quite defined plan for the future of the library, releases being a first class citizen, long standing issues being tackled and solved.

We were sorely lacking in the communication side and now we are trying to improve there as well. (This blog post and the website work is just part of it)

In the Gentoo land Scarabeus helped me adding libav, now it is pending some migration work to have all the software working with both libav and ffmpeg.

I hope you’ll be pleased by the outcome (people longing for the multithread work being fully merged I think are).

Theora – the benchmark

There are many discussions about how Theora should be used and about how it smokes x264 somehow.

I do not believe it or at least I don’t believe the proofs till I try myself.

Any of the Theora zealots reading could please provide a reproducible benchmark so everybody could see for themselves how good/bad Theora is?

A script that fetches the new theora encoder, ffmpeg, takes an original, produces two videos using theora and h264 (no audio), same bitrate for both and in the end outputs cpu and memory usage would be great.

LinuxTag – day after

I’m eventually back home, I’m dead tired, the c-base party was great in many ways (people, food, place) and ending the night (actually starting the morning) playing Go with beer and music was _quite_ fun (thanks again for the games =))

I’ll try to wrap up everything in a short post before falling asleep completely: the LinuxTag had been a wonderful experience I had been more there as FFmpeg developer and less as Gentoo developer (mostly because I had to man the FFmpeg booth mostly since we aren’t that many and that I failed to chat in a proficuous with the gentoo people even if we spent the evening in the same place most of the time =| In the end I had a refreshing conversation about Gentoo with rbu luckily and I managed to chat a bit more with fauli just before he was leaving…)

Was quite fun going at the end of the event to the fsfe stand to do explain the FFmpeg stance about patents, Theora (more will follow) and why, in our humble opinion at least, isn’t correct to propose^Wactually shove down to the web users throat such codec just because of some claims that are yet to be validated…

The discussion was quite pleasant mostly because to my surprise fsfe people there weren’t zealots, so the whole discussion discussion even evolved to touch more interesting topics, like reverse engineering, making sure our license is respected and actually multimedia, with a brief discussion on containers, codec and streaming (that part actually started from an explanation why Theora isn’t that perfect fruit of opensource that is claimed and why Ogg has many
shortcomings as container and why in multimedia you do not have one-size-fit-all solutions… )

LinuxTag – day 1

After about 4 years I eventually managed to get there! Today is the first day and I’m actually sort of manning the FFmpeg Booth and from time to time I could happen to be in the Gentoo one as well.

In the FFmpeg stand we are showing BBB high res in a big LCD screen from a small beagleboard. The operating system image is obviously Gentoo as well the other system present showing some jumpy Japanese idol video (not my idea).

See you! (Pictures will come later)

Alive again

During August I was supposed to prepare something new for the LScube project.
I was supposed to do a bit more for the Summer of Code (Sorry Keiji and Seraphim) and the FFmpeg project (libswscale isn’t yet in shape I know).

Sadly I spent about a month with a bad pneumonia and just now I’m starting to catch up really. (I hoped to stay better before but, as my 2 miss on council meetings shown, wasn’t ok yet)

Diego is doing a wonderful job rationalizing LScube suite, Ale almost finished his Flux (started as a way to show me that C++ isn’t the problem, just people misusing it), Michael decided to make libswscale the default in FFmpeg forcing more people taking care about it.

What now?

Well first I’ll try to not miss new council meetings (and I’ll ask Chrissy and Diego to back me up when happens I cannot be there).

I have some pending stuff about libswscale that will end up having a better structure to provide arch specific optimizations, once they hit I’ll try to do again a witch hunt over ffmpeg using applications to provide the new libavcodec in portage.

Hopefully the next feng (and friends) release will hit portage and the stale fenice and libnemesi will be replaced accordingly ^^;

There is plenty of new stuff on the CELL land to be tried and provided (like mars)

I’ll try to take some time to merge the openmoko patches to the new (and gcc-4 building) qemu so I won’t have to provide an qemu-openmoko to help people interested in getting Gentoo on the freerunner (openrc could help a bit the boot time possibly and bitbake is a bit annoying from time to time)

Last but not least Syoyo is getting his blazing fast renderer Lucille to be usable even by common people.

Lots is happening (with and without me) =)

cmake and openwengo, what a match…

Quick and easy task for this day: check which is the status of openwengo since seems that’s the only client providing what skype does, at least in theory…

Part one, get the source – delivered from the site as a nifty .zip

– wget + unzip worked as should

Part two, build the beast.

– cmake! The worst build system since imake, sadly brought into the fashionable stuff because of KDE4,
Now we get something interesting, on a phenom you get:

qlop -tH cmake
cmake: 5 minutes, 12 seconds for 1 merges

A comparison:
perl: 3 minutes, 43 seconds for 1 merges
m4: 29 seconds for 1 merges
automake: 14 seconds for 3 merges
autoconf: 16 seconds for 2 merges
libtool: 44 seconds for 1 merges
make: 30 seconds for 1 merges
cmake: 5 minutes, 12 seconds for 1 merges

Using something less new makes the whole thing even more interesting. Let’s say that to build something using cmake I need the 3/2 the time to build perl, just to start.

gentoo has an ebuild for it so it it’s just getting it, (why cmake needs xmlrpc-c is a question I’ll left unanswered…)

Now back to openwengo.

I have to create a build dir and run cmake from there, it obviously finds something wrong with ffmpeg even if supposedly it has its stale internal version (remind: ffmpeg changed the include paths some time ago) as fallback (no, you have to run cmake a first time, have it fail, run ccmake, find the right option and turn it on, start again).

I’m not sure if this brain damage is due wengo people or cmake, still I find this annoying. The *oh* so despised configure from autotools can be smarter and does not require that many lines if you know what you are doing…

It still doesn’t take in account that there are different endianess for linux (autotools have a nice builtin for this…)…

Once the thing built (took a while), I just got more or less the same thing I got last time I tried it, looks like the 2.2 isn’t quite up to date and they are working on the new mayor release =/

Back looking for other alternatives for skype =_=

What news?

Lately I started to test some stuff, after hammering openrc inside bsd jails I started to use it on my laptop, the ps3 and the new phenom I bought recently. So far I hadn’t anything to complain but just minor glitches that got promptly addressed by Roy. With the help of the other testers and developer you’ll get baselayout 2 soon and in a pretty good shape.

Feng and felix got quite an overhaul and hopefully I’ll release them once I cleaned up the code:
– felix had lots of experimental stuff added and you may not want it nor look at it right now.
– feng is getting the eventloop reshaped so that you can understand what’s doing w/out losing your sanity.

I’m still waiting for more SoC applications (in particular people willing to learn CELL seems fewer than we’d like) you have this week left to apply!

Summer of Code!

I’m again mentor, again for the same organizations (Gentoo and FFmpeg), with quite yet another batch of strange ideas and demanding tasks for you.

FFmpeg
With Benjamin Larsson and possibly Kristian Jerpetj