Security and Tools

Everybody should remember that a 100% secure device is one that is unplugged, put in a safe and covered in concrete. There is always a trade-off in how much impairment we inflict on ourselves in order to stay safe.

Antonio Lioy

In the wake of the Heartbleed bug, I'd like to return once more to the tools we have to track down problems and how they could improve.

The tools of the trade

Memory checkers

I have written about memory checkers in many places: they are usually a boon and they catch a good deal of issues once coupled with good samples. I managed to fix a good number of issues in hevc just by using gcc-asan and running the normal tests, and for vp9 it took little time to spot a couple of issues as well (the memory checkers aren't perfect, so they didn't spot the faulty memcpy I had introduced to simplify a loop).

If you maintain some software, please do use valgrind, asan (now also available in gcc) and, if you are on Windows, drmemory. They help you catch bugs early. Just beware that certain versions of clang-asan sometimes miscompile. Never blindly trust the tools.
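
To give a toy illustration of the kind of bug they flag (my own example, not something from the codebases above), this one-byte heap overflow is reported immediately when built with gcc -fsanitize=address -g, or when run under valgrind:

    /* toy example: a heap buffer overflow that asan or valgrind report at once */
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *buf = malloc(15);
        /* one byte too many: the string plus its terminator is 16 bytes,
         * so the copy writes past the end of the allocation */
        memcpy(buf, "0123456789abcde", 16);
        free(buf);
        return 0;
    }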

Static analyzers

Static analyzers are a mixed bag: sometimes they spot glaring mistakes, sometimes they just point at impossible conditions.
Please do not add asserts just to make them happy: if the analyzer is right, you have just traded a faulty memory access for a denial of service.
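
A hypothetical sketch of what I mean (struct frame and the function names are made up for the example): silencing the analyzer with an assert just turns a possible bad access into a guaranteed abort, while actually handling the condition fixes both problems.

    #include <assert.h>
    #include <stdlib.h>

    struct frame { int width, height; };

    /* Bad: if the analyzer is right and f can really be NULL, the assert
     * merely converts an invalid access into an abort, i.e. a denial of service. */
    int frame_area_asserted(struct frame *f)
    {
        assert(f);
        return f->width * f->height;
    }

    /* Better: handle the condition and report the error to the caller. */
    int frame_area_checked(struct frame *f, int *area)
    {
        if (!f)
            return -1;
        *area = f->width * f->height;
        return 0;
    }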

Other checkers

There are plenty of other good tools from the *san family one can use; ubsan is probably the newest one available in gcc and it does help. Valgrind has plenty as well, and the upcoming drmemory has a good deal of interesting perks; if only upstream hadn't been so particular with its release process and build system, you would have had it in Gentoo since last year…
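
For instance (again a toy example of mine, not from any real codebase), building the following with gcc -fsanitize=undefined makes the run report the signed overflow and the bad shift instead of silently producing garbage:

    /* toy undefined-behaviour cases that ubsan reports at runtime */
    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        int x = INT_MAX;
        x = x + 1;          /* signed integer overflow: undefined behaviour */

        int s = 1;
        s = s << 31;        /* shifting into the sign bit of a signed int */

        printf("%d %d\n", x, s);
        return 0;
    }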

Regression tests

I guess everybody is getting sick of me talking about fuzz testing, or about why I spent weeks building a fast regression test archive called playground for Libav, and I'm sure everybody in Gentoo misses the tinderbox runs Diego used to do.
Having a good, comprehensive batch of checks to make sure new code and new fixes do not have the unwanted side effect of breaking stuff is nice; coupled with git bisect, it makes backporting fixes to release branches much easier.

Debuggers

We have gdb, which works quite well, and we have lldb, which should improve a lot, plus many extensions built on top of them. When they fail we can always rely on printf. Or not.

What’s missing

Speed

If security is just an acceptable impairment of performance in exchange for not crashing, then using the tools mentioned above is an acceptable slowdown of the development process in exchange for not spending much more time later tracking down those issues.

The teams behind valgrind and the *san tools are doing their best to keep instrumented execution only three to four times as slow as normal.

The static analyzers are usually just 5 times as slow as a normal compiler run.

A serial regression test run can take ages, and a parallel one can leave your system unable to do anything else.

Any speedup there is a boon. Bigger hardware and automation mitigate the problem.

Precision

While gdb is already good at getting information out of gcc-compiled binaries, clang-compiled binaries are apparently a bit harder. Using lldb right now is a subtle form of masochism for many reasons; it getting confused is just the icing on a cake of annoyances.

Integration

So far it is a fair fight between valgrind and *san over which integrates better with the debuggers. I started using asan mostly because it makes introspecting memory as simple as calling a function from gdb. Valgrind has a richer interface, but it is a pain to use.
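
For reference, the kind of helper I mean lives in asan's public interface header; if I recall correctly you can call the same function either from your own code or, once the process is running under asan, straight from a gdb prompt with call. A minimal sketch, assuming a build with -fsanitize=address:

    /* sketch: asking asan what it knows about an address */
    #include <sanitizer/asan_interface.h>
    #include <stdlib.h>

    int main(void)
    {
        char *buf = malloc(32);

        /* prints where the address lives: heap region, allocation stack, ... */
        __asan_describe_address(buf + 8);

        free(buf);

        /* after the free, the same query describes it as freed heap memory */
        __asan_describe_address(buf + 8);
        return 0;
    }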

Reporting

Some tools are better than others at pointing out the issues. Clang is so far the best, with gcc-4.9 coming closer. Most static analyzers are trying their best to deliver both the big picture and the details. gdb is so far incredibly better than lldb, but there are already some details in lldb's output that gdb should copy.

Thanks

I’m closing this post by thanking everybody involved in creating those useful, yet perfectible, tools, all the people actually using them and reporting bugs back, and everybody actually fixing the reported bugs so I don’t have to do it all myself alone =)

Everything is broken, but we are fixing most of it together.

The road to MVC

In the past month or so I started helping Vittorio add one of the important missing features to our h264 decoder: Multi View support.

MVC

The basic idea of the feature is quite simple: you are shooting a movie from multiple angles, something is bound to be common across them, and you’d like to ensure frame precision.

So what about encoding all the simultaneously captured frames in the same elementary stream, sharing as much as you can across the different layers, and then letting the decoder output the frames somehow?

Since we know that all containers have problems, it might not be a completely bogus idea to have the codec take care of it. Even better if the resulting aggregated bitstream is more compact than the sum of the single ones.

High level structure

What’s different in h264-mvc than the normal h264?

Random bystander

Not a lot; in fact the main layer is exactly the same, and a normal decoder can just skip over the additional bits (3 NALs, more or less) and decode as usual.

Basically there is a NAL unit to signal which layer we are currently working on, a NAL to store the per-layer SPS and a NAL to carry the actual frame data.
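
To put numbers on it, these are the three MVC-related NAL unit types as I read them in Annex H of the H.264 spec (the enum names here are mine, only the values come from the spec, so double-check before relying on them):

    /* MVC additions to the H.264 NAL unit types (values per Annex H) */
    enum MVCNalUnitType {
        NAL_PREFIX          = 14, /* prefix NAL: layer/view info for the base-layer slice that follows */
        NAL_SUBSET_SPS      = 15, /* subset SPS: the per-layer sequence parameter set */
        NAL_SLICE_EXTENSION = 20, /* coded slice extension: the actual frame data of the extra views */
    };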

Besides that, everything is exactly the same.

Implementation

So why isn’t it already available, you made it look easy?!

Random jb

Sadly, it would be easy if the decoder we have weren’t _that_ convoluted: many components are entangled in a monolithic entity, with code that grew over the years to adapt to different needs.

Architectural pain points

Per-slice multithreaded decoding made the code quite hard to follow, since you then have a master context h, which in certain functions is actually h0, and a slice-specific copy hx that sometimes becomes h, and so on.
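
A schematic caricature of the pattern, not the actual Libav code (the field and function names are invented), just to show why the naming gets confusing:

    /* caricature of the per-slice threading pattern: the same name means
     * different things depending on which function you are reading */
    typedef struct H264Context {
        struct H264Context *thread_context[8]; /* per-slice copies */
        int slice_count;
        /* ... hundreds of other fields ... */
    } H264Context;

    static void decode_slice(H264Context *h)
    {
        /* here "h" is really a per-slice copy, the "hx" of the text */
    }

    static void decode_picture(H264Context *h)
    {
        /* here "h" is the master context, the "h0" of the text */
        for (int i = 0; i < h->slice_count; i++) {
            H264Context *hx = h->thread_context[i]; /* slice copy derived from h */
            decode_slice(hx);                       /* inside, hx is called "h" again */
        }
    }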

Per-frame multithreaded decoding luckily doesn’t get in the way too much for now.

Having to touch a large file of about 4k lines of code isn’t _so_ nice in itself: you can split the view as you like while editing, but you still end up waiting for a single core of your CPU to do all the work.

Community constraints

h264-mvc is a fringe feature for many, and if you care about speed you do not want cruft around slowing things down. What is a feature for you is just cruft for many others.

  • MVC support must be completely optional, or at least not slow down normal decoding at all.
  • MVC support must not make the code harder to follow than it is now, so hacking your way is not an option.
  • MVC should give me a pony, purple

The plan

First take the low-hanging fruit while you think about the best route to achieve your goal.

Random wise person

Refactor

The first step is always to refactor and clean up. Just as you, hopefully, do not cook in a dirty kitchen, people shouldn’t
write code on top of crufty code.

Split the monster

In Libav everything compiles quite fast, except for vc1 (vc1dec.c is 6k lines of code) and h264 (h264.c was around 6k lines of code).
New codecs such as vp9 or hevc landed already split into smaller chunks.

Shuffling the code around is simple enough, so we split h264.c into h264_slice.c, h264_mb.c and such. That helps keep (re)build times shorter and makes it easier to focus.

Untangle it

Vittorio worked on removing the dependency on the mpeg12 context in order to make the code easier to follow; it had been one of the pending issues for years. Now h264 doesn’t require mpeg12 in order to build, which will probably make our friends working on Chrome happier, along with everybody else who needs _just_ a few selected features in their build.

Pave the road

Once you have divided the problem into smaller sub-problems (parse the new NALs, store the information in an appropriate data structure, do the actual decoding and store the results somewhere accessible), you can start adapting the code to fit. That means reordering some code, splitting functions that will be shared, and maybe slaying some bugs hidden in the weeds while at it.

So far

We are halfway!

Random optimist

Done

We have the frame splitting and NAL parsing pretty much in working shape; they haven’t been sent for review only because, by themselves, they are not
useful.

Doing

The frame data decoding is pending some patches from me that try to simplify the slice header parsing so that enough of it can be shared without adding more branches. I hacked it together once and I know the approach works.

The code to store multiple views in a single frame has a whole blueprint under evaluation.

To Do

Test the actual decoding and hopefully make sure the frame reference code behaves as expected; this will probably be the most annoying and time-consuming task if we are unlucky. That code bites.