{"id":266,"date":"2014-04-04T19:36:51","date_gmt":"2014-04-04T19:36:51","guid":{"rendered":"http:\/\/blogs.gentoo.org\/lu_zero\/?p=266"},"modified":"2014-04-05T10:50:35","modified_gmt":"2014-04-05T10:50:35","slug":"the-road-to-mvc","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/lu_zero\/2014\/04\/04\/the-road-to-mvc\/","title":{"rendered":"The road to MVC"},"content":{"rendered":"<p>In the past month or so I started helping <a href=\"http:\/\/projectsymphony.blogspot.it\">Vittorio<\/a> add one of the important missing features to our h264 decoder: Multi View support.<\/p>\n<h2>MVC<\/h2>\n<p>The basic idea of this feature is quite simple: you are shooting a movie from multiple angles, <b>something<\/b> is bound to be common across them and you&#8217;d like to ensure frame precision.<\/p>\n<p>So what about encoding all the simultaneously captured frames in the same elementary stream, sharing as much as you can across the different layers and then letting the decoder output the frames somehow?<\/p>\n<p>Since we know that all the containers <a href=\"http:\/\/codecs.multimedia.cx\/?p=676\">have problems<\/a>, it might not be a completely bogus idea to have the codec take care of it. 
Even better if the resulting aggregated bitstream is more compact than the sum of the single ones.<\/p>\n<h3>High level structure<\/h3>\n<blockquote><p>\nWhat&#8217;s different in <b>h264-mvc<\/b> compared to normal h264?<\/p>\n<footer><cite>Random bystander<\/cite><\/footer>\n<\/blockquote>\n<p>Not a lot. In fact the main layer is exactly the same, and a normal decoder can just skip over the additional bits (3 NALs more or less) and decode as usual.<\/p>\n<p>Basically there is a NAL unit to signal which layer we are currently working on, a NAL to store the per-layer SPS and a NAL to keep the actual frame data.<\/p>\n<p>Besides that, everything is <b>exactly<\/b> the same.<\/p>\n<h2>Implementation<\/h2>\n<blockquote><p>\nSo why isn&#8217;t it already available? You made it look easy!<\/p>\n<footer><cite>Random jb<\/cite><\/footer>\n<\/blockquote>\n<p>Sadly, it would be easy if the decoder we have weren&#8217;t _that_ convoluted, with many components entangled in a monolithic entity and code that grew over the years to adapt to different needs.<\/p>\n<h3>Architectural pain points<\/h3>\n<p>Per-slice multithreaded decoding made the code quite hard to follow, since you then have a master context <b>h<\/b>, which in certain functions is actually <b>h0<\/b>, and a slice-specific copy <b>hx<\/b> that sometimes becomes <b>h<\/b>, and so on.<\/p>\n<p>Per-frame multithreaded decoding luckily doesn&#8217;t get in the way too much for now.<\/p>\n<p>Having to touch a large file of about 4k lines of code isn&#8217;t _so_ nice in itself: split the view as you like for editing, you still end up waiting for a single core of your CPU to do the work.<\/p>\n<h3>Community constraints<\/h3>\n<p><b>h264-mvc<\/b> is a fringe feature for many, and if you care about speed you do not want all the cruft around slowing things down. 
What is a feature for you is just cruft for many.<\/p>\n<ul>\n<li>MVC support must be completely optional or not slow down the normal decoding at all.<\/li>\n<li>MVC support must not make the code harder to follow than it is now, so hacking your way through is not an option.<\/li>\n<li><del>MVC should give me a pony, purple<\/del><\/li>\n<\/ul>\n<h2>The plan<\/h2>\n<blockquote><p>\nFirst take the low-hanging fruit while you think about the best route to achieve your goal.<\/p>\n<footer><cite>Random wise person<\/cite><\/footer>\n<\/blockquote>\n<h3>Refactor<\/h3>\n<p>The first step is always <b>refactor and cleanup<\/b>. Just as you, hopefully, do not cook in a dirty kitchen, people shouldn&#8217;t write code on top of crufty code.<\/p>\n<h3>Split the monster<\/h3>\n<p>In Libav everything compiles quite fast except for <b>vc1<\/b> (vc1dec.c is 6k loc) and <b>h264<\/b> (h264.c was around 6k loc).<br \/>\nNew codecs such as <b>vp9<\/b> or <b>hevc<\/b> landed already split into smaller chunks.<\/p>\n<p>Shuffling the code should be simple enough, so we split h264.c into h264_slice.c, h264_mb.c and such. That gives shorter (re)build times and makes it easier to focus.<\/p>\n<h3>Untangle it<\/h3>\n<p>Vittorio worked to remove the dependency on the mpeg12 context in order to make the code easier to follow; it had been one of the pending issues for years. Now h264 doesn&#8217;t require mpeg12 in order to build, which will probably make our friends working on Chrome happier, along with everybody else who needs _just_ a few selected features in their build.<\/p>\n<h3>Pave the road<\/h3>\n<p>Once you have divided the problem into smaller subproblems (parsing the new NALs, storing the information in an appropriate data structure, doing the actual decoding and storing the results somewhere accessible) you can start adapting the code to fit. 
That means reordering some code, splitting functions that would be shared and maybe slaying some bugs hidden in the code weeds while at it.<\/p>\n<h2>So far<\/h2>\n<blockquote><p>\nWe are halfway!<\/p>\n<footer><cite>Random optimist<\/cite><\/footer>\n<\/blockquote>\n<h3>Done<\/h3>\n<p>We got the frame splitting and NAL parsing pretty much in working shape; it has not been sent for review only because it is not useful by itself.<\/p>\n<h3>Doing<\/h3>\n<p>The frame data decoding is pending some patches from me that try to simplify the slice header parsing so that enough of it can be shared without adding more branches. I hacked it once and I know the approach works.<\/p>\n<p>The code to store multiple views in a single frame has a whole <a href=\"https:\/\/wiki.libav.org\/Blueprint\/MultiAVFrame\">blueprint<\/a> being evaluated.<\/p>\n<h3>To Do<\/h3>\n<p>Test the actual decoding and make sure the frame reference code behaves as expected; this will probably be the most annoying and time-consuming task if we are unlucky. That code bites.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the past month or so I started helping Vittorio add one of the important missing features to our h264 decoder: Multi View support. 
MVC The basic idea of this feature is quite simple, you are shooting a movie with multiple angles, something is bound to be sort of common and you&#8217;d like to &hellip; <a href=\"https:\/\/blogs.gentoo.org\/lu_zero\/2014\/04\/04\/the-road-to-mvc\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">The road to MVC<\/span><\/a><\/p>\n","protected":false},"author":10,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[14,6],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1aGWH-4i","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/266"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/comments?post=266"}],"version-history":[{"count":6,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/266\/revisions"}],"predecessor-version":[{"id":272,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/266\/revisions\/272"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/media?parent=266"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/categories?post=266"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/tags?post=266"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}