{"id":587,"date":"2016-04-01T18:53:34","date_gmt":"2016-04-01T18:53:34","guid":{"rendered":"http:\/\/blogs.gentoo.org\/lu_zero\/?p=587"},"modified":"2016-04-01T19:29:23","modified_gmt":"2016-04-01T19:29:23","slug":"avscale-part1","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/lu_zero\/2016\/04\/01\/avscale-part1\/","title":{"rendered":"AVScale &#8211; part1"},"content":{"rendered":"<p><a href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__libsws.html\">swscale<\/a> is one of the most annoying parts of Libav. A couple of years after the initial <a href=\"https:\/\/wiki.libav.org\/Blueprint\/AVScale\">blueprint<\/a>, we have something almost functional you can play with.<\/p>\n<h2>Colorspace conversion and Scaling<\/h2>\n<p>Before delving into the library architecture and the outer API, it is probably worth giving a quick summary of what this library is about.<\/p>\n<p>Most multimedia concepts are more or less intuitive:<br \/>\n&#8211; <strong>encoding<\/strong> is taking some data (e.g. 
video frames, audio samples) and <strong>compressing<\/strong> it by leaving out unimportant details<br \/>\n&#8211; <strong>muxing<\/strong> is the act of <strong>storing<\/strong> such compressed data and timestamps so that audio and video can play back in sync<br \/>\n&#8211; <strong>demuxing<\/strong> is getting back the compressed data with the timing information stored in the container format<br \/>\n&#8211; <strong>decoding<\/strong> somehow inflates the data so that video frames can be rendered on screen and the audio played on the speakers<\/p>\n<p>After the decoding step it would seem that all the hard work is done, but since there isn&#8217;t a single way to store video <strong>pixels<\/strong> or audio <strong>samples<\/strong>, you need to process them so they work with your output devices.<\/p>\n<p>That process is usually called <strong>resampling<\/strong> for audio; for video we have <strong>colorspace conversion<\/strong> to change the pixel information and <strong>scaling<\/strong> to change the number of pixels in the image.<\/p>\n<p>Today I&#8217;ll introduce you to the new library for colorspace conversion and scaling we are working on.<\/p>\n<h2>AVScale<\/h2>\n<p>The library aims to be as simple as possible and hide all the gory details from the user: you won&#8217;t need to figure out the heads and tails of functions with a quite <a href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__libsws.html#gaf360d1a9e0e60f906f74d7d44f9abfdd\">large<\/a> <a href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__libsws.html#gadffa09f208a3eba7fa3a6b1f74ab77f7\">number<\/a> of <a href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__libsws.html#gae531c9754c9205d90ad6800015046d74\">arguments<\/a> nor <a href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__libsws.html#ga2a140989dfed29dd91065352b6a52840\">special-purpose<\/a> functions.<\/p>\n<p>The API itself is modelled after <a 
href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__lavr.html\">avresample<\/a> and approaches the problem of conversion and scaling in a way quite different from <a href=\"https:\/\/libav.org\/documentation\/doxygen\/master\/group__libsws.html\">swscale<\/a>, following the same design as <a href=\"http:\/\/codecs.multimedia.cx\/?p=1057\">NAScale<\/a>.<\/p>\n<h3>Everything is a Kernel<\/h3>\n<p>One of the key concepts of <strong>AVScale<\/strong> is that the conversion chain is assembled out of different components, separating the concerns.<\/p>\n<p>Those components are called kernels.<\/p>\n<p>The kernels can be conceptually divided into two kinds:<br \/>\n&#8211; <strong>Conversion<\/strong> kernels, taking an input in a certain format and providing an output in another (e.g. <strong>rgb2yuv<\/strong>) without changing any other property.<br \/>\n&#8211; <strong>Process<\/strong> kernels, modifying the data while keeping the format itself unchanged (e.g. <strong>scale<\/strong>)<\/p>\n<p>This pipeline approach provides great flexibility and helps code reuse.<\/p>\n<p>The most common use-cases (such as scaling without conversion or conversion without scaling) can be faster than solutions trying to merge scaling and conversion into a single step.<\/p>\n<h3>API<\/h3>\n<p>AVScale works with two kinds of structures:<br \/>\n&#8211; <strong>AVPixelFormaton<\/strong>: A full description of the pixel format<br \/>\n&#8211; <strong>AVFrame<\/strong>: The frame data, its dimensions and a reference to its format details (aka <strong>AVPixelFormaton<\/strong>)<\/p>\n<p>The library will have an <strong>AVOption<\/strong>-based system to tune specific options (e.g. 
selecting the scaling algorithm).<\/p>\n<p>For now only <code>avscale_config<\/code> and <code>avscale_convert_frame<\/code> are implemented.<\/p>\n<p>So if the input and output are pre-determined, the context can be configured like this:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"n\">AVScaleContext<\/span> <span class=\"o\">*<\/span><span class=\"n\">ctx<\/span> <span class=\"o\">=<\/span> <span class=\"n\">avscale_alloc_context<\/span><span class=\"p\">();<\/span>\n\n<span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"o\">!<\/span><span class=\"n\">ctx<\/span><span class=\"p\">)<\/span>\n    <span class=\"p\">...<\/span>\n\n<span class=\"n\">ret<\/span> <span class=\"o\">=<\/span> <span class=\"n\">avscale_config<\/span><span class=\"p\">(<\/span><span class=\"n\">ctx<\/span><span class=\"p\">,<\/span> <span class=\"n\">out<\/span><span class=\"p\">,<\/span> <span class=\"n\">in<\/span><span class=\"p\">);<\/span>\n<span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">ret<\/span> <span class=\"o\">&lt;<\/span> <span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n    <span class=\"p\">...<\/span>\n<\/pre>\n<\/div>\n<p>But you can skip it and scale and\/or convert from an input to an output like this:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"n\">AVScaleContext<\/span> <span class=\"o\">*<\/span><span class=\"n\">ctx<\/span> <span class=\"o\">=<\/span> <span class=\"n\">avscale_alloc_context<\/span><span class=\"p\">();<\/span>\n\n<span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"o\">!<\/span><span class=\"n\">ctx<\/span><span class=\"p\">)<\/span>\n    <span class=\"p\">...<\/span>\n\n<span class=\"n\">ret<\/span> <span class=\"o\">=<\/span> <span class=\"n\">avscale_convert_frame<\/span><span class=\"p\">(<\/span><span class=\"n\">ctx<\/span><span class=\"p\">,<\/span> <span class=\"n\">out<\/span><span class=\"p\">,<\/span> <span class=\"n\">in<\/span><span 
class=\"p\">);<\/span>\n<span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">ret<\/span> <span class=\"o\">&lt;<\/span> <span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n    <span class=\"p\">...<\/span>\n\n<span class=\"n\">avscale_free<\/span><span class=\"p\">(<\/span><span class=\"o\">&amp;<\/span><span class=\"n\">ctx<\/span><span class=\"p\">);<\/span>\n<\/pre>\n<\/div>\n<p>The context gets lazily configured on the first call.<\/p>\n<p>Notice that <code>avscale_free()<\/code> takes a pointer to a pointer, to make sure the context pointer does not stay dangling.<\/p>\n<p>As said, the API is <em>really<\/em> simple and essential.<\/p>\n<h2>Help welcome!<\/h2>\n<p><a href=\"http:\/\/codecs.multimedia.cx\/?p=1035\">Kostya<\/a> kindly provided an initial <a href=\"https:\/\/github.com\/lu-zero\/avscale\">proof of concept<\/a>, and Vittorio, Anton and I prepared this <a href=\"https:\/\/github.com\/lu-zero\/libav\/commits\/avscale\">preview<\/a> in our spare time. There is plenty left to do; if you like the idea (since <strong>many<\/strong> kept saying they would love a <strong>swscale<\/strong> replacement), we even have a <a href=\"https:\/\/www.bountysource.com\/teams\/libav\/fundraisers\/480-libavscale\">fundraiser<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>swscale is one of the most annoying parts of Libav. A couple of years after the initial blueprint, we have something almost functional you can play with. 
Colorspace conversion and Scaling Before delving into the library architecture and the outer API, it is probably worth giving a quick summary of what this &hellip; <a href=\"https:\/\/blogs.gentoo.org\/lu_zero\/2016\/04\/01\/avscale-part1\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">AVScale &#8211; part1<\/span><\/a><\/p>\n","protected":false},"author":10,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[14],"tags":[19],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1aGWH-9t","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/587"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/comments?post=587"}],"version-history":[{"count":3,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/587\/revisions"}],"predecessor-version":[{"id":590,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/587\/revisions\/590"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/media?parent=587"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/categories?post=587"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/tags?post=587"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}