{"id":359,"date":"2014-11-15T13:40:58","date_gmt":"2014-11-15T13:40:58","guid":{"rendered":"http:\/\/blogs.gentoo.org\/lu_zero\/?p=359"},"modified":"2014-12-02T16:46:22","modified_gmt":"2014-12-02T16:46:22","slug":"making-a-new-demuxer","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/lu_zero\/2014\/11\/15\/making-a-new-demuxer\/","title":{"rendered":"Making a new demuxer"},"content":{"rendered":"<p><a href=\"https:\/\/github.com\/maksbotan\">Maxim<\/a> asked me to check a stream from a security camera that he could not decode with <a href=\"http:\/\/libav.org\/avconv.html\">avconv<\/a> without <a href=\"https:\/\/libav.org\/avconv.html#Main-options\">forcing<\/a> the <strong>format<\/strong> to <code>mjpeg<\/code>.<\/p>\n<h2>Mysterious stream<\/h2>\n<p>Since it is served over <strong>http<\/strong>, the first step was checking the <strong>mime<\/strong> type. Time to use <code>curl -I<\/code>.<\/p>\n<pre><code># curl -I \"http:\/\/host\/some.cgi?user=admin&amp;pwd=pwd\" | grep Content-Type\n<\/code><\/pre>\n<p>Interestingly enough, it is a <code>multipart\/x-mixed-replace<\/code><\/p>\n<pre><code>Content-Type: multipart\/x-mixed-replace;boundary=object-ipcamera\n<\/code><\/pre>\n<p>Basically the cgi sends <strong>jpeg<\/strong> images one after the other; we even have an (old and ugly) <strong>muxer<\/strong> for it!<\/p>\n<p>Time to write a demuxer.<\/p>\n<h2>Libav demuxers<\/h2>\n<p>We already have some <a href=\"https:\/\/wiki.libav.org\/Internals\/DemuxerHowTo\">documentation<\/a> on how to write a demuxer, but it is not complete, so this blog post will provide an example.<\/p>\n<h3>Basics<\/h3>\n<p>Libav code is quite object oriented: every component is a <strong>C structure<\/strong> containing a description of it and pointers to a set of functions, and there are fixed patterns that make it easier to fit new code in.<\/p>\n<p>Every major library has an <code>all${components}.c<\/code> in which the components are registered to be used. 
In our case we are talking about <strong>libavformat<\/strong>, so we have <code>allformats.c<\/code>.<\/p>\n<p>The components are built according to <code>CONFIG_${name}_${component}<\/code> variables generated by <strong>configure<\/strong>. The actual code resides in the <code>${component}<\/code> directory with a pattern such as <code>${name}.c<\/code> or <code>${name}dec.c<\/code>\/<code>${name}enc.c<\/code> if both demuxer and muxer are available.<\/p>\n<p>The code can be split into multiple files if it grows beyond <strong>500-1000<\/strong> LOCs.<\/p>\n<h4>Registration<\/h4>\n<p>We have some <code>REGISTER_<\/code> macros that abstract some logic to make every component selectable at configure time, since in Libav you can enable\/disable every muxer, demuxer, codec, IO\/protocol from <strong>configure<\/strong>.<\/p>\n<p>We already had a <strong>muxer<\/strong> for the format.<\/p>\n<pre><code class=\"c\">    REGISTER_MUXER   (MPJPEG,           mpjpeg);\n<\/code><\/pre>\n<p>Now we register both the muxer and the demuxer in a single line:<\/p>\n<pre><code class=\"c\">    REGISTER_MUXDEMUX(MPJPEG,           mpjpeg);\n<\/code><\/pre>\n<p>The <code>all${components}<\/code> files are parsed by <strong>configure<\/strong> to generate the appropriate <strong>Makefile<\/strong> and <strong>C<\/strong> definitions. 
On the next run we&#8217;ll get a new <code>CONFIG_MPJPEG_DEMUXER<\/code> variable in <code>config.mak<\/code> and <code>config.h<\/code>.<\/p>\n<p>Now we can add to <code>libavformat\/Makefile<\/code> a line like<\/p>\n<pre><code class=\"make\">OBJS-$(CONFIG_MPJPEG_DEMUXER)            += mpjpegdec.o\n<\/code><\/pre>\n<p>and put our <code>mpjpegdec.c<\/code> in <code>libavformat<\/code> and we are ready to write some code!<\/p>\n<h3>Demuxer structure<\/h3>\n<p>Usually I start by putting down a skeleton file with the bare minimum:<\/p>\n<p>The <a href=\"https:\/\/libav.org\/doxygen\/master\/structAVInputFormat.html\">AVInputFormat<\/a> and the core <code>_read_probe<\/code>, <code>_read_header<\/code> and <code>_read_packet<\/code> callbacks.<\/p>\n<pre><code class=\"c\">#include \"avformat.h\"\n\nstatic int ${name}_read_probe(AVProbeData *p)\n{\n    return 0;\n}\n\nstatic int ${name}_read_header(AVFormatContext *s)\n{\n    return AVERROR(ENOSYS);\n}\n\nstatic int ${name}_read_packet(AVFormatContext *s, AVPacket *pkt)\n{\n    return AVERROR(ENOSYS);\n}\n\nAVInputFormat ff_${name}_demuxer = {\n    .name           = \"${name}\",\n    .long_name      = NULL_IF_CONFIG_SMALL(\"Longer ${name} description\"),\n    .read_probe     = ${name}_read_probe,\n    .read_header    = ${name}_read_header,\n    .read_packet    = ${name}_read_packet,\n};\n<\/code><\/pre>\n<p>I make it so that all the functions return a <strong>no-op<\/strong> value.<\/p>\n<h3>_read_probe<\/h3>\n<p>This function will be called by the <a href=\"https:\/\/libav.org\/doxygen\/master\/group__lavf__decoding.html\">av_probe_input<\/a> functions; it receives some <strong>probe information<\/strong> in the form of a <strong>buffer<\/strong>. The function returns a <strong>score<\/strong> between 0 and 100; <code>AVPROBE_SCORE_MAX<\/code>, <code>AVPROBE_SCORE_MIME<\/code> and <code>AVPROBE_SCORE_EXTENSION<\/code> are provided to make the expected confidence more evident. 
0 means that we are sure that the probed stream is <strong>not<\/strong> parsable by this demuxer.<\/p>\n<h3>_read_header<\/h3>\n<p>This function will be called by <a href=\"https:\/\/libav.org\/doxygen\/master\/group__lavf__decoding.html\">avformat_open_input<\/a>. It reads the initial format information (e.g. number and kind of streams) when available; in this function the initial set of <strong>streams<\/strong> should be mapped with <a href=\"https:\/\/libav.org\/doxygen\/master\/group__lavf__core.html\">avformat_new_stream<\/a>. It must return <code>0<\/code> on success. The skeleton is made to return <code>AVERROR(ENOSYS)<\/code> so it can be run and just exit cleanly.<\/p>\n<h3>_read_packet<\/h3>\n<p>This function will be called by <a href=\"https:\/\/libav.org\/doxygen\/master\/group__lavf__decoding.htm\">av_read_frame<\/a>. It should fill in an <a href=\"https:\/\/libav.org\/doxygen\/master\/structAVPacket.html\">AVPacket<\/a> with demuxed data as found in the bytestream. The data will be parsed and collated (or split) into a frame&#8217;s worth of data by the optional <a href=\"https:\/\/libav.org\/doxygen\/master\/group__lavc__parsing.html\">parsers<\/a>. It must return <code>0<\/code> on success. The skeleton again returns <code>AVERROR(ENOSYS)<\/code>.<\/p>\n<h2>Implementation<\/h2>\n<p>Now let&#8217;s implement the mpjpeg support! 
The format in itself is quite simple:<br \/>\n&#8211; a boundary line starting with <code>--<\/code><br \/>\n&#8211; a <code>Content-Type<\/code> line stating <code>image\/jpeg<\/code><br \/>\n&#8211; a <code>Content-Length<\/code> line with the actual buffer length<br \/>\n&#8211; the jpeg data<\/p>\n<h3>Probe function<\/h3>\n<p>We basically just want to check whether the <code>Content-Type<\/code> is what we expect, so we go over the lines (<code>\\r\\n<\/code>-separated) and check whether there is a <strong>tag<\/strong> <code>Content-Type<\/code> with a <strong>value<\/strong> <code>image\/jpeg<\/code>.<\/p>\n<pre><code class=\"c\">\/* Read one line into line[], stripping the line ending; overlong lines are truncated. *\/\nstatic int get_line(AVIOContext *pb, char *line, int line_size)\n{\n    int i, ch;\n    char *q = line;\n\n    for (i = 0; !pb-&gt;eof_reached; i++) {\n        ch = avio_r8(pb);\n        if (ch == '\\n') {\n            if (q &gt; line &amp;&amp; q[-1] == '\\r')\n                q--;\n            *q = '\\0';\n\n            return 0;\n        } else {\n            if ((q - line) &lt; line_size - 1)\n                *q++ = ch;\n        }\n    }\n\n    if (pb-&gt;error)\n        return pb-&gt;error;\n    return AVERROR_EOF;\n}\n\n\/* Split a tag: value line in place at the first colon. *\/\nstatic int split_tag_value(char **tag, char **value, char *line)\n{\n    char *p = line;\n\n    while (*p != '\\0' &amp;&amp; *p != ':')\n        p++;\n    if (*p != ':')\n        return AVERROR_INVALIDDATA;\n\n    *p   = '\\0';\n    *tag = line;\n\n    p++;\n\n    while (av_isspace(*p))\n        p++;\n\n    *value = p;\n\n    return 0;\n}\n\nstatic int check_content_type(char *line)\n{\n    char *tag, *value;\n    int ret = split_tag_value(&amp;tag, &amp;value, line);\n\n    if (ret &lt; 0)\n        return ret;\n\n    if (av_strcasecmp(tag, \"Content-type\") ||\n        av_strcasecmp(value, \"image\/jpeg\"))\n        return AVERROR_INVALIDDATA;\n\n    return 0;\n}\n\nstatic int mpjpeg_read_probe(AVProbeData *p)\n{\n    AVIOContext *pb;\n    char line[128] = { 0 };\n    int ret;\n\n    pb = avio_alloc_context(p-&gt;buf, 
p-&gt;buf_size, 0, NULL, NULL, NULL, NULL);\n    if (!pb)\n        return AVERROR(ENOMEM);\n\n    while (!pb-&gt;eof_reached) {\n        ret = get_line(pb, line, sizeof(line));\n        if (ret &lt; 0)\n            break;\n\n        ret = check_content_type(line);\n        if (!ret) {\n            av_free(pb);\n            return AVPROBE_SCORE_MAX;\n        }\n    }\n\n    \/* The context only wraps the probe buffer, so free just the context. *\/\n    av_free(pb);\n\n    return 0;\n}\n<\/code><\/pre>\n<p>Here we are using <a href=\"https:\/\/libav.org\/doxygen\/master\/structAVIOContext.html\">avio<\/a> to be able to reuse <code>get_line<\/code> later.<\/p>\n<h3>Reading the header<\/h3>\n<p>The format is pretty much header-less; for now we just check for the boundary and set up the minimum amount of information regarding the stream: <strong>media type<\/strong>, <strong>codec id<\/strong> and <strong>frame rate<\/strong>. The boundary by specification is at most 70 characters long, with <code>--<\/code> as initial marker.<\/p>\n<pre><code class=\"c\">static int mpjpeg_read_header(AVFormatContext *s)\n{\n    AVStream *st;\n    char boundary[70 + 2 + 1];\n    int ret;\n\n    ret = get_line(s-&gt;pb, boundary, sizeof(boundary));\n    if (ret &lt; 0)\n        return ret;\n\n    if (strncmp(boundary, \"--\", 2))\n        return AVERROR_INVALIDDATA;\n\n    st = avformat_new_stream(s, NULL);\n    if (!st)\n        return AVERROR(ENOMEM);\n\n    st-&gt;codec-&gt;codec_type = AVMEDIA_TYPE_VIDEO;\n    st-&gt;codec-&gt;codec_id   = AV_CODEC_ID_MJPEG;\n\n    avpriv_set_pts_info(st, 60, 1, 25);\n\n    return 0;\n}\n<\/code><\/pre>\n<h3>Reading packets<\/h3>\n<p>This function is quite simple as well; please note that <code>AVFormatContext<\/code> provides an <code>AVIOContext<\/code>. 
The bulk of the function boils down to reading the size of the frame, allocating a packet using <code>av_new_packet<\/code> and filling it using <code>avio_read<\/code>.<\/p>\n<pre><code class=\"c\">static int parse_content_length(char *line)\n{\n    char *tag, *value;\n    int ret = split_tag_value(&amp;tag, &amp;value, line);\n    long int val;\n\n    if (ret &lt; 0)\n        return ret;\n\n    if (av_strcasecmp(tag, \"Content-Length\"))\n        return AVERROR_INVALIDDATA;\n\n    val = strtol(value, NULL, 10);\n    if (val == LONG_MIN || val == LONG_MAX)\n        return AVERROR(errno);\n    if (val &gt; INT_MAX)\n        return AVERROR(ERANGE);\n    return val;\n}\n\nstatic int mpjpeg_read_packet(AVFormatContext *s, AVPacket *pkt)\n{\n    char line[128];\n    int ret, size;\n\n    ret = get_line(s-&gt;pb, line, sizeof(line));\n    if (ret &lt; 0)\n        return ret;\n\n    ret = check_content_type(line);\n    if (ret &lt; 0)\n        return ret;\n\n    ret = get_line(s-&gt;pb, line, sizeof(line));\n    if (ret &lt; 0)\n        return ret;\n\n    size = parse_content_length(line);\n    if (size &lt; 0)\n        return size;\n\n    \/\/ Consume the line separating the headers from the data;\n    \/\/ no packet is allocated yet, so there is nothing to free\n    ret = get_line(s-&gt;pb, line, sizeof(line));\n    if (ret &lt; 0)\n        return ret;\n\n    ret = av_new_packet(pkt, size);\n    if (ret &lt; 0)\n        return ret;\n\n    ret = avio_read(s-&gt;pb, pkt-&gt;data, size);\n    if (ret &lt; 0)\n        goto fail;\n\n    \/\/ Consume the line ending that follows the jpeg data\n    ret = get_line(s-&gt;pb, line, sizeof(line));\n    if (ret &lt; 0)\n        goto fail;\n\n    \/\/ Consume the boundary marker\n    ret = get_line(s-&gt;pb, line, sizeof(line));\n    if (ret &lt; 0)\n        goto fail;\n\n    return ret;\n\nfail:\n    av_free_packet(pkt);\n    return ret;\n}\n\n<\/code><\/pre>\n<h2>What next<\/h2>\n<p>For now I walked you through the fundamentals; hopefully next week I&#8217;ll show you some additional features I&#8217;ll need to implement in this simple demuxer to make it land in Libav: <a 
href=\"https:\/\/libav.org\/doxygen\/master\/structAVOption.html\">AVOptions<\/a> to make it possible to override the <strong>framerate<\/strong> and some additional code to be able to do without <code>Content-Length<\/code> and just use the <strong>boundary<\/strong> line.<\/p>\n<blockquote><p>\n  PS: WordPress support for syntax highlighting is quite subpar; if somebody has a blog engine that can use pygments or equivalent please tell me and I&#8217;ll switch to it.\n<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Maxim asked me to check a stream from a security camera that he could not decode with avconv without forcing the format to mjpeg. Mysterious stream Since it is served as http the first step had been checking the mime type. Time to use curl -I. # curl -I &#8220;http:\/\/host\/some.cgi?user=admin&amp;pwd=pwd&#8221; | grep Content-Type Interesting &hellip; <a href=\"https:\/\/blogs.gentoo.org\/lu_zero\/2014\/11\/15\/making-a-new-demuxer\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Making a new 
demuxer<\/span><\/a><\/p>\n","protected":false},"author":10,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[14],"tags":[18],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1aGWH-5N","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/359"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/comments?post=359"}],"version-history":[{"count":11,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/359\/revisions"}],"predecessor-version":[{"id":385,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/359\/revisions\/385"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/media?parent=359"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/categories?post=359"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/tags?post=359"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}