Making a new demuxer

Maxim asked me to to check a stream from a security camera that he could not decode with avconv without forcing the format to mjpeg.

Mysterious stream

Since it is served as http the first step had been checking the mime type. Time to use curl -I.

# curl -I "http://host/some.cgi?user=admin&amp;pwd=pwd" | grep Content-Type

Interesting enough it is a multipart/x-mixed-replace

Content-Type: multipart/x-mixed-replace;boundary=object-ipcamera

Basically the cgi sends a jpeg images one after the other, we even have a (old and ugly) muxer for it!

Time to write a demuxer.

Libav demuxers

We already have some documentation on how to write a demuxer, but it is not complete so this blogpost will provide an example.

Basics

Libav code is quite object oriented: every component is a C structure containing a description of it and pointers to a set of functions and there are fixed pattern to make easier to make new code fit in.

Every major library has an all${components}.c in which the components are registered to be used. In our case we talk about libavformat so we have allformats.c.

The components are built according to CONFIG_${name}_${component} variables generated by configure. The actual code reside in the ${component} directory with a pattern such as ${name}.c or ${name}dec.c/${name}enc.c if both demuxer and muxer are available.

The code can be split in multiple files if it starts growing to an excess of 500-1000 LOCs.

Registration

We have some REGISTER_ macros that abstract some logic to make every component selectable at configure time since in Libav you can enable/disable every muxer, demuxer, codec, IO/protocol from configure.

We had already have a muxer for the format.

    REGISTER_MUXER   (MPJPEG,           mpjpeg);

Now we register both in a single line:

    REGISTER_MUXDEMUX(MPJPEG,           mpjpeg);

The all${components} files are parsed by configure to generate the appropriate Makefile and C definitions. The next run we’ll get a new
CONFIG_MPJPEG_DEMUXER variable in config.mak and config.h.

Now we can add to libavformat/Makefile a line like

OBJS-$(CONFIG_MPJPEG_DEMUXER)            += mpjpegdec.o

and put our mpjpegdec.c in libavformat and we are ready to write some code!

Demuxer structure

Usually I start putting down a skeleton file with the bare minimum:

The AVInputFormat and the core _read_probe, _read_header and _read_packet callbacks.

#include "avformat.h"

static int ${name}_read_probe(AVProbeData *p)
{
    return 0;
}

static int ${name}_read_header(AVFormatContext *s)
{
    return AVERROR(ENOSYS);
}

static int ${name}_read_packet(AVFormatContext *s, AVPacket *pkt)
{
    return AVERROR(ENOSYS);
}

AVInputFormat ff_${name}_demuxer = {
    .name           = "${name}",
    .long_name      = NULL_IF_CONFIG_SMALL("Longer ${name} description"),
    .read_probe     = ${name}_read_probe,
    .read_header    = ${name}_read_header,
    .read_packet    = ${name}_read_packet,

I make so that all the functions return a no-op value.

_read_probe

This function will be called by the av_probe_input functions, it receives some probe information in the form of a buffer. The function return a score between 0 and 100; AVPROBE_SCORE_MAX, AVPROBE_SCORE_MIME and AVPROBE_SCORE_EXTENSION are provided to make more evident what is the expected confidence. 0 means that we are sure that the probed stream is not parsable by this demuxer.

_read_header

This function will be called by avformat_open_input. It reads the initial format information (e.g. number and kind of streams) when available, in this function the initial set of streams should be mapped with avformat_new_stream. Must return 0 on success. The skeleton is made to return ENOSYS so it can be run and just exit cleanly.

_read_packet

This function will be called by av_read_frame. It should return an AVPacket containing demuxed data as contained in the bytestream. It will be parsed and collated (or splitted) to a frame-worth amount of data by the optional parsers. Must return 0 on success. The skeleton again returns ENOSYS.

Implementation

Now let’s implement the mpjpeg support! The format in itself is quite simple:
– a boundary line starting with --
– a Content-Type line stating image/jpeg.
– a Content-Length line with the actual buffer length.
– the jpeg data

Probe function

We just want to check if the Content-Type is what we expect basically, so we just
go over the lines (\n\r-separated) and check if there is a tag Content-Type with a value image/jpeg.

static int get_line(AVIOContext *pb, char *line, int line_size)
{
    int i, ch;
    char *q = line;

    for (i = 0; !pb->eof_reached; i++) {
        ch = avio_r8(pb);
        if (ch == 'n') {
            if (q > line && q[-1] == 'r')
                q--;
            *q = '';

            return 0;
        } else {
            if ((q - line) < line_size - 1)
                *q++ = ch;
        }
    }

    if (pb->error)
        return pb->error;
    return AVERROR_EOF;
}

static int split_tag_value(char **tag, char **value, char *line)
{
    char *p = line;

    while (*p != '' && *p != ':')
        p++;
    if (*p != ':')
        return AVERROR_INVALIDDATA;

    *p   = '';
    *tag = line;

    p++;

    while (av_isspace(*p))
        p++;

    *value = p;

    return 0;
}

static int check_content_type(char *line)
{
    char *tag, *value;
    int ret = split_tag_value(&tag, &value, line);

    if (ret < 0)
        return ret;

    if (av_strcasecmp(tag, "Content-type") ||
        av_strcasecmp(value, "image/jpeg"))
        return AVERROR_INVALIDDATA;

    return 0;
}

static int mpjpeg_read_probe(AVProbeData *p)
{
    AVIOContext *pb;
    char line[128] = { 0 };
    int ret;

    pb = avio_alloc_context(p->buf, p->buf_size, 0, NULL, NULL, NULL, NULL);
    if (!pb)
        return AVERROR(ENOMEM);

    while (!pb->eof_reached) {
        ret = get_line(pb, line, sizeof(line));
        if (ret < 0)
            break;

        ret = check_content_type(line);
        if (!ret)
            return AVPROBE_SCORE_MAX;
    }

    return 0;
}

Here we are using avio to be able to reuse get_line later.

Reading the header

The format is pretty much header-less, we just check for the boundary for now and
set up the minimum amount of information regarding the stream: media type, codec id and frame rate. The boundary by specification is less than 70 characters with -- as initial marker.

static int mpjpeg_read_header(AVFormatContext *s)
{
    MPJpegContext *mp = s->priv_data;
    AVStream *st;
    char boundary[70 + 2 + 1];
    int ret;

    ret = get_line(s->pb, boundary, sizeof(boundary));
    if (ret < 0)
        return ret;

    if (strncmp(boundary, "--", 2))
        return AVERROR_INVALIDDATA;

    st = avformat_new_stream(s, NULL);

    st->codec->codec_type = AVMEDIA_TYPE_VIDEO;
    st->codec->codec_id   = AV_CODEC_ID_MJPEG;

    avpriv_set_pts_info(st, 60, 1, 25);

    return 0;
}

Reading packets

Even this function is quite simple, please note that AVFormatContext provides an
AVIOContext. The bulk of the function boils down to reading the size of the frame,
allocate a packet using av_new_packet and write down if using avio_read.

static int parse_content_length(char *line)
{
    char *tag, *value;
    int ret = split_tag_value(&tag, &value, line);
    long int val;

    if (ret < 0)
        return ret;

    if (av_strcasecmp(tag, "Content-Length"))
        return AVERROR_INVALIDDATA;

    val = strtol(value, NULL, 10);
    if (val == LONG_MIN || val == LONG_MAX)
        return AVERROR(errno);
    if (val > INT_MAX)
        return AVERROR(ERANGE);
    return val;
}

static int mpjpeg_read_packet(AVFormatContext *s, AVPacket *pkt)
{
    char line[128];
    int ret, size;

    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        return ret;

    ret = check_content_type(line);
    if (ret < 0)
        return ret;

    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        return ret;

    size = parse_content_length(line);
    if (size < 0)
        return size;

    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        goto fail;

    ret = av_new_packet(pkt, size);
    if (ret < 0)
        return ret;

    ret = avio_read(s->pb, pkt->data, size);
    if (ret < 0)
        goto fail;

    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        goto fail;

    // Consume the boundary marker
    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        goto fail;

    return ret;

fail:
    av_free_packet(pkt);
    return ret;
}

What next

For now I walked you through on the fundamentals, hopefully next week I’ll show you some additional features I’ll need to implement in this simple demuxer to make it land in Libav: AVOptions to make possible overriding the framerate and some additional code to be able to do without Content-Length and just use the boundary line.

PS: wordpress support for syntax highlight is quite subpar, if somebody has a blog engine that can use pygments or equivalent please tell me and I’d switch to it.