Maxim asked me to check a stream from a security camera that he could not decode with avconv without forcing the format to mjpeg.
Mysterious stream
Since it is served over HTTP, the first step was checking the MIME type. Time to use curl -I.
# curl -I "http://host/some.cgi?user=admin&pwd=pwd" | grep Content-Type
Interestingly enough, it is a multipart/x-mixed-replace:
Content-Type: multipart/x-mixed-replace;boundary=object-ipcamera
Basically the CGI sends JPEG images one after the other; we even have an (old and ugly) muxer for it!
Time to write a demuxer.
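The boundary announced in that Content-Type header is what will separate the images in the body. A minimal sketch of pulling it out of the header value, in plain C with no Libav dependency; extract_boundary is a hypothetical helper (not a Libav function), and it assumes an unquoted value and a lowercase "boundary=" key:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of the "boundary=" parameter of a
 * Content-Type header value into out. Returns 0 on success, -1 if the
 * parameter is missing, empty, or does not fit in out.
 * Simplification: case-sensitive match, quoted values are not handled. */
static int extract_boundary(const char *content_type, char *out, size_t out_size)
{
    static const char key[] = "boundary=";
    const char *p = strstr(content_type, key);
    size_t len;

    if (!p)
        return -1;
    p += sizeof(key) - 1;

    /* The value runs until the next parameter separator or whitespace. */
    len = strcspn(p, "; \t\r\n");
    if (len == 0 || len >= out_size)
        return -1;

    memcpy(out, p, len);
    out[len] = '\0';
    return 0;
}
```

With the header shown above this yields object-ipcamera, so each part in the body is expected to start with a --object-ipcamera line.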
Libav demuxers
We already have some documentation on how to write a demuxer, but it is not complete, so this blog post provides an example.
Basics
Libav code is quite object oriented: every component is a C structure containing a description of it and pointers to a set of functions, and there are fixed patterns that make it easier to fit new code in.
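To make the "structure plus function pointers" idea concrete, here is a small standalone sketch. Component, demo_probe and best_match are names invented for this illustration and are not Libav types; the real structures carry many more fields.

```c
#include <stddef.h>

/* Illustrative stand-in for a component description (not a Libav type). */
typedef struct Component {
    const char *name;       /* short identifier */
    const char *long_name;  /* human-readable description */
    int (*probe)(const unsigned char *buf, size_t size); /* callback */
} Component;

/* Claim the data only if it starts with "--". */
static int demo_probe(const unsigned char *buf, size_t size)
{
    return (size >= 2 && buf[0] == '-' && buf[1] == '-') ? 100 : 0;
}

static const Component demo_component = {
    .name      = "demo",
    .long_name = "Demo component",
    .probe     = demo_probe,
};

/* Generic code only sees the structure: it calls through the pointers
 * without knowing which component it is dealing with. Returns the
 * component with the highest non-zero score, or NULL if none matches. */
static const Component *best_match(const Component *const *list, size_t count,
                                   const unsigned char *buf, size_t size)
{
    const Component *best = NULL;
    int best_score = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        int score = list[i]->probe(buf, size);
        if (score > best_score) {
            best_score = score;
            best       = list[i];
        }
    }
    return best;
}
```

Adding a new component then means filling in one more structure and listing it, which is the pattern the rest of this post follows.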
Every major library has an all${components}.c in which the components are registered to be used. In our case we talk about libavformat, so we have allformats.c.
The components are built according to the CONFIG_${name}_${component} variables generated by configure. The actual code resides in the ${component} directory with a pattern such as ${name}.c, or ${name}dec.c/${name}enc.c if both demuxer and muxer are available.
The code can be split into multiple files if it starts growing past 500-1000 lines.
Registration
We have some REGISTER_ macros that abstract some logic to make every component selectable at configure time, since in Libav you can enable/disable every muxer, demuxer, codec, IO/protocol from configure.
We already had a muxer for the format:
REGISTER_MUXER (MPJPEG, mpjpeg);
Now we register both in a single line:
REGISTER_MUXDEMUX(MPJPEG, mpjpeg);
The all${components} files are parsed by configure to generate the appropriate Makefile and C definitions. On the next run we'll get a new CONFIG_MPJPEG_DEMUXER variable in config.mak and config.h.
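The way a CONFIG_ value can switch a registration on or off may be sketched as follows. This is a simplified standalone imitation, not the actual Libav macros: Format, register_format, the registered array and the "other" demuxer are made up for the example, and the CONFIG_ values are hard-coded here instead of coming from config.h.

```c
#include <stddef.h>

/* Stand-ins for the values configure would write into config.h. */
#define CONFIG_MPJPEG_MUXER   1
#define CONFIG_MPJPEG_DEMUXER 1
#define CONFIG_OTHER_DEMUXER  0

typedef struct Format { const char *name; } Format;

static Format ff_mpjpeg_muxer   = { "mpjpeg" };
static Format ff_mpjpeg_demuxer = { "mpjpeg" };
static Format ff_other_demuxer  = { "other" };

#define MAX_FORMATS 8
static Format *registered[MAX_FORMATS];
static size_t  nb_registered;

static void register_format(Format *f)
{
    if (nb_registered < MAX_FORMATS)
        registered[nb_registered++] = f;
}

/* CONFIG_##X##_MUXER pastes the upper-case name into the config symbol,
 * which then expands to 0 or 1. A disabled component becomes dead code. */
#define REGISTER_MUXER(X, x)                            \
    do {                                                \
        if (CONFIG_##X##_MUXER)                         \
            register_format(&ff_##x##_muxer);           \
    } while (0)

#define REGISTER_DEMUXER(X, x)                          \
    do {                                                \
        if (CONFIG_##X##_DEMUXER)                       \
            register_format(&ff_##x##_demuxer);         \
    } while (0)

#define REGISTER_MUXDEMUX(X, x)                         \
    do {                                                \
        REGISTER_MUXER(X, x);                           \
        REGISTER_DEMUXER(X, x);                         \
    } while (0)

static void register_all(void)
{
    REGISTER_MUXDEMUX(MPJPEG, mpjpeg);
    REGISTER_DEMUXER(OTHER, other);   /* disabled: never registered */
}
```

After register_all() only the two mpjpeg entries end up in the list, since CONFIG_OTHER_DEMUXER is 0.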
Now we can add to libavformat/Makefile a line like
OBJS-$(CONFIG_MPJPEG_DEMUXER) += mpjpegdec.o
and put our mpjpegdec.c in libavformat, and we are ready to write some code!
Demuxer structure
Usually I start by putting down a skeleton file with the bare minimum: the AVInputFormat and the core _read_probe, _read_header and _read_packet callbacks.
#include "avformat.h"
static int ${name}_read_probe(AVProbeData *p)
{
    return 0;
}
static int ${name}_read_header(AVFormatContext *s)
{
    return AVERROR(ENOSYS);
}
static int ${name}_read_packet(AVFormatContext *s, AVPacket *pkt)
{
    return AVERROR(ENOSYS);
}
AVInputFormat ff_${name}_demuxer = {
    .name        = "${name}",
    .long_name   = NULL_IF_CONFIG_SMALL("Longer ${name} description"),
    .read_probe  = ${name}_read_probe,
    .read_header = ${name}_read_header,
    .read_packet = ${name}_read_packet,
};
I make sure that all the functions return a no-op value.
_read_probe
This function will be called by the av_probe_input functions; it receives some probe information in the form of a buffer. The function returns a score between 0 and 100; AVPROBE_SCORE_MAX, AVPROBE_SCORE_MIME and AVPROBE_SCORE_EXTENSION are provided to make the expected confidence more evident. 0 means that we are sure that the probed stream is not parsable by this demuxer.
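The idea of a graded score can be sketched outside Libav like this. The SKETCH_SCORE_ constants and the sketch_probe and contains functions are invented for the illustration; the real constants live in avformat.h and the real callback receives an AVProbeData rather than a raw buffer.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in values for this sketch only. */
#define SKETCH_SCORE_MAX  100
#define SKETCH_SCORE_WEAK  25   /* hypothetical "could be ours" score */

/* Return 1 if needle occurs within the first size bytes of buf.
 * The buffer does not need to be NUL-terminated. */
static int contains(const unsigned char *buf, size_t size, const char *needle)
{
    size_t n = strlen(needle), i;

    if (n > size)
        return 0;
    for (i = 0; i + n <= size; i++)
        if (!memcmp(buf + i, needle, n))
            return 1;
    return 0;
}

/* 0: surely not ours; WEAK: looks like a boundary line but nothing more;
 * MAX: boundary line plus the header we expect. */
static int sketch_probe(const unsigned char *buf, size_t size)
{
    if (size < 2 || buf[0] != '-' || buf[1] != '-')
        return 0;
    if (contains(buf, size, "Content-Type: image/jpeg"))
        return SKETCH_SCORE_MAX;
    return SKETCH_SCORE_WEAK;
}
```

Returning something lower than the maximum when the evidence is thin leaves room for another demuxer with a better match to win.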
_read_header
This function will be called by avformat_open_input. It reads the initial format information (e.g. number and kind of streams) when available; in this function the initial set of streams should be mapped with avformat_new_stream. It must return 0 on success. The skeleton is made to return ENOSYS so it can be run and just exit cleanly.
_read_packet
This function will be called by av_read_frame. It should return an AVPacket containing demuxed data as contained in the bytestream. It will be parsed and collated (or split) to a frame-worth amount of data by the optional parsers. It must return 0 on success. The skeleton again returns ENOSYS.
Implementation
Now let’s implement the mpjpeg support! The format in itself is quite simple:
– a boundary line starting with --
– a Content-Type line stating image/jpeg
– a Content-Length line with the actual buffer length
– an empty line closing the headers
– the jpeg data
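To see the layout as bytes, here is a sketch that writes one such part into a buffer. write_part is a hypothetical helper for experimenting, not the Libav muxer; the empty line after the headers and the line break after the payload are inferred from what the reading code later in this post consumes.

```c
#include <stdio.h>
#include <string.h>

/* Write one part (boundary line, two headers, empty line, payload and a
 * closing line break) into out. Returns the number of bytes written,
 * or -1 if the part does not fit in out_size bytes. */
static int write_part(unsigned char *out, size_t out_size,
                      const char *boundary,
                      const unsigned char *jpeg, size_t jpeg_size)
{
    int n = snprintf((char *)out, out_size,
                     "--%s\r\n"
                     "Content-Type: image/jpeg\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n",
                     boundary, jpeg_size);

    /* snprintf reports the untruncated length, so this also catches
     * headers that did not fit. */
    if (n < 0 || (size_t)n + jpeg_size + 2 > out_size)
        return -1;

    memcpy(out + n, jpeg, jpeg_size);
    memcpy(out + n + jpeg_size, "\r\n", 2);
    return n + (int)jpeg_size + 2;
}
```

Calling it repeatedly with the same boundary produces the kind of byte stream the camera sends, which is handy for feeding a demuxer under test.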
Probe function
We basically just want to check whether the Content-Type is what we expect, so we go over the lines (\r\n-separated) and check if there is a Content-Type tag with the value image/jpeg.
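/* Headers used by the helpers below, in addition to avformat.h */
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include "libavutil/avstring.h"
/* Read one line into line, dropping the trailing "\r\n" or "\n".
 * Lines longer than line_size - 1 are silently truncated. */
static int get_line(AVIOContext *pb, char *line, int line_size)
{
    int i, ch;
    char *q = line;
    for (i = 0; !pb->eof_reached; i++) {
        ch = avio_r8(pb);
        if (ch == '\n') {
            if (q > line && q[-1] == '\r')
                q--;
            *q = '\0';
            return 0;
        } else {
            if ((q - line) < line_size - 1)
                *q++ = ch;
        }
    }
    if (pb->error)
        return pb->error;
    return AVERROR_EOF;
}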
static int split_tag_value(char **tag, char **value, char *line)
{
    char *p = line;
    while (*p != '\0' && *p != ':')
        p++;
    if (*p != ':')
        return AVERROR_INVALIDDATA;
    *p = '\0';
    *tag = line;
    p++;
    while (av_isspace(*p))
        p++;
    *value = p;
    return 0;
}
static int check_content_type(char *line)
{
    char *tag, *value;
    int ret = split_tag_value(&tag, &value, line);
    if (ret < 0)
        return ret;
    if (av_strcasecmp(tag, "Content-type") ||
        av_strcasecmp(value, "image/jpeg"))
        return AVERROR_INVALIDDATA;
    return 0;
}
static int mpjpeg_read_probe(AVProbeData *p)
{
    AVIOContext *pb;
    char line[128] = { 0 };
    int ret, score = 0;
    /* Wrap the probe buffer in a read-only AVIOContext */
    pb = avio_alloc_context(p->buf, p->buf_size, 0, NULL, NULL, NULL, NULL);
    if (!pb)
        return AVERROR(ENOMEM);
    while (!pb->eof_reached) {
        ret = get_line(pb, line, sizeof(line));
        if (ret < 0)
            break;
        ret = check_content_type(line);
        if (!ret) {
            score = AVPROBE_SCORE_MAX;
            break;
        }
    }
    /* Free the context only: the probe buffer belongs to the caller */
    av_free(pb);
    return score;
}
Here we are using avio to be able to reuse get_line later.
Reading the header
The format is pretty much header-less; for now we just check for the boundary and set up the minimum amount of information regarding the stream: media type, codec id and frame rate. By specification the boundary is at most 70 characters long, with -- as the initial marker.
static int mpjpeg_read_header(AVFormatContext *s)
{
    AVStream *st;
    char boundary[70 + 2 + 1]; /* boundary + leading "--" + terminator */
    int ret;
    ret = get_line(s->pb, boundary, sizeof(boundary));
    if (ret < 0)
        return ret;
    if (strncmp(boundary, "--", 2))
        return AVERROR_INVALIDDATA;
    st = avformat_new_stream(s, NULL);
    if (!st)
        return AVERROR(ENOMEM);
    st->codec->codec_type = AVMEDIA_TYPE_VIDEO;
    st->codec->codec_id   = AV_CODEC_ID_MJPEG;
    /* 60-bit timestamps in a 1/25 time base, i.e. 25 fps */
    avpriv_set_pts_info(st, 60, 1, 25);
    return 0;
}
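The last two arguments of avpriv_set_pts_info above are the numerator and denominator of the time base, so with 1/25 each timestamp tick lasts 1/25 of a second. A tiny standalone sketch of that arithmetic; Rational and ticks_to_seconds are illustrative names, not the Libav AVRational API:

```c
#include <stdint.h>

/* Illustrative time base: one tick lasts num/den seconds. */
typedef struct Rational {
    int num, den;
} Rational;

/* Convert a timestamp expressed in ticks to seconds. */
static double ticks_to_seconds(int64_t ticks, Rational time_base)
{
    return (double)ticks * time_base.num / time_base.den;
}
```

With a 1/25 time base, 25 ticks are one second and consecutive frames at 25 fps are exactly one tick apart.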
Reading packets
Even this function is quite simple. Please note that AVFormatContext provides an AVIOContext. The bulk of the function boils down to reading the size of the frame, allocating a packet using av_new_packet and filling it using avio_read.
static int parse_content_length(char *line)
{
    char *tag, *value;
    int ret = split_tag_value(&tag, &value, line);
    long int val;
    if (ret < 0)
        return ret;
    if (av_strcasecmp(tag, "Content-Length"))
        return AVERROR_INVALIDDATA;
    val = strtol(value, NULL, 10);
    if (val == LONG_MIN || val == LONG_MAX)
        return AVERROR(errno);
    if (val > INT_MAX)
        return AVERROR(ERANGE);
    return val;
}
static int mpjpeg_read_packet(AVFormatContext *s, AVPacket *pkt)
{
    char line[128];
    int ret, size;
    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        return ret;
    ret = check_content_type(line);
    if (ret < 0)
        return ret;
    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        return ret;
    size = parse_content_length(line);
    if (size < 0)
        return size;
    // Consume the empty line closing the headers
    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        return ret; // no packet allocated yet, nothing to free
    ret = av_new_packet(pkt, size);
    if (ret < 0)
        return ret;
    ret = avio_read(s->pb, pkt->data, size);
    if (ret < 0)
        goto fail;
    // Consume the line break following the jpeg data
    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        goto fail;
    // Consume the boundary marker
    ret = get_line(s->pb, line, sizeof(line));
    if (ret < 0)
        goto fail;
    return ret;
fail:
    av_free_packet(pkt);
    return ret;
}
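To experiment with the same read sequence without building Libav, the logic can be replayed over a memory buffer. This is only a sketch under simplifying assumptions: MemReader, mem_get_line, parse_length and mem_read_part are invented names, the header match is case-sensitive, and the payload is returned as a pointer into the input instead of being copied into a packet.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct MemReader {
    const unsigned char *buf;
    size_t size, pos;
} MemReader;

/* Read one line, dropping the trailing "\r\n" or "\n". Returns 0 on
 * success, -1 at end of data. Overlong lines are truncated. */
static int mem_get_line(MemReader *r, char *line, size_t line_size)
{
    size_t n = 0;

    while (r->pos < r->size) {
        unsigned char ch = r->buf[r->pos++];
        if (ch == '\n') {
            if (n > 0 && line[n - 1] == '\r')
                n--;
            line[n] = '\0';
            return 0;
        }
        if (n < line_size - 1)
            line[n++] = (char)ch;
    }
    return -1;
}

/* Parse "Content-Length: <n>". Returns the length, or -1 if the tag is
 * wrong or the number is missing, negative, too large or has trailing junk. */
static long parse_length(const char *line)
{
    static const char tag[] = "Content-Length:";
    const char *start = line + sizeof(tag) - 1;
    char *end;
    long val;

    if (strncmp(line, tag, sizeof(tag) - 1))
        return -1;
    errno = 0;
    val = strtol(start, &end, 10);
    if (errno || end == start || *end != '\0' || val < 0 || val > INT_MAX)
        return -1;
    return val;
}

/* Expects the reader to sit right after a boundary line. On success it
 * stores the payload location and size, consumes the line break after
 * the payload and the next boundary line, and returns 0. */
static int mem_read_part(MemReader *r, const unsigned char **data, size_t *size)
{
    char line[128];
    long len;

    if (mem_get_line(r, line, sizeof(line)) < 0 ||
        strcmp(line, "Content-Type: image/jpeg"))
        return -1;
    if (mem_get_line(r, line, sizeof(line)) < 0 ||
        (len = parse_length(line)) < 0)
        return -1;
    /* The empty line closing the headers */
    if (mem_get_line(r, line, sizeof(line)) < 0 || line[0] != '\0')
        return -1;
    if ((size_t)len > r->size - r->pos)
        return -1;                      /* truncated payload */

    *data   = r->buf + r->pos;
    *size   = (size_t)len;
    r->pos += (size_t)len;

    /* Line break after the payload, then the next boundary marker */
    if (mem_get_line(r, line, sizeof(line)) < 0 ||
        mem_get_line(r, line, sizeof(line)) < 0 ||
        strncmp(line, "--", 2))
        return -1;
    return 0;
}
```

Unlike the strtol call in parse_content_length, this variant resets errno and inspects the end pointer, which is one possible way to reject values such as an empty or non-numeric length.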
What next
For now I have walked you through the fundamentals; hopefully next week I’ll show you some additional features I’ll need to implement in this simple demuxer to make it land in Libav: AVOptions to make it possible to override the framerate, and some additional code to be able to do without Content-Length and just use the boundary line.
PS: WordPress support for syntax highlighting is quite subpar; if somebody has a blog engine that can use pygments or equivalent, please tell me and I’d switch to it.