Portability of tar features

The tar format is one of the oldest archive formats in use. It comes as no surprise that it is ugly: it was built as layers of hacks on top of the older format versions to overcome their limitations. However, given the POSIX standardization in the late 80s and the popularity of GNU tar, you would expect the interoperability problems to be mostly resolved nowadays.

This article is directly inspired by my proof-of-concept work on a new binary package format for Gentoo. My original proposal used a volume label to provide a user- and file(1)-friendly way of distinguishing our binary packages. While the volume label is a GNU tar extension, it falls within the implementation-defined part of the POSIX ustar file format, so you would expect implementations that do not support it to extract it as a regular file. What I did not anticipate is that some implementations reject the whole archive instead.
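To make the volume label idea concrete, here is a minimal sketch using Python's standard `tarfile` module. It is not the tooling used for the Gentoo proposal. The label text, the member name `data.txt` and the helper names are hypothetical. The only format detail relied upon is that GNU tar marks a volume header with typeflag `V`, stored at byte offset 156 of the 512-byte header.

```python
import io
import tarfile

# GNU tar's typeflag for a volume header. tarfile has no named constant
# for it, so it is spelled out here.
VOLUME_LABEL_TYPE = b"V"


def build_labeled_archive(label: str, payload: bytes) -> bytes:
    """Build an in-memory GNU-format archive whose first entry is a volume label."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        # The label is an ordinary header block with an empty body;
        # its "name" field carries the label text.
        vol = tarfile.TarInfo(name=label)
        vol.type = VOLUME_LABEL_TYPE
        tar.addfile(vol)

        # One regular file after the label (hypothetical member name).
        member = tarfile.TarInfo(name="data.txt")
        member.size = len(payload)
        tar.addfile(member, io.BytesIO(payload))
    return buf.getvalue()


def list_typeflags(archive: bytes):
    """Return (name, typeflag) pairs as seen by Python's tar reader."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        return [(m.name, m.type) for m in tar.getmembers()]


if __name__ == "__main__":
    data = build_labeled_archive("example-label", b"hello")
    print(data[156:157])          # typeflag byte of the first header
    print(list_typeflags(data))
```

A reader that does not know typeflag `V` has to decide what to do with the first entry: skip it, treat it as a regular file, or refuse the archive. That decision is the portability question this article is about.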

This naturally raised more questions about how portable the various tar formats actually are. To find out, I decided to analyze the standards for possible incompatibility dangers and build a suite of test inputs that can be used to check how various implementations cope with them. This article describes those points and provides test results for a number of implementations.

Please note that this article focuses solely on read-side format compatibility. In other words, it establishes how tar files should be written to maximize the probability that they will be read correctly afterwards. It does not investigate which formats the listed tools can write, or whether they can correctly create archives using specific features.


One thought on “Portability of tar features”

  1. Hello,

    When I tested tar programs a few years ago, I decided to use bsdtar, not GNU tar.
    I don’t remember the reasons exactly, but one was compression / decompression:
    bsdtar can be statically linked, *including* the compressor/decompressor, so it can be used for emergency backup/restore. Most other tar implementations, including GNU tar, need separate compression programs, and the compressor I use is not included on most Linux live CDs
    (and bsdtar, at least at that time, compressed/decompressed much faster than GNU tar).

    What I noticed during my tests at that time was a severe incompatibility w.r.t. extended attributes (EAs). If I remember correctly, star and bsdtar were able to restore each other’s archives including extended attributes. However, I was unable to produce a tar file with star or bsdtar (no matter what format options I tried) that GNU tar could restore correctly, EAs included, and bsdtar was unable to restore EAs from GNU tar archives (star at least partially understood GNU tar EAs, but also produced strange messages).

    * How would bsdtar rank in your tests? Does it lack features?
    * Are the incompatibilities in restoring each other’s EAs, between star and GNU tar or between bsdtar and GNU tar, now fixed?
