Overview of cross-architecture portability problems

Ideally, you’d want your program to work everywhere. Unfortunately, that’s not that simple, even if you’re using high-level “portable” languages such as Python. In this blog post, I’d like to focus on some aspects of cross-architecture problems I’ve seen or heard about during my time in Gentoo. Please note that I don’t mean this to be a comprehensive list of problems — instead, I’m aiming for an interesting read.

What breaks programs on 32-bit systems?

Basic integer type sizes

If you asked anyone what the primary difference between 64-bit and 32-bit architectures is, they would probably answer that it's register sizes. For many people, register sizes imply differences in basic integer types, and therefore seem like the primary source of problems on 32-bit architectures when programs are tested only on 64-bit architectures (which is commonly the case nowadays). Actually, it's not that simple.

Contrary to common expectations, the differences in basic integer types are minimal. Most importantly, your plain int is 32-bit everywhere. The only type that’s actually different is long — it’s 32-bit on 32-bit architectures, and 64-bit on 64-bit architectures. However, people don’t use long all that often in modern programs, so that’s not very likely to cause issues.
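
If you want to verify this on your own system, a trivial program is enough. The output below assumes a typical x86-64 Linux machine with multilib support, following the System V ABI; other ABIs may differ:

$ cat > sizes-demo.c <<EOF
#include <stdio.h>

int main() {
    /* Print the sizes of the basic types discussed above. */
    printf("int:    %zu\n", sizeof(int));
    printf("long:   %zu\n", sizeof(long));
    printf("void *: %zu\n", sizeof(void *));
    return 0;
}
EOF
$ cc -m64 sizes-demo.c -o sizes-demo && ./sizes-demo
int:    4
long:   8
void *: 8
$ cc -m32 sizes-demo.c -o sizes-demo && ./sizes-demo
int:    4
long:   4
void *: 4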

Perhaps some people worry about integer sizes because they still vaguely remember the issues from porting old 32-bit software to 64-bit architectures. As I've mentioned before, int remained 32-bit — but pointers became 64-bit. As a result, if you attempted to cast pointers (or related data) to int, you'd be in trouble (hence we have size_t, ssize_t, ptrdiff_t). Of course, the same trick done on 64-bit architectures (i.e. casting pointers to long) is ugly, but technically it won't cause problems on 32-bit architectures.
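
To illustrate the classic porting bug, here is a deliberately broken sketch (made up for this post, not taken from real code). On a typical 64-bit Linux system, stack addresses don't fit in 32 bits, so the int variable silently loses the upper half of the pointer, while intptr_t keeps it intact:

#include <stdint.h>
#include <stdio.h>

int main() {
    char buf[1];
    char *p = buf;

    /* Broken: int cannot hold a 64-bit pointer, so the value is truncated. */
    int as_int = (int) (intptr_t) p;
    /* Fine: intptr_t is defined to be wide enough for any pointer. */
    intptr_t as_intptr = (intptr_t) p;

    printf("original: %p\n", (void *) p);
    printf("via int:  %p\n", (void *) (intptr_t) as_int);
    printf("intptr_t: %p\n", (void *) as_intptr);
    return 0;
}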

Note that I’m talking about System V ABI here. Technically, the POSIX and the C standards don’t specify exact integer sizes, and permit a lot more flexibility (the C standard especially — up to having, say, all the types exactly 32-bit).

Address space size

Now, a more likely problem is the address space limitation. Since pointers are 32-bit on 32-bit architectures, a program can address no more than 4 GiB of memory (in reality, somewhat less than that). What's really important here is that this limits allocated memory, even if it is never actually used.

This can cause curious issues. For example, let's say that you have a program that allocates a lot of memory, but doesn't use most of it. If you run this program on a 64-bit system with 2 GiB of total memory, it works just fine. However, if you run it in a 32-bit userland on a machine with plenty of memory, it fails. And why is that? It's because the 64-bit system permitted the program to allocate more memory than it could ever provide — risking an OOM if the program actually tried to use it all; but on the 32-bit architecture, it simply cannot fit all these allocations into 32-bit addresses.

The following sample can trivially demonstrate this:

$ cat > mem-demo.c <<EOF
#include <stdlib.h>
#include <stdio.h>

int main() {
    void *allocs[100];
    int i, j;
    FILE *urandom = fopen("/dev/urandom", "r");

    for (i = 0; i < 100; ++i) {
        allocs[i] = malloc(1024 * 1024 * 1024);
        if (!allocs[i]) {
            printf("malloc for i = %d failed\n", i);
            return 1;
        }
        fread(allocs[i], 1024, 1, urandom);
    }

    for (i = 0; i < 100; ++i)
        free(allocs[i]);
    fclose(urandom);

    return 0;
}
EOF
$ cc -m64 mem-demo.c -o mem-demo && ./mem-demo
$ cc -m32 mem-demo.c -o mem-demo && ./mem-demo 
malloc for i = 3 failed

The program allocates a grand total of 100 GiB of memory, but uses only the first KiB of each allocation. This works just fine on 64-bit architectures but fails on 32-bit ones because one of the allocations fails.

At this point, it's probably worth noting that we are talking about limitations applicable to a single process. A 32-bit kernel can utilize more than 4 GiB of memory, and therefore multiple processes can use a total of more than 4 GiB. There are also cursed ways of making it possible for a single process to access more than 4 GiB of memory. For example, one could use memfd_create() (or equivalently, files on tmpfs) to create in-memory files that exceed the process's address space, or use IPC to exchange data between multiple processes having separate address spaces (thanks to Arsen Arsenović and David Seifert for their hints on this).
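
For the curious, a minimal sketch of the memfd_create() approach could look like the following. It assumes a Linux system with glibc ≥ 2.27; on a 32-bit target it additionally needs -D_FILE_OFFSET_BITS=64, so that the file offsets themselves are 64-bit:

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    /* The backing "file" lives in kernel memory and can be larger than the
     * 32-bit address space, as long as we only map small windows of it. */
    const off_t total = 6LL * 1024 * 1024 * 1024;   /* 6 GiB */
    const size_t window = 16 * 1024 * 1024;         /* 16 MiB window */
    int fd = memfd_create("big-buffer", 0);

    if (fd == -1 || ftruncate(fd, total) == -1) {
        perror("memfd setup failed");
        return 1;
    }

    /* Map a window near the end and write through it; the offset passed to
     * mmap() must be page-aligned. */
    char *p = mmap(NULL, window, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, total - (off_t) window);
    if (p == MAP_FAILED) {
        perror("mmap() failed");
        return 1;
    }
    memcpy(p, "hello", 6);
    munmap(p, window);
    close(fd);

    return 0;
}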

Large File Support

Another problem faced by 32-bit programs is that the file-related types are traditionally 32-bit. This has two implications. The more obvious one is that off_t, the type used to express file sizes and offsets, is a signed 32-bit integer, so you cannot stat() and therefore open files larger than 2 GiB. The less obvious implication is that ino_t, the type used to express inode numbers, is also 32-bit, so you cannot open files with inode numbers 2^32 and higher. In other words, given a large enough filesystem, you may suddenly be unable to open random files, even if they are smaller than 2 GiB.

Now, this is a problem that can be solved. Modern programs usually define _FILE_OFFSET_BITS=64 and get 64-bit types instead. In fact, musl libc unconditionally provides 64-bit types, rendering this problem a relic of the past — and apparently glibc is planning to switch the default in the future as well.

Here’s a trivial demo:

$ cat > lfs-demo.c <<EOF
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main() {
    int fd = open("lfs-test", O_RDONLY);

    if (fd == -1) {
        perror("open() failed");
        return 1;
    }

    close(fd);
    return 0;
}
EOF
$ truncate -s 2G lfs-test
$ cc -m64 lfs-demo.c -o lfs-demo && ./lfs-demo
$ cc -m32 lfs-demo.c -o lfs-demo && ./lfs-demo 
open() failed: Value too large for defined data type
$ cc -m32 -D_FILE_OFFSET_BITS=64 lfs-demo.c \
    -o lfs-demo && ./lfs-demo

Unfortunately, while fixing a single package is trivial, a global switch is not. The sizes of off_t and ino_t change, and so effectively does the ABI of any libraries that use these types in their API — i.e. if you rebuild the library without rebuilding the programs using it, they could break in unexpected ways. What you can do is either switch everything simultaneously, or go slowly and change the types via a new API, preserving the old one for compatibility. The latter is unlikely to happen, given there's very little interest in 32-bit architecture support these days. The former also isn't free of issues — technically speaking, you may end up introducing an incompatibility with prebuilt software that used the 32-bit types, and effectively lose the ability to run some proprietary software entirely.
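
To make the ABI hazard more concrete, imagine a library exporting a function like this (a made-up signature, purely for illustration):

#include <sys/types.h>

/* With the default 32-bit off_t the caller passes a 4-byte offset; with
 * _FILE_OFFSET_BITS=64 it passes an 8-byte one.  If the library and the
 * program are built with different settings, they disagree on the argument
 * layout -- a silent ABI break. */
ssize_t read_chunk(int fd, void *buf, size_t len, off_t offset);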

time_t and the y2k38 problem

The low-level way of representing timestamps in C is through the number of seconds since the so-called epoch. This number is stored in the time_t type, which, as you can probably guess, was a signed 32-bit integer on 32-bit architectures. This means that it can hold positive values up to 2^31 − 1 seconds, which roughly corresponds to 68 years. Since the epoch on POSIX systems was defined as the beginning of 1970, this means that the type can express timestamps up to early 2038.

What does this mean in practice? Programs using 32-bit time_t can't express dates beyond the 2038 cutoff. If you try to do arithmetic spanning beyond this date (e.g. "20 years from now"), you get an overflow. stat() is going to fail on files with timestamps beyond that point (though, interestingly, open() works on glibc, so it's not entirely symmetric with the LFS case). Past the overflow date, you get an error even when trying to get the current time — and if your program doesn't account for the possibility of time() failing, it's going to be forever stuck 1 second before the epoch, i.e. at 1969-12-31 23:59:59. Effectively, it may end up hanging randomly (waiting for some wall clock time to pass), not firing events, or seeding a PRNG with a constant.

Again, modern glibc versions provide a switch. If you define _TIME_BITS=64 (plus LFS flags, as a prerequisite), your program is going to get a 64-bit time_t. Modern versions of musl libc also default to the 64-bit type (since 1.2.0). Unfortunately, switching to the 64-bit type brings the same risks as switching to LFS globally — or perhaps even worse because time_t seems to be more common in library API than file size-related types were.

These solutions only work for software that is built from source, and uses time_t correctly. Converting timestamps to int will cause overflow bugs. File formats with 32-bit timestamp fields are essentially broken. Most importantly, all proprietary software will remain broken and in need of serious workarounds.

Here are some samples demonstrating the problems. Please note that the first sample assumes the system clock is set beyond 2038.

$ cat > time-test.c <<EOF
#include <stdio.h>
#include <time.h>

int main() {
    time_t t = time(NULL);

    if (t != -1) {
        struct tm *dt = gmtime(&t);
        char out[32];

        strftime(out, sizeof(out), "%F %T", dt);
        printf("%s\n", out);
    } else
        perror("time() failed");

    return 0;
}
EOF
$ cc -m64 time-test.c -o time-test && ./time-test
2060-03-04 11:13:02
$ cc -m32 time-test.c -o time-test && ./time-test
time() failed: Value too large for defined data type
$ cc -m32 -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 \
    time-test.c -o time-test && ./time-test
2060-03-04 11:13:32
$ cat > mtime-test.c <<EOF
#include <fcntl.h>
#include <sys/stat.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main() {
    struct stat st;
    int fd;

    if (stat("mtime-data", &st) == 0) {
        char buf[32];
        struct tm *tm = gmtime(&st.st_mtime);
        strftime(buf, sizeof(buf), "%F %T", tm);
        printf("mtime: %s\n", buf);
    } else
        perror("stat() failed");

    fd = open("mtime-data", O_RDONLY);
    if (fd == -1) {
        perror("open() failed");
        return 1;
    }
    close(fd);

    return 0;
}
EOF
$ touch -t '206001021112' mtime-data
$ cc -m64 mtime-test.c -o mtime-test && ./mtime-test
mtime: 2060-01-02 10:12:00
$ cc -m32 mtime-test.c -o mtime-test && ./mtime-test
stat() failed: Value too large for defined data type
$ cc -m32 -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 \
    mtime-test.c -o mtime-test && ./mtime-test
mtime: 2060-01-02 10:12:00

Are these problems specific to C?

It is probably worth noting that while portability issues are generally discussed in terms of C, not all of them are specific to C, or to programs directly interacting with C API.

For example, address space limitations affect all programming languages, unless they take special effort to work around them (I'm not aware of any that do). So a Python program will be limited by the 4 GiB of address space the same way C programs are — except that Python programs don't allocate memory explicitly, so the limit will apply to memory actually used rather than memory allocated. On the minus side, Python programs will probably be less memory efficient than C programs.

File and time type sizes also sometimes affect programming languages internally. Modern versions of Python are built with Large File Support enabled, so they aren’t limited to 32-bit file sizes and inode numbers. However, they are limited to 32-bit timestamps:

>>> import datetime
>>> datetime.datetime(2060, 1, 1)
datetime.datetime(2060, 1, 1, 0, 0)
>>> datetime.datetime(2060, 1, 1).timestamp()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: timestamp out of range for platform time_t

Other generic issues

Byte order (endianness)

The predominant byte order nowadays is little endian. x86 was always little endian. ARM is bi-endian, but defaults to running little endian (and there was never much incentive to run big endian ARM). PowerPC used to default to big endian, but these days PPC64 systems mostly run little endian instead.

It’s not that either byte order is superior in some way. It’s just that x86 happened to arbitrarily use that byte order. Given its popularity, a lot of non-portable software has been written that worked correctly on little endian only. Over time, people lost the incentive to run big endian systems and this eventually led to even worse big endian support overall.

The most common issues related to byte order occur when implementing binary data formats, particularly file formats and network protocols. A missing byte order conversion can lead to the program throwing an error or incorrectly reading files written on other platforms, writing incorrect files or failing to communicate with peers on other platforms correctly. In extreme cases, a program that missed some byte order conversions may be unable to read a file it has written before.
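
The usual fix is to convert every multi-byte field explicitly from the byte order defined by the format to the host order. Here is a minimal sketch, assuming a big endian ("network order") length field, as is common in network protocols:

#include <arpa/inet.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main() {
    /* Four bytes as they would arrive on the wire: 0x00000400 = 1024. */
    const unsigned char wire[4] = {0x00, 0x00, 0x04, 0x00};
    uint32_t length;

    memcpy(&length, wire, sizeof(length));

    /* Wrong on little endian hosts: prints 262144 instead of 1024. */
    printf("raw:     %" PRIu32 "\n", length);
    /* Correct everywhere: converts from big endian to host byte order. */
    printf("ntohl(): %" PRIu32 "\n", ntohl(length));
    return 0;
}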

Again, byte order problems are not limited to C. For example, the struct module in Python uses explicit byte order, size and alignment modifiers.

Curiously enough, byte order issues are not limited to low-level data formats either. To give another example, the UTF-16 and UTF-32 encodings also have little endian and big endian variants. When the user does not request a specific byte order, Python uses the host's byte order and prepends a BOM, which is then used to detect the correct byte order when decoding.

>>> "foo".encode("UTF-16LE")
b'f\x00o\x00o\x00'
>>> "foo".encode("UTF-16BE")
b'\x00f\x00o\x00o'
>>> "foo".encode("UTF-16")
b'\xff\xfef\x00o\x00o\x00'

char signedness

This is probably one of the most confusing portability problems you may see. Roughly, the problem is that the C standard does not specify the signedness of the char type (unlike int). Some platforms define it as signed, others as unsigned. In fact, the standard goes a step further and defines char as a distinct type from both signed char and unsigned char, rather than an alias to either of them.

For example, the System V ABI for x86 and SPARC specifies that char is signed, whereas for MIPS and PowerPC it is unsigned. Assuming either and doing arithmetic on top of that could lead to surprising results on the other set of platforms. In fact, one of the most confusing cases I’ve seen was with code that was used only for big endian platforms, and therefore worked on PowerPC but not on SPARC (even though it would also fail on x86, if it was used there).

Here is an example inspired by it. The underlying idea is to read a little endian 32-bit unsigned integer from a char array:

$ cat > char-sign.c <<EOF
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main() {
        char buf[] = {0x00, 0x40, 0x80, 0xa0};
        char *p = buf;
        uint32_t val = 0;

        val |= (*p++);
        val |= (*p++) << 8;
        val |= (*p++) << 16;
        val |= (*p++) << 24;

        printf("%08" PRIx32 "\n", val);
}
EOF
$ cc -funsigned-char char-sign.c -o char-sign
$ ./char-sign
a0804000
$ cc -fsigned-char char-sign.c -o char-sign
$ ./char-sign
ff804000

Please note that for the sake of demonstration, the example uses -fsigned-char and -funsigned-char switches to override the default platform signedness. In real code, you’d explicitly use unsigned char instead.
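
For completeness, here is how the portable version could look. It uses unsigned char for the buffer, and casts to uint32_t before shifting so that the top-byte shift is well-defined as well; it prints a0804000 regardless of the platform's char signedness:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main() {
        /* unsigned char guarantees the bytes are never sign-extended. */
        unsigned char buf[] = {0x00, 0x40, 0x80, 0xa0};
        unsigned char *p = buf;
        uint32_t val = 0;

        val |= (uint32_t) *p++;
        val |= (uint32_t) *p++ << 8;
        val |= (uint32_t) *p++ << 16;
        val |= (uint32_t) *p++ << 24;

        printf("%08" PRIx32 "\n", val);
}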

Strict alignment

I feel that alignment is not a well-known problem, so perhaps I should start by explaining it a bit. Long story short, alignment is about ensuring that particular types are placed across appropriate memory boundaries. For example, on most platforms 32-bit types are expected to be aligned at 32-bit (= 4 byte) boundaries. In other words, you expect that the type’s memory address would be a multiple of 4 bytes — irrespective of whether it’s on stack or heap, used directly, in an array, a structure or perhaps an array of structures.

Perhaps the simplest way to explain that is to show how the compiler achieves alignment in structures. Please consider the following type:

struct {
    int16_t a;
    int32_t b;
    int16_t c;
}

As you can see, it contains two 2-byte types and one 4-byte type — that would be a total of 8 bytes, right? Nothing could be further from the truth, at least on platforms requiring 32-bit alignment for int32_t. To guarantee that b is correctly aligned whenever the whole structure is correctly aligned, the compiler needs to move it to an offset that is a multiple of 4. Furthermore, to guarantee that every instance is correctly aligned if the structure is used in an array, it also needs to pad the structure's size to a multiple of 4.

Effectively, the resulting structure resembles the following:

struct {
    int16_t a;
    int16_t _pad1;
    int32_t b;
    int16_t c;
    int16_t _pad2;
}

In fact, you can find some libraries actually defining structures with explicit padding. So you get a padding of 2 + 2 bytes, b at offset 4, and a total size of 12 bytes.
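
You can easily verify the layout yourself with offsetof() and sizeof(); the numbers below assume the usual 4-byte alignment for int32_t:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct demo {
    int16_t a;
    int32_t b;
    int16_t c;
};

int main() {
    printf("offsetof(b): %zu\n", offsetof(struct demo, b));  /* 4 */
    printf("sizeof:      %zu\n", sizeof(struct demo));       /* 12 */
    return 0;
}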

Now, what would happen if the alignment requirements weren’t met? On the majority of platforms, misaligned types are still going to work, usually at a performance penalty. However, on some platforms like SPARC, they will actually cause the program to terminate with a SIGBUS. Consider the following example:

$ cat > align-test.c <<EOF
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main() {
	uint8_t buf[6] = {0, 0, 0, 4, 0, 0};
	int32_t *number = (int32_t *) &buf[2];
	printf("%" PRIi32 "\n", *number);
	return 0;
}
EOF
$ cc align-test.c -o align-test
$ ./align-test
1024

The code is meant to resemble a cheap way of reading data from a file, and then getting a 32-bit integer at offset 2. However, on SPARC this code will not work as expected:

$ ./align-test
Bus error (core dumped)

As you can probably guess, there is a fair number of programs suffering from issues like this simply because they don't crash on x86, and it's easy to silence the usual compiler warnings (e.g. by type punning, as used in the example). However, as noted before, this code will not only cause a crash on SPARC — it may also incur a performance penalty everywhere else.
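
The usual portable fix is to memcpy() the bytes into a properly aligned variable instead of dereferencing a misaligned pointer; compilers turn this into a single load on platforms where unaligned access is cheap. A fixed version of the example could look like this:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main() {
	uint8_t buf[6] = {0, 0, 0, 4, 0, 0};
	int32_t number;

	/* Copy the bytes instead of dereferencing a misaligned pointer;
	 * this is well-defined and safe on strict-alignment platforms. */
	memcpy(&number, &buf[2], sizeof(number));
	printf("%" PRIi32 "\n", number);
	return 0;
}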

Stack size

As low-level C programmers tend to learn, there are two main kinds of memory available to the program: the heap and the stack. The heap is the main memory area from which explicit allocations are done. The stack is a relatively small area of memory that is given to the program for its immediate use.

The main difference is that the use of the heap is controlled — a well-written program allocates as much memory as it needs, and doesn't access areas outside of that. On the other hand, stack use is "uncontrolled" — programs generally don't check stack bounds. As you may guess, this means that if a program uses too much of it, it's going to exceed the available stack — i.e. hit a stack overflow, which generally manifests itself as a "weird" segmentation fault.

And how do you actually use a lot of stack memory? In C, local function variables are kept on stack — so the more variables you use, the more stack you fill. Furthermore, some ABIs use stack to pass function parameters and return values — e.g. x86 (but not the newer amd64 or x32 ABIs). But most importantly, stack frames are used to record the function call history — and this means the deeper you call, the larger the stack use.

This is precisely why programmers are cautioned against recursive algorithms — especially if built without protection against deep recursion, they provide a trivial way to cause a stack overflow. And this last problem is not limited to C — recursive function calls in Python also result in recursive function calls in C. Python comes with a default recursion limit to prevent this from happening. However, as we recently found out the hard way, this limit needs to be adjusted across different architectures and compiler configurations, as their stack frame sizes may differ drastically: from a baseline of 8–16 bytes on common architectures such as x86 or ARM, through 112–128 bytes on PPC64, up to 160–176 bytes on s390x and SPARC64.

On top of that, the default thread stack size varies across the standard C libraries. On glibc, it is usually between 2 MiB and 10 MiB, whereas on musl it is 128 KiB. Therefore, in some cases you may actually need to explicitly request a larger stack.
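
When a particular thread genuinely needs more stack (say, for a deeply recursive algorithm), the portable approach is to request a larger stack explicitly when creating it. A minimal POSIX threads sketch (build with -pthread; the 8 MiB figure is an arbitrary example):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg) {
    (void) arg;
    /* ... something that needs a lot of stack ... */
    return NULL;
}

int main() {
    pthread_attr_t attr;
    pthread_t thread;

    /* Request an 8 MiB stack instead of relying on the libc default,
     * which may be as small as 128 KiB (e.g. on musl). */
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 8 * 1024 * 1024);

    if (pthread_create(&thread, &attr, worker, NULL) != 0) {
        fprintf(stderr, "pthread_create() failed\n");
        return 1;
    }
    pthread_join(thread, NULL);
    pthread_attr_destroy(&attr);

    return 0;
}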

The wondrous world of floating-point types

x87 math

The x86 platform supports two modes of floating-point arithmetic:

  1. The legacy 387 floating-point arithmetic that utilizes 80-bit precision registers (-mfpmath=387).
  2. The more modern SSE arithmetic that supports all of 32-bit, 64-bit and 80-bit precision types (-mfpmath=sse).

The former is the default on 32-bit x86 platforms using the System V ABI, the latter everywhere else. And why does that matter? Because the former may imply performing some computations using the extended 80-bit precision before converting the result back to the original type, effectively implying a smaller rounding error than performing the same computations on the original type directly.

Consider the following example:

$ cat > float-demo.c <<EOF
#include <stdio.h>

__attribute__((noipa))
double fms(double a, double b, double c) {
	return a * b - c;
}

int main() {
	printf("%+.40f\n", fms(1./3, 1./3, 1./9));
	return 0;
}
EOF
$ cc -mfpmath=sse float-demo.c -o float-demo
$ ./float-demo
+0.0000000000000000000000000000000000000000
$ cc -mfpmath=387 float-demo.c -o float-demo
$ ./float-demo
-0.0000000000000000061663998560113064684174

What’s happening here? The program is computing 1/3 * 1/3 - 1/9, which we know should be zero. Except that it isn’t when using x87 FPU instructions. Why?

Normally, this computation is done in two steps. First, the multiplication 1/3 * 1/3 is done. Afterwards, 1/9 is subtracted from the result. In SSE mode, both steps are done directly on the double type. However, in x87 mode the doubles are converted to 80-bit floats first, both computations are done on these and then the result is converted back to double. We can see that looking at the respective assembly fragments:

$ cc -mfpmath=sse float-demo.c -S -o -
[…]
	movsd	-8(%rbp), %xmm0
	mulsd	-16(%rbp), %xmm0
	subsd	-24(%rbp), %xmm0
[…]
$ cc -mfpmath=387 float-demo.c -S -o -
[…]
	fldl	-8(%rbp)
	fmull	-16(%rbp)
	fsubl	-24(%rbp)
	fstpl	-32(%rbp)
[…]

Now, neither ⅓ nor ⅑ can be precisely expressed in the binary system. So 1./3 is actually ⅓ plus some error, and 1./9 is ⅑ plus another error. It happens that 1./3 * 1./3 after rounding gives the same value as 1./9 — so subtracting one from the other yields zero. However, when the computations are done using an intermediate type of higher precision, the squared error from 1./3 * 1./3 is rounded at that higher precision — and is therefore different from the error in 1./9. So counter-intuitively, higher precision here amplifies the rounding error and yields the "incorrect" result!

Of course, this is not that big of a deal — we are talking about 17 decimal places, and user-facing programs will probably round that down to 0. However, this can lead to problems in programs written to expect an exact value — e.g. in test suites.

Gentoo has already switched amd64 multilib profiles to force -mfpmath=sse for 32-bit builds, and it is planning to switch the x86 profiles as well. While this doesn’t solve the underlying issue, it yields more consistent results across different architectures and therefore reduces the risk of our users hitting these bugs. However, this has a surprising downside: some packages actually adapted to expect different results on 32-bit x86, and now fail when SSE arithmetic is used there.

It doesn’t take two architectures to make a rounding problem

Actually, you don’t have to run a program on two different architectures to see rounding problems — different optimization flags, particularly ones enabling additional CPU instruction sets, can also result in different rounding errors. Let’s try compiling the previous example with and without FMA instructions:

$ cc -mno-fma -O2 float-demo.c -o float-demo
$ ./float-demo
+0.0000000000000000000000000000000000000000
$ cc -mfma -O2 float-demo.c -o float-demo
$ ./float-demo
-0.0000000000000000061679056923619804377437

The first invocation is roughly the same as before. The second one enables use of the FMA instruction set that performs the multiplication and subtraction in one step:

$ cc -mfma -O2 float-demo.c -S -o -
[…]
	vfmsub132sd	%xmm1, %xmm2, %xmm0
[…]

Again, this means that the intermediate value is not rounded down to double — and therefore doesn’t carry the same error as 1./9.

The bottom line is this: never compare floating-point computation results exactly; allow for some error. Even if something works for you, it may fail not only on a different architecture, but even with different optimization flags. And counter-intuitively, more precise intermediate results may amplify errors and yield intuitively “wrong” values.
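
In practice, that means comparing against a tolerance rather than using ==. A trivial sketch (an absolute-epsilon check; real test suites often need a relative tolerance as well):

#include <math.h>
#include <stdio.h>

/* Compare two doubles with an absolute tolerance instead of ==. */
static int roughly_equal(double a, double b, double eps) {
    return fabs(a - b) <= eps;
}

int main() {
    double result = 1./3 * 1./3 - 1./9;

    if (roughly_equal(result, 0.0, 1e-12))
        printf("close enough to zero\n");
    else
        printf("unexpected result: %g\n", result);
    return 0;
}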

The long double type

As you can probably guess by now, the C standard doesn’t define precisely what float, double and long double types are. Fortunately, it seems that the first two types are uniformly implemented as, respectively, a single-precision (32-bit) and a double-precision (64-bit) IEEE 754 floating point number. However, as far as the third type is concerned, we might find it to be any of:

  • the same type as double — on architectures such as 32-bit ARM,
  • the 80-bit x87 extended precision type — on amd64 and x86,
  • a type implementing double-double arithmetic — i.e. representing the number as a sum of two double values, giving roughly 106-bit precision, e.g. on PowerPC,
  • the quadruple precision (128-bit) IEEE 754 type — e.g. on SPARC.

Once again, this is primarily a matter of precision, and therefore it only breaks test suites that assume specific precision for the type. To demonstrate the differences in precision, we can use the following sample program:

#include <stdio.h>

int main() {
	printf("%0.40Lf\n", 1.L/3);
	return 0;
}

Running it across different architectures, we’re going to see:

arm64: 0.3333333333333333333333333333333333172839
ppc64: 0.3333333333333333333333333333333292246828
amd64: 0.3333333333333333333423683514373792036167
arm32: 0.3333333333333333148296162562473909929395

Summary

Portability is no trivial matter, that’s clear. What’s perhaps more surprising is that portability problems aren’t limited to C and similar low-level languages — I have shown multiple examples of how they leak into Python.

Perhaps the most common portability issues these days come from 32-bit architectures. Many projects today are tested only on 64-bit systems, and therefore face regressions on 32-bit platforms. Perhaps surprisingly, most of the issues stem not from incorrect type use in C, but rather from platform limitations — available address space, lack of support for large files or large time_t. All of these limitations apply to non-C programs that are built on C runtime as well, and sometimes require non-trivial fixes. Notably, switching to a 64-bit time_t is going to be a major breaking change (and one that I’ll cover in a separate post).

Other issues may be more obscure, and specific to individual architectures. On PPC64 or SPARC, we hit issues related to big endian byte order. On MIPS and PowerPC, we may be surprised by char being unsigned. On SPARC, we’re going to hit crashes if we don’t align types properly. Again, on PPC64 and SPARC we are also more likely to hit stack overflows. And on i386, we may discover problems due to different precision in floating-point computations.

These are just some examples, and they definitely do not exhaust the possible issues. Furthermore, sometimes you may discover a combination of two different problems, furthering your confusion — just like the package that was broken only on big endian systems with signed char.

On the other hand, all these differences provide an interesting opportunity: by testing the package on a bunch of architectures and knowing their characteristics, you can guess what could be wrong with it. Say, if it fails on PPC64 but passes on PPC64LE, you may guess it’s a byte order issue — and then it turns out it was actually a stack overflow, because big endian PPC64 happens to default to the ELFv1 ABI, which uses slightly larger stack frames. But hey, usually it does help.

Portability is important. The problematic architectures may constitute a tiny portion of your user base — in fact, sometimes I do wonder if some of the programs we’re fixing are actually going to be used by any real user of these architectures, or if we’re merely cargo culting keywords added a long time ago. You may even argue that it’s better for the environment if people discarded these machines rather than kept having them burn energy. However, portability makes for good code. What may seem like bothering for a tiny minority today, may turn out to prevent a major security incident for all your users tomorrow.
