How Debuggers Work: Getting and Setting x86 Registers, Part 2: XSAVE

In the previous part of this article, I have described the basic methods of getting and setting the baseline registers of 32-bit and 64-bit x86 CPUs. I have covered General Purpose Registers, baseline Floating-Point Registers and Debug Registers along with their ptrace(2) interface.

In the second part, I would like to discuss the XSAVE family of instructions. I will describe the different variants of this instruction as well as explain the differences between them and their limitations. Afterwards, I will compare the ptrace(2) API used to access its data on Linux, FreeBSD and NetBSD. Other systems such as OpenBSD or DragonFly BSD do not provide requests to retrieve or set extended registers, so the comparison may help them design their own APIs.

Continue reading

How Debuggers Work: Getting and Setting x86 Registers, Part 1

In this article, I would like to shortly describe the methods used to dump and restore the different kinds of registers on 32-bit and 64-bit x86 CPUs. The first part will focus on General Purpose Registers, Debug Registers and Floating-Point Registers up to the XMM registers provided by the SSE extension. I will explain how their values can be obtained via the ptrace(2) interface.

The ptrace(2) API is commonly used in all modern BSD systems and Linux, as all of them derive it from the original form designed and implemented in 4.3BSD. The primary focus in this article is on the FreeBSD and NetBSD systems. Nevertheless, the users of other Operating Systems such as OpenBSD, DragonFly BSD or Linux can still benefit from this article as the basic principles are the same and the code examples are intended to be easily adapted to other platforms.

A single CPU (in modern hardware: CPU core or CPU thread, if hyperthreading is available) can execute only one program thread at a time. In order to be able to run multiple processes and threads quasi-simultaneously, the Operating System must perform context switching — that is periodically suspend the currently running thread, save its state, restore the saved state of another thread and resume it. Saving and restoring the values of the processor’s registers play an important part in context switching. It is important that this process is fully transparent to the process being switched, and in a properly implemented kernel there should be no side effects that are perceptible to the program.

Continue reading

Few notes on locale craziness

Back in the EAPI 6 guide I shortly noted that we have added a sanitization requirement for locales. Having been informed of another locale issue in Python (pre-EAPI 6 ebuild), I have decided to write a short note of locale curiosities that could also serve in reporting issues upstream.

When l10n and i18n are concerned, most of the developers correctly predict that the date and time format, currencies, number formats are going to change. It’s rather hard to find an application that would fail because of changed system date format; however, much easier to find one that does not respect the locale and uses hard-coded format strings for user display. You can find applications that unconditionally use a specific decimal separator but it’s quite rare to find one that chokes itself combining code using hard-coded separator and system routines respecting locales. Some applications rely on English error messages but that’s rather obviously perceived as mistake. However, there are also two hard cases…

Continue reading “Few notes on locale craziness”

Common filesystem I/O pitfalls

Filesystem I/O is one of the key elements of the standard library in many programming languages. Most of them derive it from the interfaces provided by the standard C library, potentially wrapped in some portability and/or OO sugar. Most of them share an impressive set of pitfalls for careless programmers.

In this article, I would like to shortly go over a few more or less common pitfalls that come to my mind.

Continue reading “Common filesystem I/O pitfalls”

A quick note on portable shebangs

While at first shebangs may seem pretty obvious and well supported, there is a number of not-so-well-known portability issues affecting them. Only during my recent development work, I have hit more than one of them. For this reason, I’d like to write a quick note summarizing how to stay on the safe side and keep your scripts working across various systems.

Please note I will only cover the basic solution to the most important portability issues. If you’d like to know more about shebang handling in various systems, I’d like to recommend you an excellent article ‘The #! magic, details about the shebang/hash-bang mechanism on various Unix flavours’ by Sven Mascheck.

Continue reading “A quick note on portable shebangs”