Michał Górny

PyPy is back, and for real this time!

As you may recall, I was looking for a dedicated PyPy maintainer for quite some time. Sadly, all the people who helped (and who I’d like to thank a lot) ended up lacking time soon enough. So finally I’ve decided to look into the hacks reducing build-time memory use and take care of the necessary ebuild and packaging work myself.

Continue reading “PyPy is back, and for real this time!”

Password security in network applications

While we have many interesting modern authentication methods, password authentication is still the most popular choice for network applications. It’s simple, it doesn’t require any special hardware, it doesn’t discriminate anyone in particular. It just works™.

The key requirement for maintaining security of a secret-based authentication mechanism is the secrecy of the secret (password). Therefore, it is very important for the designer of network applications regard the safety of password as essential and do their best to protect it.

In particular, the developer can affect the security of password
in three manners:

through the security of server-side key storage,
through the security of the secret transmission,
through encouraging user to follow the best practices.

I will expand on each of them in order.

Continue reading “Password security in network applications”

Bash pitfalls: globbing everywhere!

Bash has many subtle pitfalls, some of them being able to live unnoticed for a very long time. A common example of that kind of pitfall is ubiquitous filename expansion, or globbing. What many script writers forget about to notice is that practically anything that looks like a pattern and is not quoted is subject to globbing, including unquoted variables.

There are two extra snags that add up to this. Firstly, many people forget that not only asterisks (*) and question marks (?) make up patterns — square brackets ([) do that as well. Secondly, by default bash (and POSIX shell) take failed expansions literally. That is, if your glob does not match any file, you may not even know that you are globbing.

It’s all just a matter of running in the proper directory for the result to change. Of course, it’s often unlikely — maybe even close to impossible. You can work towards preventing that by running in a safe directory. But in the end, writing predictable software is a fine quality.

How to notice mistakes?

Bash provides a two major facilities that could help you stop mistakes — shopts nullglob and failglob.

The nullglob option is a good choice for a default for your script. After enabling it, failing filename expansions result in no parameters rather than verbatim pattern itself. This has two important implications.

Firstly, it makes iterating over optional files easy:

for f in a/* b/* c/*; do
    some_magic "${f}"
done

Without nullglob, the above may actually return a/* if no file matches the pattern. For this reason, you would need to add an additional check for existence of file inside the loop. With nullglob, it will just ‘omit’ the unmatched arguments. In fact, if none of the patterns match the loop won’t be run even once.

Secondly, it turns every accidental glob into null. While this isn’t the most friendly warning and in fact it may have very undesired results, you’re more likely to notice that something is going wrong.

The failglob option is better if you can assume you don’t need to match files in its scope. In this case, bash treats every failing filename expansion as a fatal error and terminates execution with an appropriate message.

The main advantage of failglob is that it makes you aware of any mistake before someone hits it the hard way. Of course, assuming that it doesn’t accidentally expand into something already.

There is also a choice of noglob. However, I wouldn’t recommend it since it works around mistakes rather than fixing them, and makes the code rely on a non-standard environment.

Word splitting without globbing

One of the pitfalls I myself noticed lately is the attempt of using unquoted variable substitution to do word splitting. For example:

for i in ${v}; do
    echo "${i}"
done

At a first glance, everything looks fine. ${v} contains a whitespace-separated list of words and we iterate over each word. The pitfall here is that words in ${v} are subject to filename expansion. For example, if a lone asterisk would happen to be there (like v='10 * 4'), you’d actually get all files in the current directory. Unexpected, isn’t it?

I am aware of three solutions that can be used to accomplish word splitting without implicit globbing:

setting shopt -s noglob locally,
setting GLOBIGNORE='*' locally,
using the swiss army knife of read to perform word splitting.

Personally, I dislike the first two since they require set-and-restore magic, and the latter also has the penalty of doing the globbing then discarding the result. Therefore, I will expand on using read:

read -r -d '' -a words <<<"${v}"
for i in "${words[@]}"; do
    echo "${i}"
done

While normally read is used to read from files, we can use the here string syntax of bash to feed the variable into it. The -r option disables backslash escape processing that is undesired here. -d '' causes read to process the whole input and not stop at any delimiter (like newline). -a words causes it to put the split words into array ${words[@]} — and since we know how to safely iterate over an array, the underlying issue is solved.

The Council and the Community

A new Council election is in progress and we have a few candidates. Most of them have written a manifesto. For some of them this is one of the few mails they sent to the public mailing lists recently. For one of them this is the only one. Do we want to elect people who do not participate actively in the Community? Does such election even make sense?

Continue reading “The Council and the Community”

Inlining -march=native for distcc

-march=native is a gcc flag that enables auto-detection of CPU architecture and properties. Not only it allows you to avoid finding the correct value of -march= but also enables instruction sets that do not fit any standard CPU profile and detects the cache sizes.

Sadly, -march=native itself can’t really work well with distcc. Since the detection is performed when compiling, remote gcc invocations would use the architecture of the distcc host rather than the client. Therefore, the resulting executables would be a mix of different architectures used by distcc.

You may also find -march=native a bit opaque. For example, we had multiple bug reports about LLVM failing to build with -march=atom. However, some of the reporters were using -march=native, so we wasn’t able to immediately identify the duplicates.

In this article, I will guide you shortly on replacing -march=native with expanded compiler flags, for the benefit of distcc compatibility and more explicit build logs.

Continue reading “Inlining -march=native for distcc”