Why automated gentoo-mirror commits are not signed and how to verify them

Those of you who use my Gentoo repository mirrors may have noticed that the repositories are constructed of original repository commits automatically merged with cache updates. While the original commits are signed (at least in the official Gentoo repository), the automated cache updates and merge commits are not. Why?

Actually, I was wondering about signing them more than once, even discussed it a bit with Kristian. However, each time I decided against it. I was seriously concerned that those automatic signatures would not be able to provide sufficient security level — and could cause the users to believe the commits are authentic even if they were not. I think it would be useful to explain why.

Verifying the original commits

While this may not be entirely clear, by signing the merge commits I would implicitly approve the original commits as well. While this might be worked-around via some kind of policy requesting the developer to perform additional verification, such a policy would be impractical and confusing. Therefore, it only seems reasonable to verify the original commits before signing merges.

The problem with that is that we still do not have an official verification tool for repository commits. There’s the whole Gentoo-keys project that aims to eventually solve the problem but it’s not there yet. Maybe this year’s Summer of Code will change that…

Not having an official verification routines, I would have to implement my own. I’m not saying it would be that hard — but it would always be semi-official, at best. Of course, I could spend a day or two in contributing needed code to Gentoo-keys and preventing some student from getting the $5500 of Google money… but that would be the non-enterprise way of solving the urgent problem.

Protecting the signing key

The other important point is the security of key used to sign commits. For the whole effort to make any sense, it needs to be strongly protected against being compromised. Keeping the key (or even a subkey) unencrypted on the server really diminishes the whole effort (I’m not pointing fingers here!)

Basic rules first. The primary key kept off-line, used to generate signing subkey only. Signing subkey stored encrypted on the server and used via gpg-agent, so that it won’t be kept unencrypted outside the memory. All nice and shiny.

The problem is — this means someone needs to type the password in. Which means there needs to be an interactive bootstrap process. Which means every time server reboots for some reason, or gpg-agent dies, or whatever, the mirrors stop and wait for me to come and type the password in. Hopefully when I’m around some semi-secure device.

Protecting the software

Even all those points considered and solved satisfiably, there’s one more issue: the software. I won’t be running all those scripts in my home. So it’s not just me you have to trust — you have to trust all other people with administrative access to the machine that’s running the scripts, you have to trust the employees of the hosting company that have physical access to the machine.

I mean, any one of them can go and attempt to alter the data somehow. Even if I tried hard, I won’t be able to protect my scripts from this. In the worst case, they are going to add a valid, verified signature to the data that has been altered externally. What’s the value of this signature then?

And this is the exact reason why I don’t do automatic signatures.

How to verify the mirrors then?

So if automatic signatures are not the way, how can you verify the commits on repository mirrors? The answer is not that complex.

As I’ve mentioned, the mirrors use merge commits to combine metadata updates with original repository commits. What’s important is that this preserves the original commits, along with their valid signatures and therefore provides a way to verify them. What’s the use of that?

Well, you can look for the last merge commit to find the matching upstream commit. Then you can use the usual procedure to verify the upstream commit. And then, you can diff it against the mirror HEAD to see that only caches and other metadata have been altered. While this doesn’t guarantee that the alterations are genuine, the danger coming from them is rather small (if any).