Patches and Plaid

This is part of the better tools series.

Sometimes you should question the tools you are using and try to see if there is something better out there. Or build it yourself.

Juggling patches

When collaborating with other people, it is quite common to send changes to the shared codebase back and forth.

This post tries to analyze two commonly used models, explain how they can be improved, and point out which tools (existing or not) are good for the job.

The two models

The focus is on git, GitHub-like web-mediated pull requests and mailing-list-oriented workflows.

The tools in use are always:

  • a web browser
  • an editor
  • a shell
  • an email client

Some people might have all of these integrated in one way or another, which already makes one of the two models far more effective. Below I assume you do not have such a tightly integrated environment.

Pull requests

GitHub made it quite easy to propose patches in the form of ephemeral branches that can be reviewed and merged with a single click in your browser.

The patchset can be part of your master tree or a brand new branch pushed to your repository for this purpose: first you push your changes to GitHub and then you go to your browser to send the pull request (also known as merge request or proposed changeset).

You can get an email notification that a pull request is available and then move to your browser to review it.

You might get a continuous integration report out of it, and if you trust it you may skip fetching the changes and testing them locally.

If something does not work exactly as it should, you can notify the proponents. They might get an email telling them that they have comments, and they have to go to the browser to see them in detail.

Then the changes have to be pushed to the right branch and GitHub helpfully updates the pull request.

Then the reviewer has to get back to the browser and check again.

Once that is done you have your main tree with lots of merge artifacts and possibly some fun time if you want to bisect the history.

Mailing-list mediated

The mailing-list mediated model is fairly popular because Linux uses it and git provides tools for it out of the box.

Once you have a set of patches (say 5) you are happy with you can simply issue

git send-email --compose -5 --to the_mailing@list.org

If you have a local mailer working, that’s it.

If you do not, you end up having to configure one (e.g. configuring gmail with a specific access token, so you do not have to type the password all the time, is fairly easy).
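As a minimal sketch of that configuration, assuming a Gmail account and an app-specific password: the settings below are standard git send-email options, while the address is a placeholder and the choice of credential helper is only one possibility.

```shell
# Placeholder account: replace you@gmail.com with your own address.
git config --global sendemail.smtpServer smtp.gmail.com
git config --global sendemail.smtpServerPort 587
git config --global sendemail.smtpEncryption tls
git config --global sendemail.smtpUser you@gmail.com
# Let a credential helper remember the token after the first prompt.
# "cache" keeps it in memory for a while; other helpers store it permanently.
git config --global credential.helper cache
```

Note that credential.helper is a global git setting, so it also applies to other git operations that ask for a password.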

The people on the mailing list then receive your set in their mailbox as is. They can use git-am to test it locally (first saving the thread using their email client, then running git am over it) and push it to something like oracle if they like the set but aren’t completely sure it won’t break everything.

If they have comments, they can just reply to the specific patch email (using the email Message-Id).
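The save-then-apply step can be tried without any mail setup. The sketch below is only an illustration: it uses two throwaway repositories, and git format-patch --stdout stands in for "saving the thread from the email client". The directory names, identities and file contents are all made up for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Contributor side: a repository with one base commit.
git init -q contributor
cd contributor
git config user.name "Contributor"
git config user.email "contributor@example.org"
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"
cd ..

# Reviewer side: a clone that only has the base commit.
git clone -q contributor reviewer

# The contributor adds two patches on top of the base.
cd contributor
echo one >> file.txt
git commit -q -am "Add first line"
echo two >> file.txt
git commit -q -am "Add second line"
# Stand-in for the saved email thread: the last two commits as one mbox file.
git format-patch --stdout -2 > ../thread.mbox
cd ..

# The reviewer applies the whole thread with a single git am.
cd reviewer
git config user.name "Reviewer"
git config user.email "reviewer@example.org"
git am ../thread.mbox
git log --oneline
```

After git am the reviewer's tree contains the two patches as ordinary commits, with the original author preserved and the reviewer recorded as committer.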

The proponent can then rework the set (maybe using git rebase -i) and send an update and add some comments here and there.

git send-email --annotate -6 --to the_mailing@list.org

Updates to specific patches or rework from other people can happen by just sending the patch back.

git send-email --annotate -1 --in-reply-to patch-msgid

Once the set is good, it can be applied to the tree, resulting in a purely linear history that makes looking for regressions pretty easy.

Where to improve

Pull request based

The weak and the strong point of this method is its web-centricity.

It works quite nicely if you just use webmail, since it is only a matter of switching from one tab to another to see exactly what’s going on and reply in detail.

Yet, if your browser isn’t your shell (and you didn’t configure custom actions to auto-fetch the pull requests) you still have lots of back and forth.

Having continuous integration hooks you can quickly configure is quite nice if the project already has a solid regression and code coverage harness, so the reviewer’s burden of making sure the code doesn’t break is lighter.
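For the auto-fetch part, one possible setup is an extra fetch refspec. This assumes the remote is called origin and is hosted on GitHub, which publishes every pull request under refs/pull/<number>/head. The pull request number 42 below is purely illustrative.

```shell
# Run inside a clone whose "origin" remote points at GitHub.
git config --add remote.origin.fetch '+refs/pull/*/head:refs/remotes/origin/pr/*'
# From now on a plain fetch also brings in every pull request head...
git fetch origin
# ...and a hypothetical pull request number 42 can be inspected locally.
git checkout --detach origin/pr/42
```

This removes one trip to the browser, but commenting on the changes still happens on the web.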

Sending a link to a pull request is easy.

Sadly, if new code does not come with tests, or with tests you can trust, the whole point above is half moot: you have to do the whole fetch-and-test dance.

Reworking sets isn’t exactly perfect either. It makes it quite hard for a third party to provide input in the form of an alternate patch over a set:

  • you have to fetch the code being discussed
  • prepare a new pull request
  • reference it in your comment to the old one

then

  • the initial proponent has to fetch it
  • rebase his branch on it
  • update the pull request accordingly

and so on.

There are desktop tools trying to bridge web and shell, but right now they aren’t an incredible improvement, and the churn during the review can still be higher than on the other side.

Surely, it is really HARD to forget an open pull request.

Mailing list based

The strong point of the approach is that you have fewer steps for the most common actions:

  • sending a set is a single command
  • fetching a set is two commands
  • doing a quick review does not require switching to another application, you just
    reply to the email you received.
  • sending an update or a different approach is always the same git send-email command

It is quite loose, so people can have various degrees of integration, but in general your experience as a reviewer is as good as your email client, and your experience as a proponent is as nice as your sendmail configuration.

People with a basic email client may even have trouble referring to a patch by its Message-Id.

The weakest point of the method is the chance of missing a patch, leaving it either unreviewed or uncommitted after the review.

Ideal situation

My ideal solution would include:

  • Not many compulsory steps: sending a patch should take a habitual contributor the least amount of time.

  • A pre-screening of patches, ideally making sure the new code has tests and that it passes them in some testing environments.

  • Reviewing should take the least amount of time.

  • A means to track patches and make it easy to know whether a set is still pending review or has been committed.

Enter Plaid

I prefer the mailing-list approach since it is much quicker for me: I have a decent email client (that could still improve) and I know how to configure my local SMTP. If I want to contribute to a new project that uses this approach, it is just a matter of finding the email address and typing git send-email --annotate --to email. GitHub gets unwieldy if I just want to send a couple of fixes.

That said, I do see that the mailing-list shortcomings are a limiting factor. While I’m not much concerned with making the initial setup easier (since Federico already has plans for it), I do want to not lose patches and to get some of the nifty features GitHub has without losing the development speed I enjoy.

Plaid is my attempt to improve the situation. Right now it is more or less an easier to deploy patch tracker along the lines of patchwork, with a diverging focus.

It emphasizes two concepts: patch tags, to provide quick grouping, and patch series, to ease reviewing a set.

curl http://plaid.libav.org/project/libav/series/50/mbox | git am -s

is all you need to get all the patches in your working tree.

Right now it works either as a stand-alone tracker (this test deployment is fed by fetching from the mailing list archives) or as a mailbox hook (as patchwork does).

Coming soon

I plan to make it act as a postfix filter, so it injects a useful link to the patch into the email. It will also provide a means to send emails directly from it, so it can double as a nicer email client for those who are more web-centric and get annoyed because gmail and the like aren’t good for the purpose.

More views such as a per-submitter view and a search view will appear as well.
