requoting in bash

While working with Uberlord on the Gentoo netscripts, I had a chance to review our requoting function. Here it is:

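function requote {
        # The replacement lives in a variable because bash 4.3 and later
        # perform quote removal on a literal \' in the replacement string.
        local q=\' r="'\\''"            # r holds the four characters '\''
        set -- "${@//$q/$r}"            # quote inner instances of '
        set -- "${@/#/$q}"              # add ' to start of each param
        set -- "${@/%/$q}"              # add ' to end of each param
        echo "$*"
}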

The purpose of this function is to make its arguments safe for re-evaluation by the shell. You need this whenever a single variable must later evaluate to multiple arguments without distorting their original content. When the variable is passed to an external program, you can’t use bash arrays at all, so requoting is the only option. Here’s a simple example:

connect=$(requote chat -v '' ATZ OK ATDT318714 CONNECT '' ogin: ppp word: '<pa$$w0rd!>')
pppd connect "$connect"

In this case it’s important that quoting is preserved so that the special characters in the password, including the angle brackets that could be misinterpreted as I/O redirection, are passed to the chat program safely.
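To see why simply joining the arguments is not enough, here is a minimal sketch. The argument list is made up for illustration, and eval stands in for whatever shell eventually parses the string.

```shell
# Hypothetical argument list with an empty argument in the middle.
set -- chat -v '' ATZ
original_count=$#        # four arguments, one of them empty

flat="$*"                # naive join: the empty argument leaves no trace
eval "set -- $flat"      # re-parse the string the way a shell would
naive_count=$#           # the empty argument is gone

echo "before: $original_count, after naive join: $naive_count"
# prints: before: 4, after naive join: 3
```

An argument containing spaces or redirection characters would fare no better, which is why each argument needs its own layer of quoting before the join.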

Some time after writing this function, I learned about bash printf’s %q, which “means to quote the argument in a way that can be reused as shell input” (from bash built-in help). It turns out it isn’t very easy to update our requote function to use it because, right or wrong, it drops empty arguments…

$ printf "%q " one '' '$<$'; echo
one  \$\<\$ 

This is the best I could come up with for now, which unfortunately needs a bash loop. If somebody comes up with a better implementation using printf %q, I’d be interested in knowing it!

function requote {
    declare arg
    for arg; do
        arg=$(printf '%q' "$arg")       # empty if %q dropped an empty argument
        printf '%s ' "${arg:-''}"       # substitute '' so the argument survives
    done
}
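A quick way to convince yourself that the loop version preserves its input is a round trip through eval. This is only a sketch: the function is repeated so the snippet runs on its own, the sample arguments are made up, and the variable names (quoted, count, second, third, fourth) are mine.

```shell
#!/usr/bin/env bash
# Same loop-based requote as above, repeated so this snippet is self-contained.
requote() {
    local arg
    for arg; do
        arg=$(printf '%q' "$arg")
        printf '%s ' "${arg:-''}"
    done
}

# Sample arguments: an empty string, shell metacharacters, an embedded quote.
quoted=$(requote one '' '<pa$$w0rd!>' "it's")

# Let the shell parse the quoted string back into positional parameters.
eval "set -- $quoted"
count=$#
second=$2
third=$3
fourth=$4

printf 'count=%d\n' "$count"
printf '[%s]\n' "$@"
```

If the quoting is correct, the four original arguments come back unchanged, including the empty one.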

bash default/alternate values

Many people are familiar with the bourne shell constructs for default or alternate values, but there’s a detail that I suspect most people don’t know: the semantic difference between the forms with and without a colon. Here is the summary:

${var-word}    # insert $var, or default value if var is unset
${var+word}    # insert alternate value if var is set
${var:-word}   # insert $var, or default value if var is unset or null
${var:+word}   # insert alternate value if var is set and non-null

Do you see the difference between these? The first two are classic bourne shell syntax. The second two, with the colon, are not specific to bash: they are specified by POSIX and are present in bash and most bourne shell derivatives. The semantic difference is that the first two only check whether the variable is set, while the second two additionally check that it has non-zero length.
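The difference is easiest to see by running all four forms against a variable in each of its three possible states. This is a small sketch; the names var, unset_case, null_case and set_case are only for illustration.

```shell
# Each line shows, in order: ${var-word} ${var+word} ${var:-word} ${var:+word}

unset var                # state 1: unset
unset_case="[${var-word}] [${var+word}] [${var:-word}] [${var:+word}]"

var=                     # state 2: set, but null
null_case="[${var-word}] [${var+word}] [${var:-word}] [${var:+word}]"

var=value                # state 3: set and non-null
set_case="[${var-word}] [${var+word}] [${var:-word}] [${var:+word}]"

printf 'unset: %s\n' "$unset_case"   # unset: [word] [] [word] []
printf 'null:  %s\n' "$null_case"    # null:  [] [word] [word] []
printf 'set:   %s\n' "$set_case"     # set:   [value] [word] [value] [word]
```

Only the middle row differs between the two families: a variable that is set to a null string counts as present for the forms without a colon and as missing for the forms with one.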

Here is an example of when the older bourne shell syntax is useful. Imagine you have a configuration file in shell syntax, so you’re going to source it into the current shell to pick up the variable settings. Something like this, which happens to work because Gentoo’s /etc/make.conf is valid shell syntax:

source /etc/make.conf

You might like to know if GENTOO_MIRRORS is set by the environment or by the configuration file, even if it was set to a null string. If it is unset after consulting both sources, we’ll call that an error. Here’s one way you could do it:

if [[ -z ${GENTOO_MIRRORS+set} ]]; then
    source /etc/make.conf
fi
if [[ -z ${GENTOO_MIRRORS+set} ]]; then
    echo "Error: GENTOO_MIRRORS not found in environment or /etc/make.conf" >&2
    exit 1
fi
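To try this without touching a real system, here is a sketch that uses a temporary file as a stand-in for /etc/make.conf. The file contents, and the conf, status and colon_form variables, are assumptions made for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for /etc/make.conf that sets the variable to a null string.
conf=$(mktemp)
printf 'GENTOO_MIRRORS=""\n' > "$conf"

unset GENTOO_MIRRORS                      # pretend the environment does not set it
if [[ -z ${GENTOO_MIRRORS+set} ]]; then
    source "$conf"                        # fall back to the configuration file
fi

# Classify the result using the form without a colon.
if [[ -z ${GENTOO_MIRRORS+set} ]]; then
    status="unset"
elif [[ -z $GENTOO_MIRRORS ]]; then
    status="set but null"
else
    status="set to $GENTOO_MIRRORS"
fi

# For comparison: the colon form cannot tell "set but null" from "unset".
colon_form=${GENTOO_MIRRORS:+set}

rm -f "$conf"
echo "GENTOO_MIRRORS is $status"
# prints: GENTOO_MIRRORS is set but null
```

Had the check used ${GENTOO_MIRRORS:+set}, the null setting from the configuration file would have been reported as an error.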