June 3, 2019
At AppNexus, we use Bash to do some important things, and whenever you have a system that does important things, it’s critical to get logging right so that you can find out what happened when something goes wrong.
The strategy we use to log from Bash is outlined in this blog post. To summarize, we redirect STDOUT
and STDERR
in our Bash scripts to write into [logger](<http://man7.org/linux/man-pages/man1/logger.1.html>)
, with various options enabled. logger
then sends our messages to syslog (as well as the terminal), and with some syslog configuration we make sure that the messages are routed into a centralized file.
The code to achieve this looks like this:
#!/bin/bash
exec 1> >(/usr/bin/logger -s -t "$(basename $0)")
echo "Hi, I'm using syslog!"
Output:
test.sh: Hi, I'm using syslog!
This is all well and good, and we’ve written many Bash scripts that use this approach to great success. However, there is an edge case to consider when doing this that we encountered recently and that I want to point out because it 1) can result in serious bugs, and 2) provides an opportunity to learn more about how Bash handles redirection and signals!
It is a common pattern in our Bash scripts to trigger some cleanup work if the script exits for some reason (be it because an error was encountered, the script was forcibly killed, or we simply got to the end). To implement this, we use [trap](<http://man7.org/linux/man-pages/man1/trap.1p.html>)
to set up a signal handler for whichever signals we consider important to intercept.
In this example, we set up a handler for the INT
signal, so that if we are killed during execution, we execute the cleanup
function:
#!/bin/bash
exec 1> >(/usr/bin/logger -s -t "$(basename $0)")
cleanup() {
# Do something really important
echo "Done with cleanup!"
exit 1
}
trap cleanup INT
echo "Hi, I'm using syslog!"
while true; do
# Wait indefinitely by running sleep in the background. Reason to run in
# the background is because when Bash is executing an external command in
# the foreground, it does not handle any signals received until the
# foreground process terminates.
# See <http://mywiki.wooledge.org/SignalTrap#When_is_the_signal_handled.3F>
# for more details.
sleep 10 &
wait $!
done
When we run the script, we should see:
$ ./test.sh
test.sh: Hi, I'm using syslog!
Now, in another shell, let’s find our process using ps
:
$ ps afx
...
28242 pts/18 Ss 0:00 \\_ -bash
31316 pts/18 S+ 0:00 \\_ /bin/bash ./test.sh
31317 pts/18 S+ 0:00 \\_ /bin/bash ./test.sh
31321 pts/18 S+ 0:00 | \\_ /usr/bin/logger -s -t test.sh
31369 pts/18 S+ 0:00 \\_ sleep 10
There are a couple of PIDs here, and it’s worth going into detail about what each one of them are:
28242
is the Bash shell from which we ran test.sh
(our program).31316
is test.sh
itself.31369
is the sleep
command that we run in the background that allows us to wait indefinitely for a SIGINT
.