Bourne shell idioms
Here are some portable Bourne shell idioms
that I find useful to remember for scripting.
The Bourne shell does much more than most
users realize, and the ksh and bash
extensions are rarely essential. (On the
command line, bash and ksh are vastly more
useful.)
My favorite reference book is "Portable Shell
Programming --- An Extensive Collection of
Bourne Shell Examples" by Bruce Blinn from
Prentice Hall.
Get information on a built-in bash command
with help. It's much easier than reading
the full bash man page at
http://www.gnu.org/software/bash/manual/bash.html
For Bash suggestions, I recommend this bash FAQ:
http://mywiki.wooledge.org/BashFAQ/
§ Text filtering commands
Administration from scripts and the
command line often benefits from pipelines of text
filtering commands. Here are some that are
easy to overlook or forget.
- mmencode converts to and from
base64 and "quoted-printable" formats for email.
Search for the metamail package.
Unfortunately, this has become hard to find.
Alternatively, uuencode -m converts to base64,
and uudecode -m converts from base64.
Or decode and encode quoted-printable and
base64 with
perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::decode($_);'
perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::encode($_);'
perl -pe 'use MIME::Base64; $_=MIME::Base64::encode($_);'
perl -pe 'use MIME::Base64; $_=MIME::Base64::decode($_);'
|
URL-encode a string with
perl -ne 'chomp; s/([^-_.~A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg; print "$_\n"'
|
- Convert non-ASCII UTF-8 characters to numeric character references for HTML, and back:
perl -C -pe 's/([^\x00-\x7f])/sprintf("&#%d;", ord($1))/ge;'
perl -C -pe 's/&\#(\d+);/chr($1)/ge;s/&\#x([a-fA-F\d]+);/chr(hex($1))/ge;'
|
- uniq lets you remove duplicated lines
from a sorted file.
- Count the number of times a given line
occurs with
sort | uniq -c | sort -n
sort | uniq -c | sort -k1,1nr -k2
|
- Break text into one word per line.
- Combine separate lines into a single line
of words.
- Add up numbers that arrive one per line.
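The three chores above could be handled with standard tools roughly as follows. This is only a sketch, not necessarily the commands the author had in mind; the function names are hypothetical labels for each pipeline.

```shell
# Hypothetical helper names; each one reads standard input.

# Break whitespace-separated words onto one line each.
# tr maps blanks, tabs, and newlines to newlines; -s squeezes repeats.
one_word_per_line() {
    tr -s ' \t\n' '\n'
}

# Join all input lines into a single space-separated line.
join_lines() {
    paste -s -d ' ' -
}

# Sum numbers that arrive one per line.
sum_lines() {
    awk '{ s += $1 } END { print s }'
}

echo "1 2   3.1" | one_word_per_line | sum_lines   # prints 6.1
```

Because each helper is a plain filter, they can be chained in any pipeline.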
- comm compares two sorted files and lets you
suppress lines unique to either file, or
common to both.
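As a small illustration of comm, the sketch below wraps the three most common column selections in functions. The function names and the sample file contents are hypothetical; both inputs must already be sorted.

```shell
# Hypothetical helpers; both arguments must be sorted files.
common_lines()   { comm -12 "$1" "$2" ; }   # lines present in both files
only_in_first()  { comm -23 "$1" "$2" ; }   # lines unique to the first file
only_in_second() { comm -13 "$1" "$2" ; }   # lines unique to the second file

# Demonstration with two small temporary files.
demo=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$demo/a"
printf 'banana\ncherry\ndate\n'  > "$demo/b"
common_lines   "$demo/a" "$demo/b"   # banana, cherry
only_in_first  "$demo/a" "$demo/b"   # apple
only_in_second "$demo/a" "$demo/b"   # date
rm -rf "$demo"
```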
- cat -s never prints more than one blank
line in a row.
- Remove all blank lines.
- Print lines starting with one containing
FOO and ending with one containing BAR.
- Print lines other than those starting with one containing
FOO and ending with one containing BAR.
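One way to do the three filters above is with sed. This is a sketch under the assumption that "blank" includes lines holding only spaces or tabs; the function names are hypothetical.

```shell
# Hypothetical helper names; each one filters standard input.

# Remove empty lines and lines containing only whitespace.
strip_blank_lines() {
    sed '/^[[:space:]]*$/d'
}

# Print from a line containing FOO through the next line containing BAR.
print_foo_to_bar() {
    sed -n '/FOO/,/BAR/p'
}

# Print everything except those ranges.
delete_foo_to_bar() {
    sed '/FOO/,/BAR/d'
}

printf 'a\nFOO\nb\nBAR\nc\n' | print_foo_to_bar    # FOO, b, BAR
printf 'a\nFOO\nb\nBAR\nc\n' | delete_foo_to_bar   # a, c
```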
- diff3 -m merges changes in files edited from a common ancestor.
- fold breaks lines to a given width, and
fmt will reformat lines into paragraphs.
- dirname and basename let you
extract the directory and filename from a
full path to a file.
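A quick illustration of dirname and basename. The path is hypothetical and does not need to exist, because both commands work purely on the string.

```shell
path=/usr/local/lib/libfoo.so.1   # hypothetical path

dirname "$path"        # /usr/local/lib
basename "$path"       # libfoo.so.1
basename "$path" .1    # libfoo.so (an optional trailing suffix is removed)
```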
- namei breaks a pathname into pieces and
follows symbolic links.
- expand and col -x replace tabs by
spaces.
- col -b removes backspaces from a file.
- cat -v shows non-printing characters as
ASCII escapes.
- sed '1,10d' deletes the first 10
lines.
- sed -n '3p' and sed -n '3{p;q}'
both print the third line, but the latter is
more efficient because it quits without reading the rest of the input.
- sed '/foo/q' truncates a file after
the line containing foo.
- sed -ne '/foo/,/bar/p' prints
everything from the line containing foo
to the line containing bar.
- Align space-delimited fields into orderly
columns with column -t.
- Right justify queries with
printf "%40s" "Do you want to delete? [y/N] "
|
- Convert dos text files to unix, and vice
versa:
dos2unix file.txt
unix2dos file.txt
tr -d \\r < win.txt > unix.txt # if you can't find dos2unix
sed -e 's/$/\r/' < unix.txt > win.txt # if you can't find unix2dos
|
- cat -n and nl number lines.
- Both of these perform string substitution,
but the latter allows more general regular
expressions:
sed -e 's/oldtext/newtext/g'
perl -pe 's/oldtext/newtext/g'
|
Here's how to replace straight double quotes with
TeX-style opening and closing quotes:
< in.tex perl -pe 's%\B"\b%``%g' |
perl -pe "s%\b\"\B%''%g" > out.tex
|
- Use iconv to convert between character
encodings.
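For instance, converting Latin-1 text to UTF-8 might look like the sketch below. The function name is hypothetical, and the encoding names assume an iconv that knows ISO-8859-1 and UTF-8, as the common implementations do.

```shell
# Hypothetical helper: convert Latin-1 (ISO-8859-1) input to UTF-8.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# \351 is the single Latin-1 byte for e-acute; the output holds the
# two-byte UTF-8 sequence \303\251 instead.
printf 'caf\351\n' | latin1_to_utf8
```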
- Here are two ways to find string patterns (regular expressions)
in a file:
grep 'pattern' filename [file] [< file]
perl -ne 'print if /pattern/' [file] [< file]
|
- Print the first and third columns of each
line:
awk '{print $1,$3}'
perl -lane 'print "$F[0] $F[2]"'
while read a b c d ; do echo "$a $c" ; done
|
- Convert to lower-case:
tr '[A-Z]' '[a-z]'
tr '[:upper:]' '[:lower:]'
perl -pe 'tr/A-Z/a-z/'
perl -pe '$_ = lc'
|
- Simple character substitutions and
deletions may be simplest with tr.
tr -d '\r' # delete carriage returns
tr '\n' '\0' # replace newlines by null characters.
|
$ echo 1-2a-3b | tr "[1-9]" "[2-9]" | tr '-' '_' | tr -d 'a'
2_3_4b
|
- You can pipe into a loop with read -r.
Here is a complicated way to cat a text file,
piping in and out of a loop.
cat file | while read -r a; do echo "$a" ; done | cat
|
- To read lines in pairs from two files try
paste file1 file2 | while read -r a b ; do echo "$a $b" ; done
|
- Divide words one per line, then sum them as numbers:
$ echo 1 2 3.1 |
perl -pe 's/\s+/\n/g' |
perl -e '$s=0; while (<>) {$s += $_;} ; print "$s\n";'
6.1
|
- Reverse the order of lines with tac, and reverse the
characters within each line with rev.
- Sort a list of dependencies with tsort.
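A small tsort illustration: each input pair "X Y" says that X must come before Y. The build steps named here are hypothetical. Where several orders are valid, tsort picks one of them, so only the first and last positions are fixed in this example.

```shell
# Hypothetical dependency pairs: the left item must precede the right one.
order=$(printf '%s\n' \
    'configure compile' \
    'compile link' \
    'compile test' \
    'link install' \
    'test install' | tsort)

# configure comes first and install last; link and test may appear
# in either order between compile and install.
echo "$order"
```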
- Shuffle lines randomly with shuf.
Generate shuffled integers with
$ shuf -i1-100 -n3
93
57
71
|
- Generate random lottery numbers between 1 and 292201338:
$ echo "($RANDOM + 32768*($RANDOM + 32768*$RANDOM)) % 292201338 + 1" | bc
130237776
|
§ Files and directories
- Select text (non-binary) files with
one of these
\ls | perl -lne 'print if -T'
perl -le 'for (glob "*") {print if -T }'
perl -le 'print for grep -T, <*>'
|
The perl algorithm for detecting text files
is very good.
- To do something to files with goofy names,
including spaces and dashes, delimit the
files with null characters instead of
whitespace or newlines.
find . -type f -print0 | xargs -r0 ls
|
Or read one line at a time:
cd "$dir1" && find . -type f |
while read -r f ; do
if [ ! -f "$dir2/$f" ] ; then
echo "$f is in $dir1 but not in $dir2"
fi
done
|
- See if a directory contains any files, including broken links.
has_files() {
set -- "$1"/.[!.]* "$1"/*; test -e "$1" || test -e "$2" || test -L "$1" || test -L "$2";
}
if has_files "${dir}" ; then echo "${dir} has files" ; else echo "${dir} is empty" ; fi
|
See if files of a certain type exist:
if [ "`printf '%s' *.par`" != '*.par' ] ; then echo "has par files" ; fi
[or]
test "`printf '%s' *.par`" != '*.par' && echo "has pars" || echo "no pars"
|
- readlink -f will fully resolve what a
symbolic link points to.
Find all bad symbolic links with GNU readlink -e,
which fails unless the final target exists:
find . -type l |
while read -r f ; do if ! readlink -e "$f" >/dev/null 2>&1
then echo "$f" ; fi ; done
|
- To see the canonical path for the current directory, you can use either of these:
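Two likely candidates are pwd -P and the external /bin/pwd, which on GNU systems also prints the physical path by default. The sketch below contrasts the logical and physical paths in a throwaway directory; the directory layout and variable names are hypothetical.

```shell
# Hypothetical demonstration directory containing a symbolic link.
demo=$(mktemp -d)
mkdir "$demo/real"
ln -s real "$demo/link"

cd "$demo/link"
logical=$(pwd -L)     # the path as entered, still ending in /link
physical=$(pwd -P)    # symbolic links resolved, ending in /real
echo "$logical"
echo "$physical"

cd /
rm -rf "$demo"
```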
§ Variables
- To see if a variable matches a regular
expression, combine if and grep. For
example, to see if the name of a file begins
with a dot, try
if echo "$filename" | grep '^[.]' >/dev/null
then echo yes ; else echo no ; fi
|
expr also has limited support for
regular expressions.
if [ `expr "$filename" : '[.].*'` -ne 0 ]
then echo yes ; else echo no ; fi
|
- Use read -r to avoid tokenizing filenames
with spaces. Here's how to find all files
whose names contain a space, and print the mv
commands that would replace the spaces by
underscores.
find . -name '* *' |
while read -r f ; do
echo mv "$f" "`echo "$f" | sed 's/  */_/g'`"
done
|
- For simple integer arithmetic use expr.
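A few expr calculations as a sketch; the variable names are arbitrary. Each operator and operand must be a separate argument, and the parentheses and the asterisk have to be escaped from the shell.

```shell
i=5
i=`expr $i + 1`              # 6
j=`expr \( $i + 2 \) \* 3`   # 24
k=`expr $j / 5`              # 4, integer division truncates
r=`expr $j % 5`              # 4, the remainder
echo "$i $j $k $r"           # prints: 6 24 4 4
```

Note that expr exits with status 1 when its result is 0, which matters in scripts that run under set -e.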
- For arbitrary-precision floating-point
math, use bc -l:
# Get pi to 10 places with arctangent (bc man page)
PI=`echo "scale=10; 4*a(1)" | bc -l`
# Expensive calculation of zero (Craig Artley):
ZERO=`echo "c($PI/4)-sqrt(2)/2" | bc -l`
|
- seq 1 100 generates all integers
between 1 and 100. To iterate a loop 100
times, try
for i in `seq 1 100` ; do ... ; done
|
- You can set the environment of a subprocess
by defining a variable on the same line. The
current shell is not affected.
$ x=doggie sh -c 'echo x=$x'
x=doggie
$ x=pig ; x=doggie echo x=$x
x=pig
|
- Test that a string has non-zero length with
if [ -n "$string" ] ; then echo "not empty" ; fi
|
The -n is actually the default for a
string expression, so you can omit it:
if [ "$string" ] ; then echo "not empty" ; fi
|
- There are several good ways to set default
values for environment variables. Many do
this
if [ ! "$VARIABLE" ] ; then VARIABLE="default value" ; fi
export VARIABLE
|
A simple alternative is
: ${VARIABLE:="default value"}
export VARIABLE
|
The colon at the beginning of the line is
necessary as a no-op that allows its
arguments to be evaluated.
- Rarely you may want to accept a variable
defined as an empty string. If so, then omit
the colon before the equals when setting the
default.
: ${VARIABLE="default value"}
export VARIABLE
|
To test whether a variable is defined, even if
empty, test
if [ "${VARIABLE+x}" ] ; then echo DEFINED ; fi
|
- To echo the names of all variables starting with X (bash only):
echo ${!X*}
- To check whether a series of variables are defined, try
for V in JAVA_HOME SSH_AGENT_PID TEXMFDIR NETHACKOPTIONS ; do
eval v="\$$V"
if [ ! "$v" ] ; then echo "You must define $V" ; fi
done
|
§ Running commands
- Use "$@" when passing command-line
arguments unaltered to subprocesses. This is
equivalent to passing "$1" "$2" ..., but
the first version works properly for no
arguments.
- Test the processing of arguments, like this
$ set a 'b c' d
$ for i in "$@" ; do echo "|$i|" ; done
|a|
|b c|
|d|
$ for i in "$*" ; do echo "|$i|" ; done
|a b c d|
$ for i in $* ; do echo "|$i|" ; done
|a|
|b|
|c|
|d|
|
- See what runtime options you may have set with these
set -o; bind -p; shopt -p; stty -a
|
For example, by default you edit a bash command
in emacs mode. Change to vi mode with
set -o vi
In emacs mode, you can edit your command in
your $EDITOR with ctrl-x ctrl-e.
In vi mode, use esc-v. See help fc
for more.
- Repeat the last argument of the previous
command with !$. Repeat all arguments
without the command with !*.
- To guarantee that a background process
outlives the current shell, add extra
parentheses like this:
( command & )
Otherwise, your current shell, by exiting X
or ssh, may terminate all processes that have
your shell as the parent process. The extra
parentheses start a subshell that exits as
soon as the command is spawned in the
background. The parent process ID of the
background process then becomes 1. This is a
command-line version of the "double fork."
- Repeat until a command succeeds:
while ! cvs -z 3 -q update -dPA ; do echo -n . ; sleep 60 ; done
|
- Make a progress bar (loop while waiting on
a process)
sleep 10 & while ps -p $! >/dev/null; do echo -n . ; sleep 1 ; done ; echo
or
while pidof mozilla-bin > /dev/null ; do echo -n . ; sleep 1 ; done ; echo
|
pgrep -f or killall -0 are alternatives to pidof
for this purpose.
§ Manipulating paths
- Loop over the elements of a PATH by
tokenizing with the character ':'.
IFS=':' ; for dir in $PATH ; do echo $dir ; done
|
- Check for the existence of an executable
version of a command in your PATH:
checkPath() { IFS=':' ; for dir in $PATH ; do if [ -x "$dir/$1" ] ;
then return 0; fi ; done; return 1; }
if checkPath commandName ; then ... ; fi
|
- Here is my preferred way to modify a PATH
# Arguments currentpath newelement [after]
# addtopath a:b c -> c:a:b
# addtopath a:b c after -> a:b:c
# addtopath a:b a -> a:b
addtopath () {
P=$1
E=$2
O=$3
if [ ! "$P" ] ; then
P="$E"
elif ! echo "$P" | grep -E "(^|:)$E($|:)" >/dev/null ; then
if [ "$O" = "after" ] ; then
P="$P:$E"
else
P="$E:$P"
fi
fi
echo "$P"
}
# example
PATH=`addtopath "$PATH" /usr/local/bin after`
|
§ Common script chores
- Debug the script with set -x.
- Make a script exit immediately after any failed
command with set -e.
- Process flags in a script:
while [ $# -gt 0 ] ; do
case $1 in
-a) FLAG_A=1
shift ;;
-b) FLAG_B="$2"
shift ; shift ;;
--) shift ; break ;;
*) break ;;
esac
done
|
- Print help from a script:
if [ $# -lt 1 -o "$1" = "-h" -o "$1" = "-help" -o "$1" = "--help" ] ; then
cat <<-END
Usage: `basename $0` [-flag] arg1 [arg2]
More information.
END
exit
fi
|
- Handle errors with functions:
Often an error exit is handled most cleanly
with a function.
print_usage_and_exit() {
cat <<-END
Usage: `basename $0` arg1 arg2 [arg3]
The first two arguments are required.
END
exit 1
}
if [ $# -lt 2 ] ; then
print_usage_and_exit
fi
|
- Here's a robust way to locate the directory
containing a script, following symbolic
links. (Taken from the launch script of
FindBugs.)
program="$0"
while [ -h "$program" ]; do
link=`ls -ld "$program"`
link=`expr "$link" : '.*-> \(.*\)'`
if [ "`expr "$link" : '/.*'`" = 0 ]; then
dir=`dirname "$program"`
program="$dir/$link"
else
program="$link"
fi
done
script_directory=`dirname "$program"`
script_directory=`cd "$script_directory" && /bin/pwd`
|
- Trapping signals to stop scripts:
Ever try to interrupt your script, then
discover that it killed only one command and
continued to the next? Force a complete exit
by adding the following line early in your
script.
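A trap along these lines would do it. This is a sketch of the idea rather than the author's exact line, and the message text is an assumption. Signals 1, 2, 3, and 15 are HUP, INT, QUIT, and TERM.

```shell
# On HUP, INT, QUIT, or TERM, report and exit the whole script
# instead of continuing with the next command.
trap 'echo "Interrupted" >&2 ; exit 1' 1 2 3 15
```

Single quotes around the trap action delay its evaluation until the signal actually arrives.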
You can also trap normal and error exits:
# force script to exit when any command fails
set -e
# Trap on any exit
trap "echo Always called before exit" 0
# Trap on error exit only (ERR is a bash and ksh extension)
trap "echo Error exit was called " ERR
echo "Next command will fail"
# Returns error code of 1
false
echo "Will not see this comment"
|
- Process IDs
Get the process ID of the current shell as
$$, of the parent shell with $PPID,
and of the most recently backgrounded
child process with $!.
Interactively, you can see child PIDs
with jobs -p.
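The sketch below shows these variables in use with a short-lived background job. The variable names child and status are arbitrary.

```shell
echo "this shell is $$, its parent is $PPID"

sleep 1 &
child=$!      # PID of the job just placed in the background

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$child" 2>/dev/null ; then
    echo "child $child is running"
fi

wait "$child"     # block until the child finishes
status=$?         # wait returns the child's exit status
echo "child exited with status $status"
```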
- Here's how to ask a yes or no question,
with a default of no. It checks whether the
first letter is a y or Y and ignores leading
spaces.
echo -n "Do you want to continue? [y/N]: "
read answer
if expr "$answer" : ' *[yY].*' > /dev/null; then
echo Continuing
else
echo Quitting
exit
fi
|
- Here's how to ask for a password without
echoing the characters. The trapping ensures
that an interrupt does not leave the echoing
off.
stty -echo
trap "stty echo ; echo 'Interrupted' ; exit 1" 1 2 3 15
echo -n "Enter password: "
read password
echo "Your password is \"$password\""
stty echo
|
Gnome and other frameworks often allow simple
scripting of GUIs:
password=`zenity --entry --text "Enter password:"`
|
§ File descriptors
- Redirecting output file descriptors
Here are common ways to capture the
standard output and standard error
of a single command in a log file:
command >file.log 2>&1
command 2>&1 | tee file.log
|
- If you have a script with many commands,
you can have them all write to the same log
file by default:
# save default standard output in file descriptor 10
exec 10>&1
# redirect standard output to a log file.
exec >file.log
# redirect standard error to same log file
exec 2>&1
# close stdin
exec 0<&-
# This command will write to log file
command
# echo to default standard output instead of log file
echo "Visible message" 1>&10
|
Avoid file descriptor 5, which bash already
uses. (ulimit -n should show many
available file descriptors.)
- Avoid writing to stdout if it is not connected to a terminal:
test -t 1 && echo "Connected to a terminal"
|
- Open a socket
In bash, associate a file descriptor, say 4, with a
socket, and close it with
exec 4<>/dev/tcp/$hostname/$port
exec 4<&-
|
A more portable solution is to use nc.
Listen on a port with nc -l; the way the port
is given differs between nc implementations.
Connect to a remote host port like
echo 'GET /' | nc hostname 80
|
An even more general utility is socat,
which also handles Unix sockets.
- Hostname lookups on Linux
General utilities are dig, nslookup, host, and hostname.
Get an IP address for a specific hostname:
host samplehostname | sed 's/.* //'
|
Get a hostname for an IP address:
nslookup 123.123.123.123 | grep 'name = ' | sed 's/.*name = //'
|