Shell scripting, at its most basic, means taking a series of commands that you might type at a command line and putting them into a file, so that you can reproduce them later or run them repeatedly without retyping them.
You can use scripts to automate repeated tasks and to handle complex tasks that are difficult to get right without repeated tries, reworking some of the code, or both. Such scripts include:
The shell that you'll likely use for scripting under Neutrino is ksh, a public-domain implementation of the Korn shell. The sh command is usually a symbolic link to ksh. For more information about this shell, see:
Neutrino also supplies or uses some other scripting environments:
In general, a shell script is most useful and powerful for running programs and for manipulating files within the filesystem, whereas sed, gawk, and perl are primarily for working with the contents of files. For more information, see:
You can execute a shell script in these ways:
sh myscript
. myscript
chmod 744 myscript
./myscript
In this instance, your shell automatically invokes a new shell to execute the shell script.
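The practical difference between running a script in a new shell and running it as a dot file is where its variable assignments end up. The following is a minimal sketch, assuming a POSIX-style sh and the widely available (but not universal) mktemp utility; the file name set_greeting and the variable names are made up for this illustration:

```shell
#!/bin/sh
# set_greeting is a throwaway script created only for this demonstration.
demo_dir=$(mktemp -d)
cat > "$demo_dir/set_greeting" <<'EOF'
greeting="set by script"
EOF

greeting="original"

# Run in a new shell: the assignment happens in the child shell and is lost.
sh "$demo_dir/set_greeting"
after_sh=$greeting

# Run as a dot file: the assignment happens in the current shell.
. "$demo_dir/set_greeting"
after_dot=$greeting

echo "after sh:  $after_sh"
echo "after dot: $after_dot"

rm -rf "$demo_dir"
```

The first echo reports original and the second reports set by script, because only the dot file runs in the shell that does the echoing.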
The first line of many — if not most — shell scripts is in this form:
#! interpreter [arg]
For example, a Korn shell script likely starts with:
#! /bin/sh
The line starts with a #, which indicates a comment, so the line is ignored by the shell processing this script. The initial two characters, #!, aren't important to the shell, but the loader code in procnto recognizes them as an instruction to load the specified interpreter and pass it:
For example, if your script is called my_script, and you invoke it as:
./my_script my_arg1 my_arg2 ...
then procnto loads:
interpreter [arg] ./my_script my_arg1 my_arg2 ...
Some interpreters adjust the list of arguments:
For example, let's look at some simple scripts that echo their own arguments.
Suppose we have a script called ksh_script that looks like this:
#! /bin/sh
echo $0
for arg in "$@" ; do
    echo $arg
done
If you invoke it as ./ksh_script one two three, the loader invokes it as /bin/sh ./ksh_script one two three, and then ksh removes itself from the argument list. The output looks like this:
./ksh_script
one
two
three
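The quotes around "$@" in the loop matter when an argument contains whitespace. This is a small sketch, assuming a POSIX-style sh with the default field separators; count_args is a hypothetical helper, and set -- is used here only to simulate command-line arguments:

```shell
#!/bin/sh
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() {
    echo $#
}

# Simulate a command line with two arguments, the first containing a space.
set -- "one two" three

quoted=$(count_args "$@")    # each original argument stays intact
unquoted=$(count_args $@)    # arguments are split again on whitespace

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
```

The quoted form sees 2 arguments and the unquoted form sees 3, which is why scripts that pass their arguments along usually write "$@".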
Next, let's consider the gawk version, gawk_script, which looks like this:
#!/usr/bin/gawk -f
BEGIN {
    for (i = 0; i < ARGC; i++)
        print ARGV[i]
}
The -f argument is important; it tells gawk to read its script from the given file. Without -f, this script wouldn't work as expected.
If you run this script as ./gawk_script one two three, the loader invokes it as /usr/bin/gawk -f ./gawk_script one two three, and then gawk changes its full path to gawk. The output looks like this:
gawk
one
two
three
The perl version of the script, perl_script, looks like this:
#! /usr/bin/perl
for ($i = 0; $i <= $#ARGV; $i++) {
    print "$ARGV[$i]\n";
}
If you invoke it as ./perl_script one two three, the loader invokes it as /usr/bin/perl ./perl_script one two three, and then perl removes itself and the name of the script from the argument list. The output looks like this:
one
two
three
As a quick tutorial in the Korn shell, let's look at a script that searches C source and header files in the current directory tree for a string passed on the command line:
#!/bin/sh
#
# tfind:
# script to look for strings in various files and dump to less

case $# in
1)
    find . -name '*.[ch]' | xargs grep $1 | less
    exit 0 # good status
esac

echo "Use tfind stuff_to_find "
echo " where : stuff_to_find = search string "
echo " "
echo "e.g. tfind console_state looks through all files in "
echo " the current directory and below and displays all "
echo " instances of console_state."
exit 1 # bad status
As described above, the first line identifies the program, /bin/sh, to run to interpret the script. The next few lines are comments that describe what the script does. Then we see:
case $# in
1)
    ...
esac
The case ... in construct is built into the shell. It's one of the branching structures provided by the Korn shell, and is similar to the C switch statement.
The $# is a shell variable. When you refer to a variable in a shell, put a $ before its name to tell the shell that it's a variable rather than a literal string. The shell variable, $#, is a special variable that represents the number of command-line arguments to the script.
The 1) is a possible value for the case, the equivalent of a case label in a C switch statement. This code checks to see if you've passed exactly one parameter to the script.
The esac line completes and ends the case statement. Both the if and case commands use the command's name reversed (fi and esac) to represent the end of the branching structure.
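To see how case, $#, and esac fit together with more than one branch, here is a short sketch, assuming a POSIX-style sh. The function describe_args is hypothetical; it stands in for a script so that the example can call it several times with different argument counts:

```shell
#!/bin/sh
# describe_args is a hypothetical function used only for illustration.
describe_args() {
    case $# in
    0)
        echo "no arguments"
        ;;
    1)
        echo "one argument: $1"
        ;;
    *)
        # The * pattern matches anything, like default in a C switch.
        echo "$# arguments"
        ;;
    esac
}

r0=$(describe_args)
r1=$(describe_args console_state)
r3=$(describe_args a b c)

echo "$r0"
echo "$r1"
echo "$r3"
```

Each branch ends with ;; in the same way that a C case usually ends with break.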
Inside the case we find:
find . -name '*.[ch]' | xargs grep $1 | less
This line does the bulk of the work, and breaks down into these pieces:
find . -name '*.[ch]'
xargs grep $1
less
which are joined by the | or pipe character. A pipe is one of the most powerful things in the shell; it takes the output of the program on the left, and makes it the input of the program to its right. The pipe lets you build complex operations from simpler building blocks. For more information, see “Redirecting input and output” in Using the Command Line.
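As a small illustration of building an operation from simpler pieces, the following sketch counts the distinct words in a made-up list. It assumes the standard sort, uniq, wc, and tr utilities are available:

```shell
#!/bin/sh
# printf emits one word per line; sort groups duplicates together;
# uniq drops adjacent duplicates; wc -l counts the lines that remain.
# tr removes any padding that some versions of wc add around the number.
distinct=$(printf '%s\n' pear apple pear fig apple | sort | uniq | wc -l | tr -d ' ')

echo "$distinct"
```

None of these utilities knows anything about the others; the pipes alone turn them into a "count distinct words" command, which here prints 3.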
The first piece, find . -name '*.[ch]', uses another powerful and commonly used command. Most filesystems are organized as a hierarchy of directories, and find is a utility that descends through that hierarchy recursively. In this case, it searches for files that end in either .c or .h (that is, C source or header files) and prints out their names.
The filename wildcards are wrapped in single quotes (') because they're special characters to the shell. Without the quotes, the shell would expand the wildcards in the current directory, but we want find to evaluate them, so we prevent the shell from evaluating them by quoting them. For more information, see “Quoting special characters” in Using the Command Line.
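The effect of the quotes can be seen in a scratch directory. This sketch assumes a POSIX-style sh and the mktemp utility; the file names are invented for the demonstration:

```shell
#!/bin/sh
# Build a small tree: one source file at the top, two more in a subdirectory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/main.c" "$demo_dir/sub/util.c" "$demo_dir/sub/util.h"

# Quoted: find receives the pattern itself and applies it at every level.
quoted_count=$(cd "$demo_dir" && find . -name '*.[ch]' | wc -l | tr -d ' ')

# Unquoted: the shell expands the pattern against the current directory
# first, so find is actually run as: find . -name main.c
unquoted_count=$(cd "$demo_dir" && find . -name *.[ch] | wc -l | tr -d ' ')

echo "quoted:   $quoted_count"
echo "unquoted: $unquoted_count"

rm -rf "$demo_dir"
```

The quoted form finds all 3 files, while the unquoted form finds only 1. If the current directory held several matching files, the unquoted form would typically make find report a usage error instead.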
The next piece, xargs grep $1, does a couple of things:
find . -name '*.[ch]' -exec grep $1 {} \; | less
which loads and runs the grep program for every file found. The command that we actually used:
find . -name '*.[ch]' | xargs grep $1 | less
runs grep only when xargs has accumulated enough files to fill a command line, generally resulting in far fewer invocations of grep and a more efficient script.
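The difference in the number of invocations can be observed directly. The sketch below assumes a POSIX-style sh and the mktemp utility; log_call is a hypothetical stand-in for grep that appends one line to a log file each time it runs, and LOG is a variable invented for this example:

```shell
#!/bin/sh
demo_dir=$(mktemp -d)
touch "$demo_dir/a.c" "$demo_dir/b.c" "$demo_dir/c.h"

# log_call: hypothetical stand-in for grep; each run adds one line to $LOG.
cat > "$demo_dir/log_call" <<'EOF'
echo called >> "$LOG"
EOF

# With -exec ... \; the command runs once for every file found.
LOG="$demo_dir/exec.log"
export LOG
find "$demo_dir" -name '*.[ch]' -exec sh "$demo_dir/log_call" {} \;
exec_runs=$(wc -l < "$LOG" | tr -d ' ')

# With xargs, the file names are batched onto as few command lines as possible.
LOG="$demo_dir/xargs.log"
find "$demo_dir" -name '*.[ch]' | xargs sh "$demo_dir/log_call"
xargs_runs=$(wc -l < "$LOG" | tr -d ' ')

echo "-exec runs: $exec_runs"
echo "xargs runs: $xargs_runs"

rm -rf "$demo_dir"
```

With three files, the -exec form runs the command 3 times and the xargs form runs it once. Note that a plain pipe into xargs splits its input on whitespace, so file names that contain spaces need extra care.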
The final piece, less, is an output pager. The entire command may generate a lot of output that might scroll off the terminal, so less presents this to you a page at a time, with the ability to move backwards and forwards through the data.
The case statement also includes the following after the find command:
exit 0 # good status
This returns a value of 0 from this script. In shell programming, zero means true or success, and anything nonzero means false or failure. (This is the opposite of the meanings in the C language.)
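The shell stores the exit status of the most recent command in the special variable $?, and the if command tests that status directly. A brief sketch, assuming a POSIX-style sh and the standard true, false, and grep utilities:

```shell
#!/bin/sh
# && runs its right side only after success; || only after failure.
true  && status_true=$?
false || status_false=$?
echo "true returned $status_true, false returned $status_false"

# if tests the exit status of the command that follows it:
# grep returns 0 when it finds a match and nonzero when it doesn't.
if printf 'console_state\n' | grep console_state > /dev/null; then
    verdict="found"
else
    verdict="not found"
fi
echo "$verdict"
```

This prints true returned 0, false returned 1, followed by found.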
The final block:
echo "Use tfind stuff_to_find "
echo " where : stuff_to_find = search string "
echo " "
echo "e.g. tfind console_state looks through all files in "
echo " the current directory and below and displays all "
echo " instances of console_state."
exit 1 # bad status
is just a bit of help; if you pass incorrect arguments to the script, it prints a description of how to use it, and then returns a failure code.
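One possible refinement, offered as a sketch and not as part of the original tfind, is to quote the search string so that a multiword string reaches grep as a single argument, and to send the help text to standard error. The name tfind_quoted is hypothetical. The example is written as a function and omits the less stage so that it can be tried noninteractively, and it assumes that grep accepts -- to mark the end of its options:

```shell
#!/bin/sh
# tfind_quoted: a hypothetical variant of tfind, written as a function
# so that it can be tried without creating a script file.
tfind_quoted() {
    if [ $# -ne 1 ]; then
        # Send the help text to standard error.
        echo "Use tfind stuff_to_find" >&2
        return 1    # bad status
    fi
    # "$1" keeps a multiword search string together as one argument;
    # -- stops grep from treating a string that starts with - as an option.
    find . -name '*.[ch]' | xargs grep -- "$1"
}

# Try it on a throwaway source file.
demo_dir=$(mktemp -d)
echo 'int console state;' > "$demo_dir/demo.c"
matches=$(cd "$demo_dir" && tfind_quoted "console state")
echo "$matches"
rm -rf "$demo_dir"
```

With the unquoted $1 of the original, the same search would pass console as the pattern and state as a file name.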
In general, a script isn't as efficient as a custom-written C or C++ program, because it:
However, developing a script can take less time than writing a program, especially if you use pipes and existing utilities as building blocks in your script.
Here are some things to keep in mind when writing scripts:
chmod a+x script_name
Your script doesn't have to be executable if you plan to invoke it by passing it as a shell argument:
ksh script_name
or if you use it as a “dot file,” like this:
. script_name
~/bin/my_script