Shell Programming

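This section describes the fundamentals of bash shell programming: creating and running shell programs, using variables, quoting, the test command, conditional statements, iteration statements, and functions.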

Creating and Running Shell Programs

Shell programs, or scripts, are just text files that contain one or more shell commands. Shell scripts are similar to batch files in DOS, but are much more powerful. These scripts can be used to simplify repetitive tasks, to replace two or more commands that are always executed together with a single command, to automate the installation of other programs, to write simple interactive applications, and to perform various other useful tasks. The general Linux philosophy for providing functionality is to link several small, discrete commands together to accomplish more complicated tasks. Shell scripting and piping are the most common ways to do this.

Since bash is for most intents and purposes the standard shell for Linux, only bash shell programming will be discussed here. In bash, the pound symbol (#) at the start of a line or of a word begins a comment, and the shell ignores the rest of that line. Some advanced shell programming topics, such as built-in commands, shell expansions, job control commands, and arithmetic evaluation, are not covered here. More information on these topics can be found in the bash man page.

Example Script: remount

Assume you have a CD-ROM drive mounted on your Linux system. This CD-ROM device is usually mounted when the system is first started. If you later want to change the CD in the drive, you must unmount the drive, replace the CD, and then remount the drive. Instead of typing these commands each time you change the CD in your drive, you could create a shell program that would execute both of these commands for you. The following bash script named remount accomplishes this:

#!/bin/bash
# bash script to make changing a CD easier
umount /dev/cdrom
echo -n "Press the enter key to continue."
read
mount -t iso9660 /dev/cdrom /cdrom

The first line is a special comment, the only comment that is not simply ignored. The "#!" sequence must appear at the very start of the file; it is a convention that indicates 1) that this is a script and 2) which interpreter should run it. This line ensures that even if the user isn't running a bash shell, the script will be executed with bash. The second line is an ordinary comment. Comments are a good thing. The third line is a standard Linux command to unmount the CD-ROM drive. The fourth line displays a prompt. The fifth line is a built-in bash command that reads a line from the standard input. The last line mounts the CD-ROM drive.

Once the script exists, there are several ways to execute it. One way is to make the file executable by entering the command "chmod +x remount". This command changes the permissions of the file so that it is now executable. You can now run your new shell program by typing remount on the command line. The remount shell program must be in a directory that is in your search path, or the shell will not be able to find the program to execute. For security reasons the current directory is usually not in the search path. If you created the script in a directory that is not in your search path, you must prepend "./" to the script's name. This specifies a relative path to the script of "right here".

Another way you can execute the shell program is to run the shell that the program was written for and pass the program in as a parameter to the shell. This is done by entering the command "bash remount". This command starts up a new shell and tells it to execute the commands that are found in the remount file.

A third way of executing the commands in a shell program file is to use the . command. This command tells the current shell to execute all the commands in the file that is passed as an argument to the command, without starting a new shell. For example, the command ". remount" can be used to tell bash to execute the commands in the remount file.
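
The three ways of running a script can be compared with a small experiment. The following is only a sketch: the script name hello, the variable marker, and the temporary directory are hypothetical and exist just for this demonstration. It shows that a variable set by the script survives only when the script is run with the . command, because only then do the commands run in the current shell.

```shell
#!/bin/bash
# Illustrative only: "hello", "marker", and the temporary directory are made-up names.
marker=""
workdir=`mktemp -d`          # the back quotes capture the output of mktemp

# Create a tiny script that sets a variable and prints a message.
cat > "$workdir/hello" <<'EOF'
#!/bin/bash
marker="set by hello"
echo "hello ran"
EOF

chmod +x "$workdir/hello"    # method 1: make the file executable...
"$workdir/hello"             # ...and run it by giving its path
echo "after method 1: marker='$marker'"

bash "$workdir/hello"        # method 2: pass the file to a new bash
echo "after method 2: marker='$marker'"

. "$workdir/hello"           # method 3: run the commands in the current shell
echo "after method 3: marker='$marker'"

rm -r "$workdir"
```

With the first two methods marker stays empty, because the script ran in a separate shell; after the third method it holds the value assigned inside hello.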

Using Variables

As is the case with almost any language, the use of variables is very important in shell programs. You saw some of the ways in which shell variables can be used in the introductory shell sections. Two of the variables that were introduced were the PATH variable and the PS1 variable. These are examples of built-in shell variables, or variables that are defined by the shell program you are using. This section describes how you can create your own variables and use them in simple shell programs.

Assigning a Value to a Variable

You can assign a value to a variable simply by typing the variable name followed by an equal sign and the value you want to assign to the variable. For example, if you wanted to assign a value of 5 to the variable count, you would enter the command "count=5". With the bash syntax for setting a variable, you must make sure that there are no spaces on either side of the equal sign. Notice that you do not have to declare the variable as you would if you were programming in C or Pascal. This is because the shell language is an untyped, interpreted language. This means that you can use the same variable to store character strings that you use to store integers. You would store a character string into a variable in the same way that you stored the integer into a variable. For example, "name=Garry" is also completely valid.

Accessing the Value of a Variable

Once you have stored a value into a variable, how do you get the value back out? You do this in the shell by preceding the variable name with a dollar sign ($). If you wanted to print the value stored in the count variable to the screen, you would do so by entering the command "echo -n $count". If you omitted the $ from the preceding command, the echo command would display the word count on-screen. Here "-n" is an option that tells echo not to append a newline to the end of its output.
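
Assignment and access can be tried together in a few lines. This is a minimal sketch; the values are arbitrary and chosen only for illustration.

```shell
#!/bin/bash
count=5              # no spaces around the equal sign
name=Garry           # the same syntax stores a character string

echo $count          # prints the value: 5
echo count           # without the $, prints the word: count
echo "$name has $count items"

count=five           # variables are untyped, so a string can replace the number
echo $count          # prints: five
```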

Positional Parameters and Other Built-In Shell Variables

The shell has knowledge of a special kind of variable called a positional parameter. Positional parameters are used to refer to the parameters that were passed to a shell program on the command line, or to a shell function by the script that invoked it. When you run a shell program that requires or supports a number of command-line options, each of these options is stored into a positional parameter. The first parameter is stored into a variable named 1, the second parameter is stored into a variable named 2, and so forth. These variable names are reserved by the shell so that you can't use them as variables you define. To access the values stored in these variables, you must precede the variable name with a dollar sign ($) just as you do with variables you define.

Example Script: reverse

The following shell program expects to be invoked with two parameters. The program prints the two parameters in reverse order: the second parameter that was typed on the command line comes first, followed by the first parameter.

#!/bin/bash
# program reverse, prints the command line parameters out in reverse order
echo "$2" "$1"
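
The positional parameters and the special variables listed in Table 1 can also be explored without writing a separate script. The sketch below assumes the set -- built-in, which replaces the current positional parameters and so simulates a command line; the argument values are made up. It also shows the practical difference between $* and "$@" when an argument contains a space.

```shell
#!/bin/bash
# Simulate the command line:  script "first arg" second
set -- "first arg" second

echo "Number of arguments: $#"     # prints 2
echo "First: $1"
echo "Second: $2"

set -- "$@"                        # "$@" keeps each argument intact
at_count=$#                        # still 2

set -- $*                          # unquoted $* is split again at the spaces
star_count=$#                      # now 3: first, arg, second

echo "Using \"\$@\": $at_count arguments; using \$*: $star_count arguments"

true
echo "Exit value of true: $?"      # prints 0
```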

Table 1: Built-in shell variables.

Variable   Use
$#         Stores the number of command-line arguments that were passed to the shell program.
$?         Stores the exit value of the last command that was executed.
$0         Stores the first word of the entered command (the name of the shell program).
$*         Stores all the arguments that were entered on the command line ($1 $2 ...).
"$@"       Stores all the arguments that were entered on the command line, individually quoted ("$1" "$2" ...).

The Importance of Quotation Marks

The use of the different types of quotation marks is very important in shell programming. Both kinds of quotation marks and the backslash character are used by the shell to perform different functions. The double quotation marks (""), the single quotation marks (''), and the backslash (\) are all used to hide special characters from the shell. Each of these methods hides varying degrees of special characters from the shell. Remember that everything in this section also applies to the bash command line, so for example you could use a backslash to use a space in the name of a file.

Double Quotes

The double quotation marks are the least powerful of the three methods. When you surround characters with double quotes, whitespace and most other special characters are hidden from the shell, but the dollar sign, the back quote, and the backslash are still interpreted by the shell. This type of quoting is most useful when you are assigning strings that contain more than one word to a variable. For example, if you wanted to assign the string Hello there to the variable greeting, you would type the following command:

greeting="Hello there"

This command would store the string "Hello there" in the variable "greeting" as one word. If you typed this command without using the quotes, you would not get the results you wanted. bash would treat the word there as the name of a command to run, with greeting set to Hello only for that command, and would most likely report that the command was not found.

Single Quotes

Single quotes are the most powerful form of quoting. They hide all special characters from the shell. This is useful if the command that you enter is intended for a program other than the shell. Because the single quotes are the most powerful, you could have written the previous example using single quotes. You might not always want to do this. If the string being assigned to the greeting variable contained another variable, you would have to use the double quotes. For example, if you wanted to include the name of the user in your greeting, you would type the following command:

greeting="Hello there $LOGNAME" 

This would store the string "Hello there " and the value of $LOGNAME into the variable greeting. The LOGNAME variable is a shell variable that contains the username of the person who is logged in to the system. If you tried to write this command using single quotes it wouldn't work, because the single quotes would hide the dollar sign from the shell and the shell wouldn't know that it was supposed to perform a variable substitution.

Backslash

Using the backslash is the third way of hiding special characters from the shell. Like the single quotation mark method, the backslash hides all special characters from the shell, but it can hide only one character at a time, as opposed to groups of characters. You could rewrite the greeting example using the backslash instead of double quotation marks by using the following command:

greeting=Hello\ there

In this command, the backslash hides the space character from the shell, and the string "Hello there" is assigned to the variable "greeting".

Backslash quoting is used most often when you want to hide only a single character from the shell. This is usually done when you want to include a special character in a string. For example, if you wanted to store the price of a box of computer disks into a variable named disk_price, you would use the following command:

disk_price=\$5.00

The backslash in this example would hide the dollar sign from the shell. If the backslash were not there, the shell would try to find a variable named 5 and perform a variable substitution on that variable. Assuming that no variable named 5 were defined, the shell would assign a value of .00 to the disk_price variable. This is because the shell would substitute a value of null for the $5 variable. The disk_price example could also have used single quotes to hide the dollar sign from the shell.
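
The three quoting methods can be compared side by side. In this sketch the variable user is a hypothetical stand-in for $LOGNAME, so that the result does not depend on who is logged in.

```shell
#!/bin/bash
user=garry                          # hypothetical stand-in for $LOGNAME

double="Hello there $user"          # double quotes: $user is substituted
single='Hello there $user'          # single quotes: nothing is substituted
escaped=Hello\ there\ \$user        # each backslash hides exactly one character
disk_price='$5.00'                  # single quotes instead of \$5.00

echo "$double"                      # Hello there garry
echo "$single"                      # Hello there $user
echo "$escaped"                     # Hello there $user
echo "$disk_price"                  # $5.00
```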

Back Quotes

The back quote marks (``) perform a different function. They are used when you want to use the results of a command in another command. For example, if you wanted to set the value of the variable contents equal to the list of files in the current directory, you would type the following command:

contents=`ls`

This command would execute the ls command and store the results of the command into the contents variable. As you will see later, this feature can be very useful when you want to write a shell program that performs some action on the results of another command.
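
A couple of further substitutions, sketched with commands whose output is predictable. The last assignment uses the $(...) form, which bash accepts as an equivalent way of writing the same substitution; it is shown here only for comparison.

```shell
#!/bin/bash
sum=`expr 2 + 3`                    # expr prints 5, which is stored in sum
shout=`echo hello | tr a-z A-Z`     # the output of a whole pipeline can be captured
same=$(expr 2 + 3)                  # $(...) is an alternative to the back quotes

echo "sum=$sum shout=$shout same=$same"
```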

The test Command

A command called test is used to evaluate conditional expressions. You would typically use the test command to evaluate a condition that is used in a conditional statement or to evaluate the entrance or exit criteria for an iteration statement. The test command has the following syntax:

test expression
or
[ expression ]

Several built-in operators can be used with the test command. These operators can be classified into four groups: integer operators, string operators, file operators, and logical operators.

Table 2: The test command's integer operators.

Operator         Meaning
int1 -eq int2    Returns True if int1 is equal to int2.
int1 -ge int2    Returns True if int1 is greater than or equal to int2.
int1 -gt int2    Returns True if int1 is greater than int2.
int1 -le int2    Returns True if int1 is less than or equal to int2.
int1 -lt int2    Returns True if int1 is less than int2.
int1 -ne int2    Returns True if int1 is not equal to int2.

Table 3: The test command's string operators.

Operator        Meaning
str1 = str2     Returns True if str1 is identical to str2.
str1 != str2    Returns True if str1 is not identical to str2.
str             Returns True if str is not null.
-n str          Returns True if the length of str is greater than zero.
-z str          Returns True if the length of str is equal to zero.

Table 4: The test command's file operators.

Operator       Meaning
-d filename    Returns True if filename is a directory.
-f filename    Returns True if filename is an ordinary file.
-r filename    Returns True if filename can be read by the process.
-s filename    Returns True if filename has a nonzero length.
-w filename    Returns True if filename can be written by the process.
-x filename    Returns True if filename is executable.

Table 5: The test command's logical operators.

Command           Meaning
! expr            Returns True if expr is not true.
expr1 -a expr2    Returns True if expr1 and expr2 are true.
expr1 -o expr2    Returns True if expr1 or expr2 is true.
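
The following sketch exercises one operator from each of the four groups. It relies on the if statement, which is described in the next section, and on a temporary file created with mktemp purely for the demonstration; the variable names are arbitrary.

```shell
#!/bin/bash
count=5
name=""
tmpfile=`mktemp`        # an empty ordinary file, used only for this demonstration
passed=""

# Integer operator
if [ "$count" -gt 3 ]; then passed="$passed integer"; fi

# String operator: name is empty, so -z returns True
if [ -z "$name" ]; then passed="$passed string"; fi

# File operators joined with -a and negated with !
if [ -f "$tmpfile" -a ! -s "$tmpfile" ]; then passed="$passed file"; fi

# The same kind of check written with the word test, using -o
if test "$count" -eq 5 -o "$count" -eq 6; then passed="$passed logical"; fi

echo "Tests that returned True:$passed"
rm "$tmpfile"
```

Quoting the variables inside the brackets, as done here, keeps the test well formed even when a variable is empty.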

Conditional Statements

The bash shell has two forms of conditional statements. These are the if statement and the case statement. These statements are used to execute different parts of your shell program depending on whether certain conditions are true.

The if Statement

bash supports nested if...then...else statements. These statements provide you with a way of performing complicated conditional tests in your shell programs. The syntax of the if statement is shown here:

if [ expression ]
then
   commands
elif [ expression2 ]
then
   commands
else
   commands
fi

The elif and else clauses are both optional parts of the if statement. Also note that bash uses the reverse of the statement name in most of its complex statements to signal the end of the statement. In this statement the fi keyword is used to signal the end of the if statement. The elif statement is an abbreviation of else if. This statement is executed only if none of the expressions associated with the if statement or any elif statements before it were true. The commands associated with the else statement are executed only if none of the expressions associated with the if statement or any of the elif statements were true.
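
As a small illustration of the full form, the following sketch classifies a number. The variable names and the value 7 are arbitrary.

```shell
#!/bin/bash
n=7                       # arbitrary value for the demonstration

if [ "$n" -lt 0 ]
then
   kind=negative
elif [ "$n" -eq 0 ]
then
   kind=zero
else
   kind=positive
fi

echo "$n is $kind"        # prints: 7 is positive
```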

The case Statement

The case statement enables you to compare a string with several patterns and execute a block of code if a match is found. The shell case statement is quite a bit more powerful than the case statement in Pascal or the switch statement in C. This is because in the shell case statement you can compare strings with wildcard characters in them, whereas with the Pascal and C equivalents you can compare only enumerated types or integer values. The syntax for the case statement is the following:

case string1 in
str1)
   commands;;
str2)
   commands;;
*)
   commands;;
esac

The string string1 is compared to str1 and str2. If one of these strings matches string1, the commands up until the double semicolon (;;) are executed. If neither str1 nor str2 matches string1, the commands associated with the asterisk are executed. This is the default case condition because the asterisk matches all strings.

The following code is an example of a bash case statement. This code checks to see if the first command-line option was -i or -e. If it was -i, the program counts the number of lines in the file specified by the second command-line option that begin with the letter i. If the first option was -e, the program counts the number of lines in the file specified by the second command-line option that begin with the letter e. If the first command-line option was not -i or -e, the program prints a brief error message to the screen.

case "$1" in
-i)
   count=`grep ^i "$2" | wc -l`
   echo "The number of lines in $2 that start with an i is $count"
   ;;
-e)
   count=`grep ^e "$2" | wc -l`
   echo "The number of lines in $2 that start with an e is $count"
   ;;
*)
   echo "That option is not recognized"
   ;;
esac
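
The wildcard matching mentioned earlier is easiest to see with file names. In this sketch the file name and the categories are hypothetical; the vertical bar lets one branch accept more than one pattern.

```shell
#!/bin/bash
filename=report.txt       # hypothetical file name

case "$filename" in
*.txt)
   kind=text;;
*.jpg | *.png)
   kind=image;;
*)
   kind=unknown;;
esac

echo "$filename looks like a file of type: $kind"
```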

Iteration Statements

The shell languages also provide several iteration or looping statements.

The for Statement

The for statement executes the commands that are contained within it a specified number of times. bash has two variations of the for statement. The first form of the for statement that bash supports has the following syntax:

for var1 in list
do
   commands
done

In this form, the for statement executes once for each item in the list. This list can be a variable that contains several words separated by spaces, or it can be a list of values that is typed directly into the statement. Each time through the loop, the variable var1 is assigned the current item in the list, until the last one is reached. The second form of for statement has the following syntax:

for var1
do
   statements
done

In this form, the for statement executes once for each positional parameter that was passed to the shell program on the command line, assigning each one to the variable var1 in turn. This form of the for statement is the equivalent of writing the following for statement:

for var1 in "$@"
do
   statements
done

The following is an example of the for statement. This example takes as command-line options any number of text files. The program reads in each of these files, converts all the letters to uppercase, and then stores the results in a file of the same name but with a .caps extension.

for file
do
   tr a-z A-Z < "$file" > "$file.caps"
done
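
The first form of the for statement, with an explicit list, might be used as in the following sketch. The numbers, the colors variable, and the other names are arbitrary; the running total is computed with expr and back quotes.

```shell
#!/bin/bash
# A list typed directly into the statement
total=0
for n in 1 2 3 4
do
   total=`expr $total + $n`
done
echo "The total is $total"            # prints: The total is 10

# A list held in a variable that contains several words
colors="red green blue"
joined=""
for c in $colors
do
   joined="$joined[$c]"
done
echo "$joined"                        # prints: [red][green][blue]
```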

The while Statement

Another iteration statement offered by the shell programming language is the while statement. This statement causes a block of code to be executed while a provided conditional expression is true. The syntax for the while statement is the following:

while expression
do
   statements
done

The following is an example of the while statement. This program lists the parameters that were passed to the program, along with the parameter number.

count=1
while [ -n "$*" ]
do
   echo "This is parameter number $count $1"
   shift
   count=`expr $count + 1`
done

As you will see later, the shift command moves the command-line parameters over one to the left.
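
A while loop does not have to involve positional parameters. The following minimal sketch simply counts to three; the limit is arbitrary.

```shell
#!/bin/bash
count=1
while [ "$count" -le 3 ]
do
   echo "Pass number $count"
   count=`expr $count + 1`
done
echo "The loop ended with count equal to $count"    # count is 4 here
```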

The until Statement

The until statement is very similar in syntax and function to the while statement. The only real difference between the two is that the until statement executes its code block while its conditional expression is false, and the while statement executes its code block while its conditional expression is true. The syntax for the until statement is:

until expression
do
   commands
done

The same example that was used for the while statement can be used for the until statement. All you have to do to make it work is negate the condition. This is shown in the following code:

count=1
until [ -z "$*" ]
do
   echo "This is parameter number $count $1"
   shift
   count=`expr $count + 1`
done

The only difference between this example and the while statement example is that the -n test command option (which means that the string has nonzero length) was removed, and the -z test option (which means that the string has zero length) was put in its place. In practice the until statement is not very useful, because any until statement you write can also be written as a while statement.

The shift Command

The shift command moves the current values stored in the positional parameters to the left one position. For example, if the values of the current positional parameters are:

$1 = -r 
$2 = file1 
$3 = file2

and you executed the shift command the resulting positional parameters would be as follows:

$1 = file1 
$2 = file2

You can also move the positional parameters over more than one place by specifying a number with the shift command. This is a very useful command when you have a shell program that needs to parse command-line options. This is true because options are typically preceded by a hyphen and a letter that indicates what the option is to be used for. Because options are usually processed in a loop of some kind, you often want to skip to the next positional parameter once you have identified which option should be coming next. For example, the following shell program expects two command-line options, one that specifies an input file and one that specifies an output file. The program reads the input file, translates all the characters in the input file into uppercase, then stores the results in the specified output file.

while [ "$1" ]
do
   if [ "$1" = "-i" ]
   then
      infile="$2"
      shift 2
   elif [ "$1" = "-o" ]
   then
      outfile="$2"
      shift 2
   else
      echo "Program $0 does not recognize option $1"
      shift   # skip the unrecognized option so that the loop can finish
   fi
done

tr a-z A-Z < "$infile" > "$outfile"
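
The effect of shift, with and without a number, can be watched directly. This sketch again assumes set -- to simulate a command line; the parameter values are made up.

```shell
#!/bin/bash
set -- -r file1 file2 file3        # simulate four command-line parameters
echo "Before: $# parameters, the first is $1"

shift                              # drop -r
after_one=$1                       # file1
echo "After shift: $# parameters, the first is $1"

shift 2                            # drop file1 and file2 in one step
after_three=$1                     # file3
remaining=$#                       # 1
echo "After shift 2: $# parameter, the first is $1"
```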

Functions

The shell languages enable you to define your own functions. These functions behave in much the same way as functions you define in C or other programming languages. The main advantage of using functions as opposed to writing all of your shell code in line is for organizational purposes. Code written using functions tends to be much easier to read and maintain and also tends to be smaller, because you can group common code into functions instead of putting it everywhere it is needed.

The syntax for creating a function in bash is the following:

fname () {
   shell commands
}

Once you have defined your function, you can invoke it by entering the following command:

fname [parm1 parm2 parm3 ...]

Notice that you can pass any number of parameters to your function. When you do pass parameters to a function, it sees those parameters as positional parameters, just as a shell program does when you pass it parameters on the command line.
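
A short sketch of a function that uses its own positional parameters follows. The function name greet and the arguments are hypothetical. Inside the function, $1 and $# refer to the values passed to the function, while the script's own parameters are untouched.

```shell
#!/bin/bash
greet () {
   echo "Hello, $1! This function received $# parameter(s)."
}

set -- script-arg                  # simulate one command-line parameter for the script

message=`greet Garry extra`        # call the function with two parameters
echo "$message"
echo "The script's own first parameter is still: $1"
```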

The following shell program contains several functions, each of which performs a task associated with one of the command-line options. This example illustrates many of the topics covered in this section. It reads all the files that are passed on the command line and, depending on the option that was used, writes the files out in all uppercase letters, writes the files out in all lowercase letters, or prints the files.

upper () {
   shift   # discard the option; the remaining parameters are file names

   for i
   do
      tr a-z A-Z < "$i" > "$i.out"
      rm "$i"
      mv "$i.out" "$i"
   done
}

lower () {
   shift   # discard the option; the remaining parameters are file names

   for i
   do
      tr A-Z a-z < "$i" > "$i.out"
      rm "$i"
      mv "$i.out" "$i"
   done
}

print () {
   shift   # discard the option; the remaining parameters are file names

   for i
   do
      lpr "$i"
   done
}

usage_error () {
   echo "$1 syntax is $1 <option> <input files>"
   echo ""
   echo "where option is one of the following"
   echo "p to print frame files"
   echo "u to save as uppercase"
   echo "l to save as lowercase"
}

case "$1"
in
   p | -p) print "$@";;
   u | -u) upper "$@";;
   l | -l) lower "$@";;
   *) usage_error "$0";;
esac

Summary

As you become familiar with using Linux, you will find that you use shell programming languages more and more often. Even though the shell languages are very powerful and also quite easy to learn, you might run into some situations where shell programs are not suited to the problem you are solving. In these cases you may want to investigate the possibility of using one of the other languages available under Linux. Some of your options include C, C++, gawk, and Perl.