Scripting in Unix


Declaring and Using Variables

Set a variable as <variable>=<value> (with no spaces around the = sign). Display or access the variable by prefixing it with a $ sign, ie. $<variable>. K-Shell and Bash allow arrays. Arrays are defined by setting values into the indices of the array, ie. <array>[<subscript>]=<value>. In K-Shell set array elements as set -A <variable> <value1> <value2> ... <valuen>. In Bash set array values as <variable>=(<value1> ... <valuen>). Access array values as ${<variable>[<subscript>]}. All items in an array can be accessed as ${<variable>[*]} or ${<variable>[@]}.

Variables can be declared as read only by setting the variable and then specifying the variable as being read only, ie. <variable>=<value>; readonly <variable>. Variables can be unset (removed) using the unset <variable> command. Note that simply setting a variable to a null value does not remove the variable, ie. <variable>= sets the value to null without removing the local variable.

A local variable is only available to the current shell (process). By exporting a local variable, ie. <variable>=<value>; export <variable>, the variable is placed into the environment and thus becomes available to the current process (shell) and all child processes (shells). For instance, the PATH variable would be set and then exported. In K-Shell and Bash a variable can be set and exported in one step as export <variable>=<value>.
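A brief sketch, assuming a Bash environment (in K-Shell the array would instead be set with set -A DAYS mon tue wed thu fri):

#!/bin/bash
NAME=oracle			#Set a scalar variable (no spaces around the = sign)
echo $NAME			#Access the variable using $
DAYS=(mon tue wed thu fri)	#Define an array (Bash syntax)
echo ${DAYS[1]}			#Access a single element (prints tue)
echo ${DAYS[*]}			#Access all elements
PI=3.14; readonly PI		#Declare a read only variable
export NAME			#Export NAME into the environment
unset NAME			#Remove the variable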

Built-In Variables
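The commonly used built-in variables of the Bourne-compatible shells can be illustrated with a minimal sketch:

#!/bin/sh
echo "Script name      : $0"	#The name the script was invoked with
echo "First parameter  : $1"	#Positional parameters $1, $2, ... $9
echo "Parameter count  : $#"	#The number of parameters passed
echo "All parameters   : $*"	#All of the parameters
echo "Process id       : $$"	#The process id of the current shell
ls /tmp > /dev/null
echo "Last exit status : $?"	#The exit status of the last command executed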

Substitution

Pattern Matching

Wildcards can be used to match multiple filenames, ie. command prefix.* will execute command on all files matching the pattern prefix.*. Wildcards can be placed anywhere within a filename match string, as a prefix or a suffix. The * character matches zero or more characters and the ? character matches exactly one character. Note that matching in Unix is case sensitive. Exact characters can be matched and even ranges of characters can be matched as in [0-9]. Matching can also be negated as in command [!<value>]*.
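A few sketches (the file names are hypothetical):

ls report.*		#Lists report.txt, report.log, report.1 and so on
ls report.?		#Lists report.1 but not report.10
ls [a-c]*		#Lists all files beginning with a, b or c
ls [!a-c]*		#Lists all files not beginning with a, b or c
rm *.tmp		#Removes all files ending in .tmp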

Substitution Based on Variable State
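The Bourne-compatible shells allow substitution depending on whether a variable is set or null; a brief sketch of the common forms (using the EDITOR variable as an example) is shown below.

#!/bin/sh
echo ${EDITOR:-vi}			#Use vi if EDITOR is unset or null, EDITOR itself is unchanged
echo ${EDITOR:=vi}			#As above, but also assign vi to EDITOR
echo ${EDITOR:+defined}			#Print defined only if EDITOR has a value
echo ${EDITOR:?"EDITOR is not set"}	#Abort with the message if EDITOR is unset or null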

Command Substitution

Command substitution allows the output of a command (STDOUT) to be submitted to the input (STDIN) of another command or captured in a variable. Command substitution is performed by placing the command between backquotes, ie. `command`. For instance, DATE=`date` will set the current system date into the variable DATE. Basically command substitution places the result of the execution of a command into a variable or into another process. Some examples are shown below.
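These are sketches only; filelist.txt is a hypothetical file name.

DATE=`date`					#Place the current system date into a variable
echo "Today is `date '+%d-%m-%Y'`"		#Substitute a command result into a string
FILES=`ls *.log`				#Capture the output of ls in a variable
for f in `cat filelist.txt`; do echo $f; done	#Feed command output into a loop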

Arithmetic Substitution

Arithmetic substitution allows the evaluation of an arithmetic expression from the command line prompt. Note that arithmetic substitution only works in K-Shell (/bin/ksh) and Bash (/bin/bash) using the syntax $((expression)). Operators include /, *, -, +, % (modulus) and parentheses for precedence.
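A few sketches:

X=5
echo $((X + 3))			#Prints 8
echo $(( (X + 3) * 2 ))		#Parentheses control precedence, prints 16
COUNT=$((COUNT + 1))		#Increment a counter variable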

Quoting

Quoting in Unix is basically the escaping of reserved characters, or the enclosing of strings of characters in quotes, in order to prevent misinterpretation of a non-executable string of characters. A single reserved character is escaped using the backslash (\) character. Reserved shell characters are *, ?, [, ], ', ", \, $, ;, &, (, ), |, ^, <, > plus the newline, space and tab characters. Thus to print the string Me & you to STDOUT one would have to type echo Me \& you or echo "Me & you". Even though newlines are invisible, when typing in a command at the command line the \ character can be used to continue a command on the following line.

Where the backslash character can be used to escape (prevent interpretation of) single characters, enclosing a string in single quotes (') disables interpretation of all special characters within the whole string. The string What is your salary in $'s ? would become What is your salary in \$\'s \? using the backslash option. Note that a single quote cannot itself appear inside a single-quoted string, so when single quotes are used the embedded apostrophe must still be escaped outside them, ie. 'What is your salary in $'\''s ?' (alternatively the string can be enclosed in double quotes with only the $ escaped, ie. "What is your salary in \$'s ?").

Using single quotes completely eliminates any interpretation of special characters within a string. However, use of double-quoted strings allows interpretation of a number of characters; most importantly the $ character to allow for access to variables and ` characters allowing command substitution.

Note that quoting can occur anywhere in a string, enclosing all or parts of the string.
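A brief sketch contrasting the quoting mechanisms ($HOME is set by the login shell):

echo "Your home directory is $HOME"	#Double quotes - $HOME is interpreted
echo 'Your home directory is $HOME'	#Single quotes - the literal text $HOME is printed
echo Your home directory is \$HOME	#Backslash - escapes the single $ character
echo "Today is `date`"			#Command substitution still works inside double quotes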

Controlling Flow

Different shells have different syntax for various types of flow control functionality.

The If Statement

The condition is written as a test expression of the form [ condition ] (see the test command below).

/bin/sh

if condition1; then action1; elif condition2; then action2; else action3; fi

/bin/csh

if (condition1) then
	action1
else if (condition2) then
	action2
else
	action3
endif

/bin/ksh

if condition1; then action1; elif condition2; then action2; else action3 ; fi

The test Command

The test command can be used to test file attributes and to perform string and numerical comparisons. For instance, if [ -z "$TEST" ]; ... will be true if the variable TEST is null. File test options are as listed below.
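Commonly used file test options include the following (a representative selection rather than an exhaustive list):

[ -e file ]	#True if file exists
[ -f file ]	#True if file exists and is a regular file
[ -d file ]	#True if file exists and is a directory
[ -h file ]	#True if file exists and is a symbolic link
[ -r file ]	#True if file exists and is readable
[ -w file ]	#True if file exists and is writable
[ -x file ]	#True if file exists and is executable
[ -s file ]	#True if file exists and has a size greater than zero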

String comparisons are -z for zero length, -n for non-zero length and string1 [!]= string2 for string [in]equality. Numerical comparisons are denoted as [ integer1 operator integer2 ]. Numerical comparison operators are -eq, -ne, -lt, -le, -gt and -ge. Compound expressions can be handled using the conditional operators and (&&) and or (||). Another option is the use of the built-in compound expression operators: ! expression (negation), expression1 -a expression2 (and) and expression1 -o expression2 (or). Thus [ expression1 ] && [ expression2 ] becomes [ expression1 -a expression2 ].

#!/bin/sh

if [ `whoami` != 'oracle' ]; then
        echo Aborted - user `whoami` is incorrect, must be user oracle
        exit 1

elif [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ] || [ -z "$5" ] || [ -z "$6" ]; then
        echo "$USAGE"
        exit 1

elif [ -z "$PATH" ] || [ -z "ORACLE_BASE" ] || [ -z "ORACLE_HOME" ] || [ -z "TNS_ADMIN" ] || [ -z "ORACLE_SID" ] || [ -z "ORACLE_DBF" ] || [ -z "ORACLE_SBIN" ] || [ -z "ORACLE_UTILS" ] || [ -z "ORACLE_BACKUP" ] || [ -z "ORACLE_RESTORE" ]; then
        echo Variable not defined
        exit 1

else
	...
fi

The Case Statement

/bin/sh

case word in [ pattern [ | pattern ] ) actions ;; ] ... esac

/bin/csh

switch (expression)
	case comparison1:
		actions
		breaksw
	case comparison2:
		actions
		breaksw
	default:
endsw

/bin/ksh

case word in [ pattern [ | pattern ] ) actions ;; ] ... esac
select identifier [ in word ... ] ; do list ; done
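A minimal Bourne shell sketch, acting on the first parameter passed to the script:

#!/bin/sh
case "$1" in
	start)		echo "Starting the service" ;;
	stop)		echo "Stopping the service" ;;
	status|info)	echo "Reporting status" ;;	#Multiple patterns separated by |
	*)		echo "Usage: $0 start|stop|status"; exit 1 ;;
esac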

The While and Until Loops

/bin/sh

while [ conditions ]; do actions  ; done
until [ conditions ]; do actions  ; done

/bin/csh

while (conditions)
	# do actions
end

/bin/ksh

while [ conditions ]; do actions  ; done
until [ conditions ]; do actions  ; done

The example below can be used to validate user input.

ANSWER=
while [ -z "$ANSWER" ];
do
	echo "Enter your name : "
	read ANSWER
done

The For Loop

/bin/sh

for word [ in wordlist...  ] ; do actions ; done

/bin/csh

foreach word (wordlist)
	...
end

repeat count command

/bin/ksh

for word [ in wordlist ... ] ; do actions ; done
for j in a b c d e f g
do
	echo $j
done

The break and continue Commands

The break command is used to break out of a loop, resuming processing at the line immediately following the last line of the loop. The continue command skips the remainder of the current iteration of a loop and resumes with the next iteration.

Options and Parameters Passed into Scripts

Options are passed into a script with a preceding - (minus) sign. Parameters are passed in as space separated strings; strings containing spaces must be enclosed in double quotes. Options can be handled using a case statement or the getopts command, as shown in the sketch below. In addition to passing options and parameters into scripts there are a number of specialised variables with special functions. In general parameters are supplied as variable substitutions to a script and options change the behaviour of a script.
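A minimal sketch using getopts; the option letters -v and -f are hypothetical:

#!/bin/sh
VERBOSE=0
while getopts "vf:" OPTION
do
	case $OPTION in
		v) VERBOSE=1 ;;				#-v takes no argument
		f) FILE=$OPTARG ;;			#-f requires an argument, placed in OPTARG
		\?) echo "Usage: $0 [-v] [-f file] parameters"; exit 1 ;;
	esac
done
shift `expr $OPTIND - 1`	#Discard the processed options, leaving the parameters
echo "Verbose=$VERBOSE File=$FILE Parameters=$*"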

Input and Output

Two methods of printing to the screen (STDOUT) are the echo command and the printf format arguments command. Strings can use quoting as already explained. Special characters such as \n (newline), \t (tab) and \c (no newline) can be included by using escaping. Simple formatting for the printf command works exactly as it does in C. The example below shows simple use of the echo command.

#!/bin/sh
for j in *;
do
	if [ -d "$j" ]; then echo "Directory $j"
	elif [ -h "$j" ]; then echo "Link $j"
	elif [ -f "$j" ]; then echo "File $j"
	fi
done

Output can be redirected from STDOUT to a file, or STDIN can be read from a file. A single > will create or overwrite the output file, ie. command > file, and two (>>) will append, ie. command >> file; input is read from a file as command < file. Output can also be redirected from STDOUT into the STDIN of another command using a pipe (|). For instance, df -k | grep swap will show available swap space capacity.

User input can be handled using the read command as shown in the example below where a file is read line by line from redirection into the while loop.

while read STRING
do
	...
done < file

Whenever a command is executed three file handles are opened for that command. These file handles are STDIN, STDOUT and STDERR, corresponding to file descriptors 0, 1 and 2 respectively. These file handles can be accessed by use of their file descriptors. All of these file descriptors can be redirected to other files, or from other files in the case of STDIN.

Redirecting output to the /dev/null device will discard it; both STDOUT and STDERR can be sent there, as sketched below.
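A few sketches (command and the file names are placeholders):

command > out.log		#Redirect STDOUT to a file
command 2> err.log		#Redirect STDERR (descriptor 2) to a file
command > all.log 2>&1		#Send STDERR to wherever STDOUT is currently going
command > /dev/null 2>&1	#Discard both STDOUT and STDERR
command < in.dat		#Read STDIN from a file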

Using Functions

Functions cannot be used in the C-Shell. A function has the format name () { command; ... }. Shell functions can be used to replace binaries or shell built-ins of the same name.

cd () { chdir ${1:-$HOME} ; PS1="`pwd`$ "; export PS1; }
list () { ls -la; }

The example below checks for the existence of all of the directories listed in the PATH variable. Note how the local variable is unset after completion of the loop and note that the local variable is named in lowercase; an underscore character as the first character is sometimes also used.

for dir in `echo $PATH | sed 's/:/ /g'`
do
	if [ -d "$dir" ]; then
		echo "$dir ok"
	fi
done
unset dir

Functions can be placed into libraries. These libraries can be included into script files by sourcing (executing with the . command) the library file within those script files, which makes the library functions available to the calling script. Note that function library files should only contain function definitions.

#!/bin/sh
#
#This is the function library
#
error () { echo "Error : " $@ >&2; }
warning () { echo "Warning : " $@ >&2; }
email ()	#Parameters: subject recipients message-file
{
	subject=$1
	recipients=$2
	message=$3
	if [ -z "$message" ]; then
		mailx -s "$subject" $recipients < /dev/null
	else
		mailx -s "$subject" $recipients < "$message"
	fi
}

#!/bin/sh
#
#This is the scripting calling functions within the function library
#
. ./utilities.sh	#Include (source) the function library
...

Filtering Text

Text filtering can be performed with general Unix utilities, regular expressions, awk and sed.

General Text Filtering Utilities

The utilities head, tail, grep, sort, uniq and tr are all basic text filtering utilities.

awk and sed

sed is a stream editor; awk is a pattern matcher and simple programming language, and these are the most common uses of the two utilities. Both sed and awk are executed as command 'script' files, or read their input from STDIN via a pipe. Both can be used to match regular expressions or patterns against the contents of their input. Perl pattern matching tends to function in a similar fashion to that of sed and awk. General meta-characters used for pattern matching include ^ (start of line), $ (end of line), . (any single character), * (zero or more occurrences of the preceding character), [ ] (a set or range of characters) and \ (to escape a meta-character).

Some pattern matching examples are shown below.
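The sketches below use grep against hypothetical file names; the same patterns work with sed and awk.

grep '^ORA-' alert.log		#Lines beginning with ORA-
grep 'failed$' listener.log	#Lines ending with the word failed
grep '[0-9][0-9]*' file.txt	#Lines containing at least one digit
grep '^$' file.txt		#Blank lines
grep 'ora\..*\.log' file.txt	#A literal dot must be escaped; .* matches any run of characters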

The sed Stream Editor

Patterns can be applied to files using sed, where a particular action is performed on the file content based on the matches of those patterns, in the form s/pattern/change/g, where the g flag causes a global change (every occurrence on every line) to the input. Using the p flag in place of g prints the changed lines (normally combined with the -n option to suppress default output), and the d command, as in /pattern/d, deletes matching lines from the output; in neither case is the original input file changed. sed can also be used to perform multiple edits in one pass as in sed -e 'command' -e 'command' ... -e 'command' files. sed can also parse input from STDIN and display partial strings of that input, in much the same way as grep and awk would. Personally I prefer grep and awk or even Perl.
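A few sketches (init.ora is used as a hypothetical input file):

sed 's/oracle/ORACLE/g' init.ora		#Replace every occurrence on every line
sed -n '/^db_name/p' init.ora			#Print only lines beginning with db_name
sed '/^#/d' init.ora				#Delete comment lines from the output
sed -e 's/ *$//' -e '/^$/d' init.ora > new.ora	#Multiple edits: strip trailing spaces, drop blank lines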

Pattern Matching with awk

Pattern matching in awk works in the same way as in sed. awk simply has more functionality, being a simple programming language, where sed is an editor. awk is used for parsing the lines in a text file and taking actions on those lines. awk has very C-like syntax and allows if, while and for statements for flow control. awk also allows variable declarations (variable=value) plus the passing of shell variables into awk scripts (awk 'script' var=val var=val ... files; see an example in Unix for Oracle under Disk Space and File Management). There is not really much point in going through the syntax of awk in this document since awk syntax is very simple. Typically awk in its simplest form is used to parse files or STDIN and pull specific columns from the output as shown below.

# df -k | awk '{print $1 " " $5}'
Filesystem capacity
/proc 0%
/dev/dsk/c0t0d0s0 84%
fd 0%
swap 1%

Other Useful Unix Utilities

The eval command can be used to process a command line twice. For instance, with the variable REDIRECT set to > file.out, the command echo cat file.in $REDIRECT would not honour the redirection but would simply send the text cat file.in > file.out to STDOUT, ie. the screen. In order to have the redirection interpreted, the line must be scanned a second time using the eval command, as in eval echo cat file.in $REDIRECT, which writes the text cat file.in into the file file.out.
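A sketch of the example above (file.in and file.out are hypothetical files):

REDIRECT='> file.out'
echo cat file.in $REDIRECT	#Prints the text cat file.in > file.out to the screen
eval echo cat file.in $REDIRECT	#The line is scanned twice, so the text cat file.in is written into file.out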

The : command does nothing and always returns a successful (zero) exit status.

The type command gives the full path name of a Unix command, ie. type command1 command2 ... commandn.

The sleep n command pauses processing for n seconds.

The find command can be used to list files recursively through directories where those filenames match specified criteria. For instance, find all core-dump files on a machine using find ./ -name "core" -print. The -print action explicitly prints matching pathnames to STDOUT; error messages produced by non-accessible directories (due to restrictive permissions) can be discarded by redirecting STDERR to /dev/null. The general format of the find command is find start-directory options actions.

The -type f|d|b|c|l|p option allows specification of file types to find, ie. f, d, b, c, l or p (file, directory, block device, character device, link or named pipe). For instance, find / -type d -print finds directories only. The -size [+|-]n option finds only files greater than (+n), less than (-n) or exactly (n) a specified number of blocks. For instance, find / -size +1000 -print finds all files greater than 1000 blocks in size. The find / [-mtime | -atime | -ctime] [+|-]n -print form allows finding of files by when they were last modified (-mtime), last accessed (-atime) or last changed (-ctime), where [+|-]n selects more than, exactly or fewer than n days from the current date.

The -exec option allows execution of a Unix command on any file found by the find command. For example, find / -name "core" -exec rm -f {} \; will delete all files named core recursively from the root directory. Be very careful using the -exec option with the find command, especially when executing something like an rm -f command; the results can be very upsetting.
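A few additional sketches (the paths and file names are illustrative):

find /u01 -name "*.trc" -mtime +7 -print	#Trace files not modified within the last 7 days
find / -type f -size +2000 -print 2> /dev/null	#Large files, discarding permission errors
find /tmp -name "core" -exec ls -l {} \;	#List core files before deciding to remove them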

The xargs command is used to provide a list of words from STDIN as arguments to another command, ie. ps -ef | grep ora_ | grep -v grep | awk '{print $2}' | xargs kill -9 passes the process ids of all Oracle background processes to the kill command.

The expr command allows simple integer arithmetic, ie. expr 5 \* 12 echoes a result of 60. Available operators are +, -, \* (the * must be escaped), / and % (modulus); operators and operands must be separated by spaces. expr can be used in shell scripts to increment variables, eg. VAR=`expr $VAR + 1`.

The bc command will perform floating-point arithmetic and is not limited to integers as the expr command is.

The rsh (remote shell) command allows execution of a command on a remote machine, ie. running a command on another machine from the machine one is currently working on. ssh (secure shell) is a similar command but is more secure due to encryption of the traffic between the source and target machines.

Using Signals in Scripts

Signals are sent as interruptions (interrupts) to a script or program on the occurrence of an event. Signals can be detected, trapped and even ignored. Signals are listed in /usr/include/sys/signal.h.

Signals are trapped using the trap <commands> <signals> command. For example, trap "exit 1" 1 2 3 15 will trap the hangup, interrupt, quit and terminate signals and exit with a status of 1. Note that the kill signal (9) cannot be trapped or ignored. Signals can be ignored simply by telling the trap command to do nothing when the signal is received, ie. trap '' <signals> or trap : <signals>.
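A minimal sketch that removes a temporary work file whether the script exits normally or is interrupted (the file name is hypothetical):

#!/bin/sh
TMPFILE=/tmp/work.$$			#Temporary work file named after the process id
trap 'rm -f $TMPFILE; exit 1' 1 2 3 15	#Clean up on hangup, interrupt, quit and terminate
trap 'rm -f $TMPFILE' 0			#Clean up again on normal exit
touch $TMPFILE
...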

Debugging Scripts

Scripts can be debugged using the command /bin/sh option script arguments, where the option is -n (read commands and check syntax without executing them), -v (echo each line of the script as it is read) or -x (echo each command after all substitutions have been made).

Also the set command can be used within scripts to switch tracing on and off as in the script shown below. set [-n|-v|-x] will switch on debugging and set [+n|+v|+x] will switch off debugging. The set - command will switch off all debugging modes.

#!/bin/sh
set -x; ...; set +x