Sams Teach Yourself Shell Programming in 24 Hours
(Publisher: Macmillan Computer Publishing)
Author(s): Sriranga Veeraraghavan
ISBN: 0672314819
Publication Date: 01/01/99


As an example of using numeric expressions, look at the following script that counts the number of blank lines in a file:

#!/bin/sh
for i in "$@" ;
do
    if [ -f "$i" ] ; then
        echo "$i"
        awk ' /^ *$/ { x=x+1 ; print x ; }' "$i"
    else
        echo "ERROR: $i not a file." >&2
    fi
done

In the awk command, you increment the variable x and print it each time a blank line is encountered. Because a new instance of the awk command runs for each file, the count starts over at 1 for each file.

Consider the file urls.txt, which contains four blank lines:

$ cat urls.txt
http://www.cusa.berkeley.edu/~ranga

http://www.cisco.com

ftp://prep.ai.mit.edu/pub/gnu/
ftp://ftp.redhat.com/

http://www.yahoo.com/index.html
ranga@kanchi:/home/ranga/pub

ranga@soda:/home/ranga/docs/book/ch01.doc

For urls.txt, the output of this script looks like the following:

urls.txt
1
2
3
4

There are two important things to keep in mind about numeric expressions:

  If either num1 or num2 is the name of a variable whose value is a string that does not begin with a number, awk uses the value 0 rather than the string.
  If you use a variable that has not yet been created in a numeric expression, awk creates the variable and assigns it a value of 0.
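The two rules above can be checked with a short sketch. The variable names s, never_set, str_result, and unset_result are chosen for illustration only and do not appear in the original script:

```shell
#!/bin/sh
# A string value that does not look like a number counts as 0 in arithmetic.
str_result=$(awk 'BEGIN { s = "hello" ; print s + 5 }')
echo "string + 5 = $str_result"

# A variable that has never been assigned also counts as 0.
unset_result=$(awk 'BEGIN { print never_set + 5 }')
echo "unset + 5 = $unset_result"
```

Both commands should print 5, because the left operand is treated as 0 in each case.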

The Assignment Operators

In the previous example, the awk command

awk ' /^ *$/ { x=x+1 ; print x ; }' "$i"

uses the assignment

x=x+1

In awk this can be written in a more concise fashion using the addition assignment operator:

x += 1

In general the assignment operators have the syntax

name operator= num

Here name is the name of a variable, operator is one of the operators specified in Table 17.2, and num is either the name of a variable or a numeric constant such as 1 or 2. A list of the assignment operators is given in Table 17.3.

Table 17.3 Assignment Operators in awk

Operator Description

+= Add
-= Subtract
*= Multiply
/= Divide
%= Modulo (Remainder)
^= Exponentiation

Using an assignment operator is shorthand for writing a numeric expression of the form:

name=name operator num

Many programmers prefer using the assignment operators because they are slightly more concise than a regular numeric expression.

In the case of

x += 1

the assignment operator += takes the value of x, adds 1 to it, and then assigns the result to x.
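As a quick illustration, the following sketch applies each operator from Table 17.3 in turn to a single variable. The starting value 10 and the shell variable name ops_output are arbitrary choices made for this example:

```shell
#!/bin/sh
# Apply each assignment operator to x and print the result after each step.
ops_output=$(awk 'BEGIN {
    x = 10
    x += 5 ; print x    # add:            10 + 5
    x -= 3 ; print x    # subtract:       15 - 3
    x *= 2 ; print x    # multiply:       12 * 2
    x /= 4 ; print x    # divide:         24 / 4
    x %= 4 ; print x    # modulo:         6 % 4
    x ^= 3 ; print x    # exponentiation: 2 ^ 3
}')
echo "$ops_output"
```

Each line of output is the value of x after one operator has been applied, so the six lines should read 15, 12, 24, 6, 2, and 8.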

The Special Patterns: BEGIN and END

In the awk command

awk ' /^ *$/ { x=x+1 ; print x ; }' "$i"

you print out the value of x each time it is incremented. Thus the output looks like this:

urls.txt
1
2
3
4

It would be much nicer if you could print the total number of empty lines. You can do this by using the special patterns BEGIN and END.

As I stated before, the general syntax of a command in an awk script is

/pattern/ { actions }

Usually pattern is a regular expression, but pattern can also be one of the two special patterns BEGIN and END. When these patterns are used, the general form of an awk command becomes

awk '
    BEGIN { actions }
    /pattern/ { actions }
    /pattern/ { actions }
    END { actions }
' files

Specify the BEGIN pattern first and the END pattern last. Older versions of awk require this order; newer POSIX versions accept the patterns anywhere, but keeping this order is portable and makes the script easier to read. Between the BEGIN and END patterns you can have any number of the following pairs:

/pattern/ { action ; }

Both the BEGIN and the END patterns are optional:

  When the BEGIN pattern is specified, awk executes its actions before reading any input.
  When the END pattern is specified, awk executes its actions after reading all the input, just before it exits.

If a program consists of only a BEGIN pattern, awk does not read any lines before exiting.
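A program that consists of only a BEGIN pattern is a convenient way to do arithmetic in awk without supplying any input. This is a minimal sketch; the variable name begin_only is chosen for illustration:

```shell
#!/bin/sh
# No input file is named and nothing is piped in: the BEGIN action runs
# and awk exits without waiting for input.
begin_only=$(awk 'BEGIN { x = 2 ; x ^= 10 ; print x }')
echo "2 to the 10th power is $begin_only"
```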

When these patterns are given, the execution of an awk script is as follows:

1.  If a BEGIN pattern is present, the script executes the actions it specifies.
2.  The script reads an input line and parses it into fields.
3.  The script compares the input line against each of the specified patterns in turn. For every pattern that matches, it executes the actions specified for that pattern.
4.  The script repeats steps 2 and 3 while input lines are present.
5.  After the script reads all the input lines, if the END pattern is present, it executes the actions that the pattern specifies.
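The order of execution described in these steps can be observed with a small sketch. The three-line sample input, the /o/ pattern, and the names blank and order_output are made up for this illustration:

```shell
#!/bin/sh
# Feed three lines (the second one blank) to an awk script that has
# a BEGIN pattern, two regular patterns, and an END pattern.
order_output=$(printf 'one\n\ntwo\n' | awk '
    BEGIN  { print "start" }
    /^ *$/ { blank += 1 }
    /o/    { print "matched: " $0 }
    END    { print "blank lines: " blank }
')
echo "$order_output"
```

The word start is printed before any input is read, the matched lines appear in input order, and the blank line total is printed last, after all input has been consumed.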

To solve your problem, you can use the END pattern to print out the value of x. The modified script is as follows:

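#!/bin/sh
for i in "$@" ;
do
    if [ -f "$i" ] ; then
        # printf avoids relying on the non-portable \c escape of echo
        printf "%s" "$i"
        awk '
            /^ *$/ { x+=1 ; }
            # %d prints 0 if x was never incremented
            END { printf " %d\n",x; }
        ' "$i"
    else
        echo "ERROR: $i not a file." >&2
    fi
done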

Now the output looks like

urls.txt 4
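To try the counting logic without creating urls.txt, the following sketch wraps the awk command in a shell function and runs it on two temporary files. The function name count_blank, the file names, and the use of mktemp -d (widely available, but not guaranteed on every system) are assumptions made for this example:

```shell
#!/bin/sh
# count_blank is a hypothetical helper: print the number of blank lines in $1.
count_blank() {
    awk '
        /^ *$/ { x += 1 }
        END    { printf "%d\n", x }
    ' "$1"
}

# Build two small test files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a\n\nb\n\n\nc\n' > "$tmpdir/some.txt"   # three blank lines
printf 'a\nb\n' > "$tmpdir/none.txt"            # no blank lines

some_count=$(count_blank "$tmpdir/some.txt")
none_count=$(count_blank "$tmpdir/none.txt")
echo "some.txt $some_count"
echo "none.txt $none_count"

rm -rf "$tmpdir"
```

The second file shows why a numeric format is useful in the END action: x is never assigned when no blank line is found, and %d prints it as 0.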

