shebang #! interpreter scripts

Many shell script files have "#!/bin/bash" as their first line. This is a manifestation of the "interpreter script" feature, which is described in the man page for the execve system call:

   Interpreter scripts
       An interpreter script is a text file that has execute permission enabled and whose  first  line
       is of the form:

           #! interpreter [optional-arg]

       The  interpreter  must  be a valid pathname for an executable which is not itself a script.  If
       the filename argument of execve() specifies an interpreter script,  then  interpreter  will  be
       invoked with the following arguments:

           interpreter [optional-arg] filename arg...

       where arg...  is the series of words pointed to by the argv argument of execve().

#!/bin/bash isn't the only thing you might see. Sometimes it's #!/bin/sh or #!/usr/bin/perl. It is always the pathname of a program. Which programs make sense for this usage? The man page snippet above tells us that running such a script file invokes the program it names and hands that program the script file's own name as an argument. So the program ought to be one that is designed to take a file as an argument. Many commands are, but many are not. One that is not is sleep. It takes an argument, the number of seconds to sleep, but it is not written to accept the name of a file and do anything with it. What happens when you deliberately provide a filename where it expects a number? Try it and observe the error message:

cd
touch myscript    # creates the empty file "myscript"
sleep myscript

Now, to appreciate the interpreter script calling mechanism, let's make an interpreter script out of myscript:

echo '#!/bin/sleep' > myscript
chmod +x myscript
./myscript

You get essentially the same error. Running the script constructed the same command you issued manually before, namely sleep with the script's name as its argument (this time spelled ./myscript, just as you typed it), with the same outcome. Now you know how the mechanism works: the command named on the file's first line is given the file's name as an argument, and that command line is then executed. Giving a filename argument to a command not meant for one is useless, but giving one to a command that expects one sounds promising, provided that what's in the script file is the kind of content the command is written to expect and process. In other words, the content must be "legal" as defined by the command.
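
The equivalence can be checked directly. The following is a minimal sketch, assuming sleep lives at /bin/sleep as in the example above and that working in a scratch directory from mktemp -d is acceptable; the file name viasleep and the variable names are made up for illustration. It captures the error text and exit status from both invocations so they can be compared.

```shell
# Work in a throwaway directory so nothing in $HOME is touched
# (assumption: mktemp is available).
demo=$(mktemp -d) && cd "$demo" || exit 1

# An interpreter script whose "interpreter" is sleep.
echo '#!/bin/sleep' > viasleep
chmod +x viasleep

# Run it as a script, then issue by hand the command that the #! line expands to.
script_status=0; manual_status=0
script_err=$(./viasleep 2>&1)            || script_status=$?
manual_err=$(/bin/sleep ./viasleep 2>&1) || manual_status=$?

echo "as a script: $script_err (exit status $script_status)"
echo "by hand:     $manual_err (exit status $manual_status)"
```

The exact wording of the message depends on which sleep is installed, so the sketch prints it rather than assuming it.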

A command that could take a filename legally is echo. echo doesn't consciously plan to process files, as do file-oriented utilities like rm, mv, cat, and sort for example. But echo does expect to handle text, and since a filename is perfectly good text echo will happily accept and print it. But it won't do anything to the thus-named file, nor even understand that it exists.

echo '#!/bin/echo' > myscript
./myscript

echo isn't limited to a single argument; it will take multiple words, so run myscript with a few of them.

./myscript mercury venus mars

Interpret the result you see in light of the above man page snippet, regarding what happens to "mercury venus mars". Now let's use a file-oriented command, rm:

echo '#!/bin/rm' > myscript
./myscript
./myscript

What happened the second time you ran myscript? Did something change?

These examples with sleep, echo, and rm are just to prove the workings of the interpreter script calling mechanism. A slightly more meaningful "interpreter" program might be cat:

echo '#!/bin/cat' > myscript
seq 1 5 | sed 's/^/line /' >> myscript    # appends 5 numbered lines to myscript
chmod +x myscript
./myscript

Here cat dumped out the whole myscript file, all 6 lines of it. Real-world programs that appear as script interpreters, bash most prominently, are written to ignore anything that starts with # (that's what "comment" means). cat has no such feature; if it did, it would have printed only 5 lines. So we are narrowing down the field of suitable script interpreters to programs that

- expect to process files
- overlook lines that start with #
- do something deliberate and meaningful with the files' content, as bash does with syntactically correct script files for example
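
The last two criteria can be seen side by side. The sketch below assumes /bin/cat and /bin/sh exist (as the examples above do) and uses the made-up file names with_cat and with_sh; it hands the same body to each program through the #! line.

```shell
demo=$(mktemp -d) && cd "$demo" || exit 1

# Two scripts with identical bodies; only the interpreter named on line 1 differs.
printf '#!/bin/cat\necho hello\n' > with_cat
printf '#!/bin/sh\necho hello\n'  > with_sh
chmod +x with_cat with_sh

cat_out=$(./with_cat)   # cat copies every line, including the #! line
sh_out=$(./with_sh)     # sh treats the #! line as a comment and runs the echo

printf 'cat printed:\n%s\n\nsh printed:\n%s\n' "$cat_out" "$sh_out"
```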

Other programs than bash that could appear would be other shells, so you might see

#!/bin/ksh

#!/bin/tcsh

where the content of the containing files would be shell script commands in the Korn shell or tcsh dialect. Other programs written specifically to be language interpreters are completely suitable too, so you will see:

#!/usr/bin/perl

#!/usr/bin/python

where the content found after the first line is written in Perl or Python. You might also see some rarer ones, like tcl, expect, or emacs. And you should count on seeing, and plan to write and use,

#!/bin/sed -f

#!/bin/gawk -f

(What is the significance of the -f's? What does -f mean to sed? To gawk? Find out from their man pages.)
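
Once you have looked up -f, a small experiment may help confirm your reading. The sketch below assumes sed is on the PATH and writes its actual location into the #! line; the script name shout and its one-line body are invented for illustration. It runs the file as an interpreter script and then issues by hand what the man page snippet says the equivalent command should be: interpreter, optional-arg, filename.

```shell
demo=$(mktemp -d) && cd "$demo" || exit 1

# Build "#!<path to sed> -f" using wherever sed really lives on this machine.
sed_path=$(command -v sed)
printf '#!%s -f\ns/line/LINE/\n' "$sed_path" > shout
chmod +x shout

# Run it as an interpreter script...
script_out=$(seq 1 3 | sed 's/^/line /' | ./shout)
# ...and run the command that the #! line expands to.
manual_out=$(seq 1 3 | sed 's/^/line /' | "$sed_path" -f ./shout)

printf 'as a script:\n%s\n\nby hand:\n%s\n' "$script_out" "$manual_out"
```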

Note that some of the programs lie in /bin, and some in /usr/bin. This is configuration and distribution dependent. It is not a universal convention and cannot be relied upon across all machines and environments. If you take a perl script from your machine, where the perl interpreter resides in /usr/bin, and move it to a machine where perl sits in /usr/local/bin instead, the script won't work.
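
A small check along these lines can tell you in advance whether the interpreter a script names is present on a given machine. This is only a sketch: the function name check_interp and the file names good and bad are made up, and the parsing assumes the simple "#! interpreter [optional-arg]" form described in the man page.

```shell
# check_interp FILE: report the interpreter named on FILE's #! line and
# return 0 if it exists and is executable here, 1 otherwise.
check_interp() {
    first_line=$(head -n 1 "$1")
    case $first_line in
        '#!'*) ;;                                  # looks like an interpreter script
        *) echo "$1: no #! line"; return 1 ;;
    esac
    # Drop "#!" and any blanks after it, then keep only the first word (the pathname).
    interp=$(printf '%s\n' "$first_line" | sed 's/^#! *//' | cut -d ' ' -f 1)
    if [ -x "$interp" ]; then
        echo "$1: interpreter $interp is present"
    else
        echo "$1: interpreter $interp is missing on this machine"
        return 1
    fi
}

# Try it on one script that should work and one that names a nonexistent path.
demo=$(mktemp -d) && cd "$demo" || exit 1
printf '#!/bin/sh\necho hi\n'              > good
printf '#!/no/such/dir/perl\nprint "hi"\n' > bad
check_interp good
check_interp bad || echo "(so ./bad could not be started here)"
```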

There is a command, env, that will find and execute a program wherever it might lie in the PATH. In the perl example, instead of having one script with #!/usr/bin/perl and another with #!/usr/local/bin/perl as the first line, you could have a single script that works on both machines with #!/usr/bin/env perl as the first line (provided env is indeed installed in /usr/bin on both, which is a good bet, since that is its conventional location; some systems also provide /bin/env). Script interpreters can be specified using env in the first line as a portability measure. Convert your script like this:

cat myscript
sed -i '1s=.*bin/\(.*\)=#!/usr/bin/env \1=' myscript
cat myscript

sed performs a search and replace on the first line. The regular expression that specifies the search target says: find the last bin/ in the first line and save what follows it (the command name). The replacement text is that command name, preceded by #!/usr/bin/env . Run the script and make sure it behaves as it did before (i.e., that it behaves the same now that the interpreter is discovered through the PATH as it did when the interpreter was hard-coded).

./myscript
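
Because env itself is reached through a hard-coded path on the #! line, it can be worth confirming where it lives before converting scripts. The sketch below assumes env and sh are both on the PATH; the file name viaenv is invented. It writes env's actual location into the #! line and lets env locate sh.

```shell
demo=$(mktemp -d) && cd "$demo" || exit 1

# Where is env on this machine? (Its location varies between systems.)
env_path=$(command -v env)
echo "env found at: $env_path"

# A script whose #! line names env, which in turn searches the PATH for sh.
printf '#!%s sh\necho "started via env"\n' "$env_path" > viaenv
chmod +x viaenv

env_out=$(./viaenv)
echo "$env_out"
```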