

Some initial concepts

So far we have seen how to perform several tasks, but only one at a time. There are commands to copy files, list them, connect to another machine, control processes, and so on. To automate a procedure, however, we often need to combine several of these simple tasks.

In bioinformatics, many tasks have to be repeated constantly. One example is updating local copies of sequence databases. Such tasks generally involve running more than one program. Continuing the example, after downloading the updated sequence databases one may need to run BLAST on locally generated sequences to check whether there is any new match. Linux offers a very easy way to create little ``programs'': shell scripts, which perform many shell commands in sequence. They are similar to the old ``.bat'' files of Windows. To create a shell script you only need to create a file and type, one per line, the Linux commands to be executed. When you have finished editing the file, save it and add ``execution'' permission to it. Now your file can be run like any normal Linux program. When you run it, all the commands are performed in the order you specified, so you do not need to wait for one process to terminate before issuing the next command.

This kind of execution is called batch processing, since the commands are submitted together as a batch, and it is very easy to set up. All you have to do is write a ``script'' in a text file and make it executable. We will see how to do that shortly.

The script is a list of command lines to be executed, one after another, just as if you were typing them on the terminal. There are also some special commands which allow you to control the execution of these commands, for example by testing results or iterating. We will see some of these control commands soon, but for now we will learn how to execute a script.
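As a brief preview only, here is a minimal sketch of the two kinds of control command mentioned above: testing the result of a command (if) and iterating (for). The scratch directory, the file names and the sequences are hypothetical, invented so that the sketch is self-contained.

```shell
# Hypothetical work area, created only so the sketch can run anywhere.
workdir=$(mktemp -d)
printf '>Slime mold\nACGTTA\n' > "$workdir/a.fasta"
printf '>Yeast\nGGCCAA\n' > "$workdir/b.fasta"

matches=0
# Iterating: repeat the same commands for every FASTA file in the work area.
for f in "$workdir"/*.fasta; do
    # Testing a result: grep -q prints nothing, but its exit status
    # tells "if" whether the pattern was found in the file.
    if grep -qi 'cgtta' "$f"; then
        echo "match in $(basename "$f")"
        matches=$((matches + 1))
    fi
done
echo "files with a match: $matches"
```

Only a.fasta contains the pattern, so the counter ends at one.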

Before we can actually execute a script, we first need to write one. Open your favorite text editor and write the following commands:

cp -r Test1_ Aux
cd Aux
diff ../Fasta_sample Fasta_last >delta
grep -i cgtta delta

An interesting feature is that you can add comments to your scripts: anything written after a pound sign (#) until the end of line is ignored (you may want to try it in the terminal, just for fun).

# make a backup copy
cp -r Test1_ Aux
#move into work subdirectory
cd Aux
#compare new data with the standard
diff ../Fasta_sample Fasta_last >delta
#and locate the species of interest
grep -i '>Slime'  delta
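To try the comment rule directly in the terminal, a minimal sketch follows. The variable names are illustrative only and are not part of the script above. It also shows one detail worth knowing: a pound sign inside quotes is ordinary text, not the start of a comment.

```shell
# This whole line is a comment: the shell skips it entirely.
first=kept        # this trailing comment is not part of the value
second="a # inside quotes is ordinary text"
echo "$first"
echo "$second"
```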

Save the commented script in the file myscript. To run it, just call bash with the file as its argument:

bash myscript

This is already nice, but we can make it better. If we change the permissions of the file myscript to allow execution, using chmod, we do not need to write bash every time:

chmod uo+x myscript
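./myscript

Now myscript works like any regular program on the system. The ./ prefix tells the shell to look for the file in the current directory, which is usually not in the command search path. The script runs because a text file with execution permission and no interpreter line is handed to the shell for interpretation: bash, in our case.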
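The difference between the two ways of running a script can be checked with a minimal sketch. The scratch directory and the file name hello_script are hypothetical, chosen so that the sketch does not touch your own files.

```shell
# Hypothetical scratch directory so the sketch does not touch real data.
demo=$(mktemp -d)

# Write a two-line script into the file hello_script (name is illustrative).
printf 'echo "running in batch"\necho "done"\n' > "$demo/hello_script"

# Without execution permission we must name the interpreter ourselves.
via_bash=$(bash "$demo/hello_script")

# Add execution permission for the file's owner, then call the file by its path.
chmod u+x "$demo/hello_script"
direct=$("$demo/hello_script")

echo "$direct"
```

Both ways of calling the script produce the same two lines of output. Only the second one needs the execution permission.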

We can also state explicitly which interpreter to use, in a special comment on the first line of our script. This line must start with the characters #!, followed by the path of the interpreter. Always use the full path, to avoid security breaches. This feature, called shebang (or sh-bang), is very useful when you want another program to interpret your script, as we shall see in chapter ???.

Our complete script, with this special line, is the following:

#!/bin/bash
# make a backup copy
cp -r Test1_ Aux
#move into work subdirectory
cd Aux
#compare new data with the standard
diff ../Fasta_sample Fasta_last >delta
#and locate the species of interest
grep -i '>Slime'  delta
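To watch the complete script run from start to finish, the following sketch builds a throwaway copy of the files it expects and then executes it. The directory layout is an assumption inferred from the script itself (Fasta_sample sits next to the directory Test1_, which contains Fasta_last), and the sequence data are invented for illustration.

```shell
# Hypothetical layout inferred from the script; all file contents are made up.
top=$(mktemp -d)
mkdir "$top/Test1_"
printf '>Yeast\nGGCCAA\n' > "$top/Fasta_sample"
printf '>Yeast\nGGCCAA\n>Slime mold\nACGTTA\n' > "$top/Test1_/Fasta_last"

# Save the script from the text under the name myscript.
# Quoting 'EOF' keeps the shell from expanding anything inside the text.
cat > "$top/myscript" <<'EOF'
#!/bin/bash
# make a backup copy
cp -r Test1_ Aux
#move into work subdirectory
cd Aux
#compare new data with the standard
diff ../Fasta_sample Fasta_last >delta
#and locate the species of interest
grep -i '>Slime'  delta
EOF
chmod u+x "$top/myscript"

# Run it from the directory that holds Test1_ and Fasta_sample.
result=$(cd "$top" && ./myscript)
echo "$result"
```

In its default output format, diff marks lines that appear only in the second file with a leading "> ", so the header found by grep is printed with that prefix in front of it.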

In the following we will see how to use some special features of bash to do more complex tasks within a script.



gubi
2006-01-18