Google Analytics


To search for specific articles you can use advanced Google features. Go to and enter "" before your search terms, e.g. CSS selectors

will search for "CSS selectors" but only on my site.

Thursday, July 12, 2007

Bourne shell scripting made easy

Someone was having trouble writing a shell script. A common activity for Bourne shell scripting is to take the output from various commands and use it as the input for other commands. Case in point, we have a server that monitors clients. Whenever we get new monitoring software we have to use the server command line tool to install the cartridge on the server, create the agents for each client, deploy the agents, configure them and activate them.

The general steps are:

1) get a list of the cartridges (ls)
2) using a tool, install them (
3) using the same, tool get a list of agents
4) using the tool, get a list of clients
5) using the tool, for each client create an instance of each agent
6) using the tool, for each agent created deploy to the client
7) using the tool, configure the agents
8) using the tool, activate the agents

Just looking at the first two steps, if I was doing this by hand I would use ls to get a list of all the cartridges. I would then cut and paste the cartridge named into a command to install them.

So a Bourne shell script should just cut the same things out of the ls list.

If the cartridge files all end with the extension .cart I can use:
ls -1 *.cart

If the command to install a cartridge was:
./ --install_cart [cartridge_name]

I could use:
for c in `ls -1 *.cart`; do
./ --install_cart $c

This is pretty easy and straight forward. What if the command was not as clean as ls? What is the list of agents was something like:
./ --list_agents
OS: Linux, Level: 2.4, Version: 3.8, Name: Disk
OS: Linux, Level: 2.4, Version: 3.8, Name: Kernel
OS: Windows, Level: 5.1, Version: 3.8, Name: System

To install the agent I only need the Name. If I only wanted the Linux agents, how would I get just the Name? First, you want to narrow it down to the lines you want:
./ --list_agents | grep "OS: Linux"

This will remove all the other agents from the list and give me:
OS: Linux, Level: 2.4, Version: 3.8, Name: Disk
OS: Linux, Level: 2.4, Version: 3.8, Name: Kernel

Now I need to parse each line. If I use the above command in a for loop I can start with:
for a in `./ --list_agents | grep "OS: Linux"`; do
echo $a

Now I can try adding to the backtick command to narrow things down. The two ways I like to parse a line is using awk or cut. For cut I could use:
for a in `./ --list_agents | grep "OS: Linux" | cut -d: -f5`; do
echo $a

This will break the line at the colon. The cut on the first line would give the fields:
  1. OS
  2. Linux, Level
  3. 2.4, Version
  4. 3.8, Name
  5. Disk

The problem is there is a space in front of Disk. I can add a cut -b2-, which will give me from character 2 to the end, i.e. cut off the first character. What if there is more than one space? This is why I like to use awk. For awk it would be:
for a in `./ --list_agents | grep "OS: Linux" | awk '{print $8}'`; do
echo $a

For awk the fields would become:
  1. OS:
  2. Linux,
  3. Level:
  4. 2.4,
  5. Version:
  6. 3.8,
  7. Name:
  8. Disk

The spaces would not be an issue.

So by using backticks, piping and grep I can break things apart into just the lines I want. Piping the result of grep to cut or awk to break the line apart and keep just the bits I want.

The only other command I like to use for parsing output like this is sed. I can use sed for things like:
cat file | sed -e '/^$/d'

The // is a regex pattern. The ^ means beginning of line. The $ means end of line. So ^$ would be a blank line. The d is for delete. This will delete blank lines.

Actually, lets give an example usage. I want to list all files in a given directory plus all subdirectories. I want the file size for each file. The ls -lR will give me a listing like:
total 4
drwxrwxrwx+ 2 Darrell None   0 Apr 19 14:56 ListCarFiles
drwxr-xr-x+ 2 Darrell None   0 May  7 21:58 bin
-rw-rw-rw-  1 Darrell None 631 Oct 17  2006 cvsroots

total 8
-rwxrwxrwx 1 Darrell None 2158 Mar 30 22:37 ListCarFiles.class
-rwxrwxrwx 1 Darrell None 1929 Mar 31 09:09

total 4
-rwxr-xr-x 1 Darrell None 823 May  7 21:58

To get rid of the blank likes I can use the sed -e '/^$/d'. To get rid of the path information I can use grep -v ":", assuming there are no colons in the filenames. To get rid of the directories I can use sed -e '/^d/d' because all directory lines start with a 'd'. So the whole thing looks like:
ls -lR | sed -e '/^$/d' -e '/^d/d' | grep -v ":"

But there is actually an easier answer. Rather than cutting out what I don't want, I can use sed to keep what I do want. The sed -n command will output nothing BUT if the script has a 'p' command it will print that. So I want to sed -n with the right 'p' commands. Here is the solution:
ls -lR | sed -n -e '/^-/p'

This is because all the files have '-' at the start of the line. This will output:
-rw-rw-rw-  1 Darrell None 631 Oct 17  2006 cvsroots
-rwxrwxrwx 1 Darrell None 2158 Mar 30 22:37 ListCarFiles.class
-rwxrwxrwx 1 Darrell None 1929 Mar 31 09:09
-rwxr-xr-x 1 Darrell None 823 May  7 21:58

I can now use awk to cut the file size out, i.e. awk '{print $5}'. So the whole command becomes:
ls -lR | sed -n -e '/^-/p' | awk '{print $5}'

If I want to add all the file sizes for a total I can use:
for fs in `ls -lR | sed -n -e '/^-/p' | awk '{print $5}'`; do:
TOTAL=`expr $TOTAL + $fs`
echo $TOTAL

The expr will let me do simple integer match with the output.

NOTE: you use use man to learn more about the various commands I've shown here:

  • man grep
  • man cut
  • man awk
  • man sed
  • man regex
  • man expr

The sed and awk commands are actually powerful enough to have entire chapters written on them. But the man page will get you started.

While you are at it, do a man man.


No comments: