Pub Talk Part III
'Get me two pints, waiter, I've got a lot to talk about!'
Working on chains
'No, I'm not going to put you to hard labor! I'm just going to talk about chains of characters, that is, strings.'
The cut command
'Let me first show you a very practical instruction: the cut command. It cuts a given piece out of each line of a file, and it may be used in two different ways.'
The cut command and the option -c
'With the option -c, the syntax of the command is the following:'
cut -c PosIni-PosFim [file]
'In which:'
PosIni = Initial Position
PosFim = Final position
$ cat numbers
1234567890
0987654321
1234554321
9876556789
$ cut -c1-5 numbers
12345
09876
12345
98765
$ cut -c-6 numbers
123456
098765
123455
987655
$ cut -c4- numbers
4567890
7654321
4554321
6556789
$ cut -c1,3,5,7,9 numbers
13579
08642
13542
97568
$ cut -c -3,5,8- numbers
1235890
0986321
1235321
9875789
'As you can see, there are four different syntaxes: in the first one (-c 1-5), I have specified a range; in the second one (-c -6), everything up to a position; in the third (-c 4-), everything from a position onward; and in the fourth (-c 1,3,5,7,9), a list of positions. The last one (-c -3,5,8-) was just to show that they can all be combined.'
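The same selectors can be tried on a single string instead of a file. This is only a minimal sketch: the variable names (record, first5, upto6, from4, mixed) are made up for the illustration.

```shell
# Hypothetical fixed-width record: the first line of the numbers file
record=1234567890

first5=$(printf '%s\n' "$record" | cut -c1-5)      # a range
upto6=$(printf '%s\n' "$record" | cut -c-6)        # everything up to position 6
from4=$(printf '%s\n' "$record" | cut -c4-)        # position 4 onward
mixed=$(printf '%s\n' "$record" | cut -c-3,5,8-)   # all the forms combined

printf '%s\n' "$first5" "$upto6" "$from4" "$mixed"
```

The four lines printed should match the first line of each output shown above.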
The cut command and the option -f
'Don't think we're finished. As you may have realized, this syntax is handy for files with fixed-width fields, but in practice most files have variable-width fields, each one ending with a delimiter character. Let's take a look at the file musics that we started in our last meeting.'
$ cat musics
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4
album 3^Musician5~Music5:Musician6~Music5
album 4^Musician7~Music7:Musician8~Music8
'So, the layout is the following one:'
album^Musician1~music1:...:Musician~musicn
'That means that the name of the album is separated from the rest of the record by a circumflex (or caret, ^). The rest of the record is made up of several groups, each one composed of the artist of a song and the song itself. Within a group, a tilde (~) separates the artist from the name of the song, and a colon (:) separates one group from the next.'
'So, in order to cut out the information about the second song of each album in the file musics, we must do the following:'
$ cut -f2 -d: musics
Musician2~Music2
Musician4~Music4
Musician6~Music5
Musician8~Music8
'That means that we have cut the second field (-f2) delimited (-d) by a colon (:). But if we wanted just the names of the performers, we would pipe the result into a second cut:'
$ cut -f2 -d: musics | cut -f1 -d~
Musician2
Musician4
Musician6
Musician8
'In order to understand it, let's use the first line of musics:'
$ head -1 musics
album 1^Musician1~Music1:Musician2~Music2
'Watch me now:'
Delimiter of the first cut (:)
album 1^Musician1~Music1:Musician2~Music2
'This way, in the first cut, the first field delimited (-d) by a colon (:) is album 1^Musician1~Music1 and the second one, the one that interests us, is Musician2~Music2.'
'Now let's see what happened in the second cut:'
New delimiter (~)
Musician2~Music2
'Now the first field delimited (-d) by a tilde (~) is the one that interests us: it is Musician2, and the second field is Music2.'
'Applying the same reasoning to the remaining lines of the file gives the answer we got.'
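The chaining of cut commands can be followed step by step on one record. This is a sketch under assumptions: the variable names (line, album, group, artist, song) are invented here, and the caret is used as a third delimiter to pick the album name, which the text does not do.

```shell
# First line of the musics file
line='album 1^Musician1~Music1:Musician2~Music2'

# Album name: first field when the caret is the delimiter
album=$(printf '%s\n' "$line" | cut -f1 -d'^')

# Second group, then its artist and its song
group=$(printf '%s\n' "$line" | cut -f2 -d:)
artist=$(printf '%s\n' "$group" | cut -f1 -d'~')
song=$(printf '%s\n' "$group" | cut -f2 -d'~')

printf '%s | %s | %s\n' "$album" "$artist" "$song"
```

The caret and the tilde are quoted because some shells give them a special meaning.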
If you cut, you paste
'As you may guess, the paste command pastes things. In the shell, however, what we paste is files. In order to understand it, let's see:'
paste file1 file2
'This way, the command will send the records from file1 and from file2 to the standard output (stdout). The records of both files will be arranged side by side, and if you do not define a delimiter, the default one, <TAB>, will be used.'
'Paste is a rarely used command, not because its syntax is hard, but because it is not well known. Let's play with two files:'
$ seq 10 > integer
$ seq 2 2 10 > even
'In order to check the content of the files, let's use the paste command in its most conventional way:'
$ paste integer even
1 2
2 4
3 6
4 8
5 10
6
7
8
9
10
Laying down
'Let's convert the column into a line:'
$ paste -s even
2 4 6 8 10
Using separators
'As we have said, <TAB> is the default separator, but it can be changed with the option -d. So, in order to build the sum of the contents of even, we would do the following:'
$ paste -s -d'+' even # could be: -sd'+'
2+4+6+8+10
'Afterwards, we would pipe that line into the calculator (bc), like this:'
$ paste -sd'+' even | bc
30
'So, the factorial of the number stored in $Num would be:'
$ seq $Num | paste -sd'*' | bc
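To see what paste actually hands over to the calculator, the expression can be kept in a variable first. In this sketch the value of Num is arbitrary, the name expr_str is invented, and shell arithmetic replaces bc so that the example runs even where bc is not installed.

```shell
Num=5   # hypothetical value chosen for the example

# Build the expression 1*2*3*4*5 (the dash makes paste read stdin explicitly)
expr_str=$(seq "$Num" | paste -sd'*' -)

# Evaluate it with shell arithmetic instead of bc
fact=$(( $expr_str ))

echo "$expr_str = $fact"
```

Shell arithmetic is limited to machine integers, so for large values of Num the bc version from the text remains the better choice.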
'With the paste command, it is also possible to employ some 'exotic' formats, like the following:'
$ ls | paste -s -d'\t\t\n'
file1 file2 file3
file4 file5 file6
'What has just happened was: with the option -s, the paste command joins all the input lines into a single one, inserting the separators of the -d list in turn (oh, yeah! The list may hold more than one separator, and paste cycles through it). Here they are a <TAB>, another <TAB> and an <ENTER>, so the output is presented in three columns.'
'Now that you got it, check how the same thing can be done, but in an easier (and less strange) way, using the same command, but with a different syntax:'
$ ls | paste - - -
file1 file2 file3
file4 file5 file6
'That happens because when we use a dash (-) in place of a file name, the paste command reads the standard input (stdin) instead. In our last example, the pipe (|) sent the standard output (stdout) of the ls command to the standard input of the paste command, and each of the three dashes took one line in turn. But take a look at the following example:'
$ cat file1
precedence
privilegious
proportional
$ cat file2
position
mary
motion
$ cut -c-3 file1 | paste -d "" - file2
preposition
primary
promotion
'In that case, the cut command returned the first three letters of each record of file1. The paste command was told not to use any separator (-d "") and to read the standard input (fed by the pipe) in the place of the dash (-), pasting each of those lines to the matching line of file2.'
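The behaviour of the dash can be checked without creating any files. A minimal sketch, using seq as a stand-in for real data; the variable names table and csv are invented.

```shell
# Six input lines spread over three columns: each dash takes one line in turn
table=$(seq 6 | paste - - -)

# The same layout with a comma instead of the default <TAB>
csv=$(seq 6 | paste -d, - - -)

printf '%s\n' "$table" "$csv"
```

Adding or removing a dash changes the number of columns.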
The tr command
'Another very interesting command is tr. It substitutes, compresses or removes characters. Its syntax follows the pattern:'
tr [options] string1 [string2]
'The tr command copies the text from the standard input to the standard output, replacing the occurrences of the characters of string1 with the corresponding characters of string2, or squeezing repeated occurrences of a character of string1 into just one, or removing the characters of string1.'
The main options are:
| Main options of the tr command |
| Option | Meaning |
| -d | removes the characters of string1 from the text |
| -s | compresses repeated occurrences of a character of string1 into one |
Changing characters with tr
'Let me show you a silly example first:'
$ echo silly | tr i a
sally
'That is, I have replaced the occurrences of i with a.'
'Suppose that at a certain point of my script the operator is asked to press y or n (yes or no), and the answer is stored in the variable $Resp. The content of that variable could be upper or lower case, so, in order to avoid many tests to find out whether it is N, n, Y or y, I simply do the following:'
$ Resp=$(echo $Resp | tr YN yn)
'and afterwards, you can be sure that the content of the variable will be either n or y.'
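The normalisation can be wrapped in a small function and tried with the four possible answers. This is just a sketch: the function name normalize_answer is hypothetical and not part of the original script.

```shell
# Hypothetical helper: turn Y/N into y/n and leave y/n untouched
normalize_answer() {
    printf '%s\n' "$1" | tr YN yn
}

for Resp in Y y N n
do
    Resp=$(normalize_answer "$Resp")
    echo "$Resp"
done
```

After the call, a single comparison against y or n is enough.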
'If my file FileIn is written entirely in lower-case letters and I wish to convert them into capital letters, what can I do?'
$ tr a-z A-Z < FileIn > /tmp/$$
$ mv -f /tmp/$$ FileIn
'Take a look: I have used the notation A-Z so that I would not need to write ABCDEF....YZ. Other notations that can be used are the so-called escape sequences (which are common to other languages, like C), whose meaning you'll see below:'
| Escape Sequences |
| Sequence | Meaning | Octal |
| \t | Tab | \011 |
| \n | New line | \012 |
| \v | Vertical Tab | \013 |
| \f | Form Feed | \014 |
| \r | Carriage Return <^M> | \015 |
| \\ | Backslash | \134 |
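A short sketch of how the escape sequences and their octal forms are used as tr arguments. The sample strings and the variable names (commas, octal, upper) are invented for the illustration.

```shell
# Tabs become commas: \t names the <TAB> character
commas=$(printf 'one\ttwo\tthree\n' | tr '\t' ',')

# The octal form \011 names the same character
octal=$(printf 'one\ttwo\tthree\n' | tr '\011' ',')

# Ranges work the same way: lower case to upper case
upper=$(printf 'shell\n' | tr a-z A-Z)

printf '%s\n' "$commas" "$octal" "$upper"
```

The single quotes keep the shell from touching the backslash, so tr receives the sequence intact.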
Removing characters with tr
'Let me tell you a tale: a student was quite mad at me, so he decided to make things harder for me, and in a practical exercise he handed in a script in which all the commands were separated by semicolons (do you remember I said that the semicolon is used to write several commands on the same line?).'
'I'll show you an example of such an aberration:'
$ cat confusion
echo Read an online shell book at http://www.julioneves.com > book;cat book;pwd;ls;rm -f trash 2>/dev/null;cd ~
'When the script was run, the answer was:'
$ confusion
Read an online shell book at http://www.julioneves.com
/home/jneves/LM
confusion book musexc musics musinc muslist number
'But, since I was supposed to grade the script, I had to evaluate it seriously, so, to understand what he had done, I called him and ran the following command in front of him:'
$ tr ";" "\n" < confusion
echo Read an online shell book at http://www.julioneves.com > book
cat book
pwd
ls
rm -f trash 2>/dev/null
cd ~
'I don't have to tell you how disappointed he was when I undid, in a few seconds, the joke he had spent hours on.'
'But pay attention! If I were using a Unix system (with ksh or sh), the command should be:'
$ tr ";" "\012" < confusion
Shrinking with tr
'See the difference between two executions of the date command (one I ran today and the other I ran two weeks ago):'
$ date # Today
Sun Sep 19 14:59:54 2004
$ date # Two weeks ago
Sun Sep  5 10:12:33 2004
'If I wanted to isolate the time, I would do the following:'
$ date | cut -f 4 -d ' '
14:59:54
'On the other hand, two weeks ago, the answer would be:'
$ date | cut -f 4 -d ' '
5
'But pay attention to the following detail:'
$ date # Two weeks ago
Sun Sep  5 10:12:33 2004
'As you can see, there are two blank spaces before the number 5 (the day). That ruins it all, because the third field is empty and the fourth is the day (5). The ideal would be to compress the successive blank spaces into just one, so that both outputs can be treated the same way. See how you can do it:'
$ date | tr -s " "
Sun Sep 5 10:12:33 2004
'You can see there are no longer two spaces. Now I can cut it:'
$ date | tr -s " " | cut -f 4 -d " "
10:12:33
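Since date always prints the current moment, the two cases can be compared with fixed strings instead. A minimal sketch: the function name get_time and the two variables are hypothetical, and the strings are the two date outputs quoted in the text.

```shell
# The second sample has a one-digit day, hence two spaces before the 5
today='Sun Sep 19 14:59:54 2004'
earlier='Sun Sep  5 10:12:33 2004'

# Hypothetical helper: squeeze runs of spaces, then take the fourth field
get_time() {
    printf '%s\n' "$1" | tr -s ' ' | cut -f4 -d' '
}

time_today=$(get_time "$today")
time_earlier=$(get_time "$earlier")
printf '%s\n' "$time_today" "$time_earlier"
```

Both calls return the time, whatever the number of digits in the day.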
'See how handy the shell can be? Now, take a look at the following file, which originally came from that operating system that is vulnerable to all sorts of viruses.'
$ cat -ve FileFromDOS.txt
This file^M$
was recorded by^M$
the Windows and^M$
downloaded by^M$
a badly done ftp.^M$
'Let me give you two tips:'
Tip #1 - The option -v of the cat command shows the invisible control characters with the notation ^L, in which ^ stands for the control key and L stands for the corresponding letter. The option -e shows the end of each line as a dollar sign ($).
Tip #2 - That happens because in Windows (or DOS) formatted files there is a carriage return (\r) and a line feed (\n) at the end of each record. In Linux formatted files, on the other hand, there is only a line feed at the end of each record.
'Let's clean the file now:'
$ tr -d '\r' < FileFromDOS.txt > /tmp/$$
$ mv -f /tmp/$$ FileFromDOS.txt
'Check now what's happened:'
$ cat -ve FileFromDOS.txt
This file$
was recorded by$
the Windows and$
downloaded by$
a badly done ftp.$
'The option -d of the tr command removes the specified character from the whole file. Thus, I have removed the unwanted characters, saving the text in a temporary file that afterwards replaced the original file.'
If I were using a Unix machine (with ksh or sh), the command should be:
$ tr -d '\015' < FileFromDOS.txt > /tmp/$$
$ mv -f /tmp/$$ FileFromDOS.txt

That happened because ftp was run in binary (or image) mode, which means no text interpretation. If the ascii option had been set before the file transfer, it wouldn't have happened.
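The clean-up can be rehearsed on a throwaway file before touching a real one. This is a sketch under assumptions: the file names dos.txt and unix.txt and the variables tmpdir and cr_left are invented, and mktemp is used for the scratch directory instead of /tmp/$$.

```shell
# Build a small DOS-style file: every line ends in \r\n
tmpdir=$(mktemp -d)
printf 'This file\r\nwas recorded by\r\n' > "$tmpdir/dos.txt"

# Remove the carriage returns into a new file, then replace the original
tr -d '\r' < "$tmpdir/dos.txt" > "$tmpdir/unix.txt"
mv -f "$tmpdir/unix.txt" "$tmpdir/dos.txt"

# Count the carriage returns that are left (-c keeps only the \r characters)
cr_left=$(tr -cd '\r' < "$tmpdir/dos.txt" | wc -c | tr -d ' ')
first_line=$(head -n 1 "$tmpdir/dos.txt")
echo "carriage returns left: $cr_left"

rm -rf "$tmpdir"
```

The temporary file is needed because redirecting the output of tr straight onto its own input file would empty the file before tr could read it.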
'Well, those hints are making me enjoy this shell stuff, but there are many things I still can't do.'
'Never mind! There are still many things for you to learn about shell programming. But you are ready to solve a lot of problems using what you've learned, as long as you adopt a "shell way of thinking". Are you able to make a script that tells me who has been logged in for more than one day on your server?'
'Surely not!! I would have to use conditional commands that I still don't know.'
'Why don't you change your way of thinking a little bit and come to the shell side of the force? Waiter, my pal, bring us some pints before we proceed...'
'Now that we have our pints, let's solve that problem. Pay attention to the who command:'
$ who
jneves pts/1 Sep 18 13:40
rtorres pts/0 Sep 20 07:01
rlegaria pts/1 Sep 20 08:19
lcarlos pts/3 Sep 20 10:01
'And also to the date command:'
$ date
Mon Sep 20 10:47:19 BRT 2004
'Now look: month and day are presented in the same format by both commands.'

Sometimes different commands present outputs in different languages. When that happens, you can do the following:
$ date
Mon Sep 20 10:47:19 BRT 2004
$ LANG=pt_BR date
Seg Set 20 10:47:19 BRT 2004
This way, you can bring a bit of uniformity to the languages employed (the original language of this page is Brazilian Portuguese).
'Well, if there is a record in the output of who in which we don't find today's date, that means the user has been logged in for more than one day (since nobody can have logged in tomorrow)... So, let's save the piece of data that interests us.'
$ Data=$(date | cut -c 5-10)
'I have used the construction $(...) so that the commands inside it are executed first and their output is assigned to the variable Data. Let's see how it works:'
$ echo $Data
Sep 20
'Sweet! Now what we should do is look for the records that do not contain that date in the output of the who command.'
'I see! Well, and since you've mentioned the action of searching, I'm thinking of grep. Am I right?'
'That is RIGHT! Very good! But I need to use grep with the option that makes it list only the records in which the string is not found. Any idea?'
'Well, yeah... hummm... is it -v?'
'In fact, it IS! You're getting good at it! So let's see:'
$ who | grep -v "$Data"
jneves pts/1 Sep 18 13:40
'And if I wanted something a little prettier, I would do the following:'
$ who | grep -v "$Data" | cut -f1 -d ' '
jneves
'See? No conditional was necessary. Especially when we consider that our conditional (if) does not test conditions, but instructions, as we shall see.'
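The whole pipeline can be replayed on the sample output from the text, so that the result does not depend on who is really logged in. In this sketch the variables who_sample and old_users are invented, and Data is set by hand to the value that date | cut -c 5-10 returned in the text. On systems where who prints its dates in another format, the pattern stored in Data would have to be built to match that format.

```shell
# Sample 'who' output copied from the text
who_sample='jneves pts/1 Sep 18 13:40
rtorres pts/0 Sep 20 07:01
rlegaria pts/1 Sep 20 08:19
lcarlos pts/3 Sep 20 10:01'

Data='Sep 20'   # stands for $(date | cut -c 5-10) on the day of the example

# Lines without today's date, reduced to the user name
old_users=$(printf '%s\n' "$who_sample" | grep -v "$Data" | cut -f1 -d' ')
echo "$old_users"
```

Only jneves logged in on a different day, so only that name is printed.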
Conditional Commands
'Check the lines below:'
$ ls musics
musics
$ echo $?
0
$ ls FileThatDontExists
ls: FileThatDontExists: No such file or directory
$ echo $?
1
$ who | grep jneves
jneves pts/1 Sep 18 13:40 (10.2.4.144)
$ echo $?
0
$ who | grep juliana
$ echo $?
1
'What does that $? do? It looks like a variable, is it?'
'Yes, it is a variable that contains the return code of the last instruction run. I can assure you that if this instruction succeeded, $? equals zero; otherwise, it will be different from zero.'
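The return code can also be stored in a variable for later use. A minimal sketch with a throwaway file; the names tmpfile, found and missing are invented, and the || form is used only so that the example also survives in shells running with the "exit on error" setting.

```shell
tmpfile=$(mktemp)
printf 'jneves\nrtorres\n' > "$tmpfile"

found=0
grep -q jneves "$tmpfile" || found=$?      # stays 0: the string is there

missing=0
grep -q juliana "$tmpfile" || missing=$?   # becomes 1: nothing matched

echo "found=$found missing=$missing"
rm -f "$tmpfile"
```

The value has to be copied at once, because the very next command overwrites $?.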
The if command
'The if command tests the return code ($?) of a command. Its syntax is:'
if cmd
then
    cmd1
    cmd2
    ...
    cmdn
else
    cmd3
    cmd4
    ...
    cmdm
fi
'That means: if the cmd command has been executed successfully, the commands of the then block (cmd1, cmd2, ... and cmdn) will be executed. Otherwise, the commands of the optional else block (cmd3, cmd4, ... and cmdm) will be executed. The whole construction is closed by a fi.'
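A small runnable instance of that syntax, in which grep -q plays the part of cmd. The function name check, the temporary file and its two lines are all assumptions made for the illustration.

```shell
tmpfile=$(mktemp)
printf 'album 1\nalbum 2\n' > "$tmpfile"

# Hypothetical helper: reports whether a whole line is present in the file
check() {
    if grep -q "^$1\$" "$tmpfile"
    then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

first=$(check 'album 1')
second=$(check 'album 9')
printf '%s\n' "$first" "$second"
rm -f "$tmpfile"
```

No comparison operator appears anywhere: the if only looks at whether grep succeeded.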
'Let's see how it works, using a small script that adds users to /etc/passwd:'
$ cat incusu
#!/bin/bash
# Version 1
if grep ^$1 /etc/passwd
then
    echo User \'$1\' already exists
else
    if useradd $1
    then
        echo User \'$1\' included at /etc/passwd
    else
        echo "We have problems. Are you root?"
    fi
fi
'Notice that the if command tests the grep command directly (that's its purpose). If the grep succeeds (in the case above, that means: if the user whose name is in $1 is found in /etc/passwd), the then block is executed (in this example, only the echo command). Otherwise, the instructions of the else block are executed, where a new if tests whether the useradd command manages to add the user $1 to /etc/passwd; if it does not, an error message is shown asking whether the guy is root.'
'Let's test the script, first trying to execute it with a pre-existing user:'
$ incusu jneves
jneves:x:54002:1001:Julio Neves:/home/jneves:/bin/bash
User 'jneves' already exists
'As we've seen a few times, an undesirable line was added by the output of the grep command. In order to avoid that problem, we should redirect the output of that command to /dev/null, like this:'
$ cat incusu
#!/bin/bash
# Version 2
if grep ^$1 /etc/passwd > /dev/null # or: if grep -q ^$1 /etc/passwd
then
    echo User \'$1\' already exists
else
    if useradd $1
    then
        echo User \'$1\' included at /etc/passwd
    else
        echo "We have problems. Are you root?"
    fi
fi
'Now, let a normal user (not root) test it:'
$ incusu JohnNobody
./incusu[6]: useradd: not found
We have problems. Are you root?
'Wow... that error message was not supposed to appear! In order to avoid it, let's redirect the useradd error output to /dev/null, like this:'
$ cat incusu
#!/bin/bash
# Version 3
if grep ^$1 /etc/passwd > /dev/null # or: if grep -q ^$1 /etc/passwd
then
    echo User \'$1\' already exists
else
    if useradd $1 2> /dev/null
    then
        echo User \'$1\' included at /etc/passwd
    else
        echo "We have problems. Are you root?"
    fi
fi
'After making those changes and executing a su - (becoming root), let's see how it works:'
$ incusu xalaskero
User 'xalaskero' included at /etc/passwd
'Once again:'
$ incusu xalaskero
User 'xalaskero' already exists
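One detail is worth a try: grep ^$1 matches any line that begins with the given text, so a name like jn would also be reported as existing because of jneves. The sketch below is a variation, not the script from the text: it adds the colon that ends the login field to the pattern and runs against a sample file in /etc/passwd format. The function name user_exists, the variable names and the entry for ana are hypothetical.

```shell
# Sample file in /etc/passwd format (the second entry is made up)
tmpfile=$(mktemp)
printf '%s\n' \
    'jneves:x:54002:1001:Julio Neves:/home/jneves:/bin/bash' \
    'ana:x:54003:1001:Ana:/home/ana:/bin/bash' > "$tmpfile"

# Hypothetical helper: is there a user with exactly this login name?
user_exists() {
    grep -q "^$1:" "$tmpfile"
}

report=$(
    for name in jneves jn ana
    do
        if user_exists "$name"
        then echo "User '$name' already exists"
        else echo "User '$name' not found"
        fi
    done
)
echo "$report"
rm -f "$tmpfile"
```

With the colon in place, jn is no longer mistaken for jneves.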
'See? As I told you, as long as we talk and drink, our programming skills are improving. Let's see how we can enhance our music software:'
$ cat musinc
#!/bin/bash
# Include Musics (version 3)
#
if grep "^$1$" musics > /dev/null
then
    echo This album is already registered
else
    echo $1 >> musics
    sort musics -o musics
fi
'It's an evolution of the previous version, see? Instead of simply adding a record (which could be duplicated in the previous version), we now test whether an existing record matches the given parameter ($1) from its beginning (^) to its end ($). The ^ before the string and the $ after it make sure that the parameter is compared against the whole of a previously registered record, and not just a part of it.'
'Let's run it now, giving it a previously registered album:'
$ musinc "album 4^Musician7~Music7:Musician8~Music8"
This album is already registered
'And now one that is not registered yet:'
$ musinc "album 5^Musician9~Music9:Musician10~Music10"
$ cat musics
album 1^Musician1~Music1:Musician2~Music2
album 2^Musician3~Music3:Musician4~Music4
album 3^Musician5~Music5:Musician6~Music5
album 4^Musician7~Music7:Musician8~Music8
album 5^Musician9~Music9:Musician10~Music10
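Because grep reads the parameter as a regular expression, characters such as the dot or the asterisk in an album name would be treated as wildcards. The sketch below is one possible variation, not the author's script: it uses the grep options -F (fixed string) and -x (whole line) in place of the ^ and $ anchors, and works on a scratch copy of the file. The function name add_album, the variable names and the record for album 0 are hypothetical.

```shell
tmpdir=$(mktemp -d)
musics="$tmpdir/musics"          # stand-in for the musics file
printf '%s\n' 'album 1^Musician1~Music1:Musician2~Music2' > "$musics"

# Hypothetical function mirroring musinc, with a literal whole-line match
add_album() {
    if grep -Fxq -e "$1" "$musics"
    then
        echo "This album is already registered"
    else
        echo "$1" >> "$musics"
        sort -o "$musics" "$musics"
    fi
}

msg=$(add_album 'album 1^Musician1~Music1:Musician2~Music2')
add_album 'album 0^Musician9~Music9:Musician10~Music10'

count=$(wc -l < "$musics" | tr -d ' ')
first_line=$(head -n 1 "$musics")
echo "$msg"
echo "$count lines, first: $first_line"
rm -rf "$tmpdir"
```

The first call finds the record and refuses it; the second one adds a new record, and the sort puts it at the top of the file.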
'As you've seen, our software is slowly improving, and it will get even better as we go through these shell classes.'
'I've got all that you said, but I still don't get how I can use an if to test conditions, which I think should be the main function of the command.'
'Dude, that's what the test command is for: to test conditions. The if command, in turn, tests the test command. But talking about it now would be way too complicated and, moreover, I'm really thirsty. Let's have some beer, and next time I'll tell you about test and other if syntaxes.'
'Deal! Especially because I'm getting dizzy with this amount of information, and it will give me some time to practice.'
'Why don't you write a little script that tells whether a user is logged on or not? Meanwhile: WAITER! Two more pints, please...'
--
JulioNeves - 01 Aug 2006

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.