Unix operating system: the shell's interpretive cycle, wildcards, redirecting files, connecting commands with a pipe ( | ), grep commands
The shell’s interpretive cycle :
- The shell issues the prompt and waits for you to enter a command.
- After the command is entered, the shell scans the command line for metacharacters and expands abbreviations (like the * in rm *) to recreate a simplified command line.
- It then passes on the command line to the kernel for execution.
- The shell waits for the command to complete and normally can’t do any work while the command is running.
- After the command execution is complete, the prompt reappears and the shell returns to its waiting role to start the next cycle. Now the user is free to enter another command.
Wild cards :
- Wild cards are the set of characters that the shell uses to match filenames.
- The shell will expand all the wild cards before passing the command to the kernel for execution.
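That the expansion happens in the shell, and not in the command, can be checked with a minimal sketch like the one below. The helper count_args and the scratch directory are illustrative assumptions, not part of these notes; the helper only reports how many arguments it actually received.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch chap01 chap02 chap03

# Hypothetical helper: prints the number of arguments it was given.
count_args() {
    echo "$#"
}

expanded=$(count_args chap*)    # the shell replaces chap* with three filenames
literal=$(count_args 'chap*')   # quoting stops the expansion: one argument

echo "unquoted chap*: $expanded argument(s)"
echo "quoted 'chap*': $literal argument(s)"

cd / && rm -rf "$workdir"
```

The command never sees the * in the unquoted case; it only sees the filenames the shell substituted for it.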
The * and ?
- The * matches any number of characters (including none), while ? matches exactly one character. The command ls chap* lists all filenames beginning with chap.
$ ls chap*
chap chap01 chap02 chap03 chap04 chap15 chap16 chap17 chapx chapy chapz
$ ls chap?
chapx chapy chapz
(Note: * and ? don’t match a leading . (dot) in a filename or the / of a pathname.)
The character class [ ] :
- The character class allows the user to frame more restrictive patterns.
- The character class comprises a set of characters enclosed by square brackets, [ and ], but it matches only a single character in the class.
$ ls chap0[124]
chap01 chap02 chap04
Range specification is also possible inside the class with a - (hyphen); the two characters on either side of it form the range of characters to be matched.
$ ls chap0[1-4]
chap01 chap02 chap03 chap04
$ ls chap[x-z]
chapx chapy chapz
Negating the character class (!) :
- You can use ! as the first character in the class to negate the class.
- *.[!co] Matches all filenames with a single character extension but not the .c or .o files
- [!a-zA-Z]* Matches all the filenames that don’t begin with an alphabetic character.
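The patterns discussed so far can be tried safely in a scratch directory. The sketch below is only an illustration: the filenames mirror the chap examples above, while prog.c, prog.o and prog.h are assumed names added to exercise the negated class.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch chap chap01 chap02 chap03 chap04 chap15 chapx chapy chapz
touch prog.c prog.o prog.h

single=$(echo chap?)        # exactly one character after chap
class=$(echo chap0[124])    # one character out of 1, 2 or 4
range=$(echo chap0[1-4])    # one character in the range 1 to 4
negated=$(echo *.[!co])     # one-character extension other than c or o

printf '%s\n' "chap?      -> $single"
printf '%s\n' "chap0[124] -> $class"
printf '%s\n' "chap0[1-4] -> $range"
printf '%s\n' "*.[!co]    -> $negated"

cd / && rm -rf "$workdir"
```

Using echo with a pattern is a harmless way to preview what the shell will expand before running a destructive command such as rm on the same pattern.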
Matching totally dissimilar patterns { } :
- The dissimilar patterns should be written within curly braces, { and }, separated by commas with no spaces around them.
- {c,java} Matches either c or java.
- {include,bin,lib} Matches either include, bin or lib.
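A short sketch of the brace form follows. Brace expansion is not part of the POSIX sh specification, so the example assumes bash is installed and invokes it explicitly; the names prog and /usr are only illustrative. Unlike * and ?, the braces generate strings whether or not matching files exist.

```shell
# Brace expansion is provided by bash (and ksh, csh), so bash is called explicitly.
sources=$(bash -c 'echo prog.{c,java}')
dirs=$(bash -c 'echo /usr/{include,bin,lib}')

# A space after the comma splits the word, so no expansion takes place.
spaced=$(bash -c 'echo {c, java}')

printf '%s\n' "$sources"
printf '%s\n' "$dirs"
printf '%s\n' "$spaced"
```

The last line shows why the comma-separated list must be written without spaces.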
Removing the special meanings of wild cards:
Escaping:
- Placing a \ (backslash) before a wildcard removes (escapes) its special meaning.
- For instance, in the pattern \*, the \ tells the shell that the asterisk has to be matched literally instead of being interpreted as a metacharacter.
$ rm chap\*
(doesn’t remove chap1 or chap2, only a file literally named chap*)
The \ suppresses the wild-card nature of the *, thus preventing the shell from performing the filename expansion on it.
$ ls chap0\[1-3\]
chap0[1-3]
Escaping the space
$ rm My\ Document.doc
(without the \, rm would see two files)
Escaping the \ itself :
- Sometimes we may need to interpret the \ itself literally. You need another \ before it, that’s all:
$ echo \\
\
$ echo the newline character is \\n
the newline character is \n
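The escaping rules above can be verified with a minimal sketch such as the following. The scratch directory and its files are assumptions made for the illustration. printf is used for output instead of echo, because some echo implementations interpret backslash sequences themselves and would blur what the shell did.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch chap1 chap2 'chap*' 'My Document.doc'

rm chap\*              # removes only the file literally named chap*
rm My\ Document.doc    # the escaped space keeps this a single argument

remaining=$(echo *)
backslash=$(printf '%s\n' \\)    # \\ reaches printf as one backslash

printf '%s\n' "remaining files: $remaining"
printf '%s\n' "literal backslash: $backslash"

cd / && rm -rf "$workdir"
```

chap1 and chap2 survive because the escaped * was never expanded.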
Quoting:
- Enclosing the wild card, or even the entire pattern, within quotes also removes its special meaning; anything within the quotes is left alone by the shell and not interpreted.
- The following example shows the protection of the four special characters using single quotes:
$ echo 'the characters |, <, > and $ are also special'
the characters |, <, > and $ are also special
- Single quotes protect all special characters ( except single quotes ).
- Double quotes are more permissive; apart from the double quote itself, they don’t protect the $ and the ` (backquote).
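The difference between the two kinds of quotes can be seen in a small sketch. The variable user_name is a hypothetical name introduced only for this illustration.

```shell
user_name=guest    # hypothetical variable, used only here

# Single quotes: $ and the backquote are left alone.
single='$user_name and `echo hi` stay literal, as do |, < and >'

# Double quotes: $ and the backquote are still interpreted.
double="$user_name and `echo hi` are expanded, but |, < and > are not"

printf '%s\n' "$single"
printf '%s\n' "$double"
```

In the second string the shell substitutes the variable and runs the command in backquotes, while |, < and > remain ordinary characters in both.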
Redirection: The Three Standard Files :
- The shell associates three files with the terminal: two for the display and one for the keyboard.
- A command performs all its terminal-related activities with these three files that are provided by the shell.
- These special files are actually streams of characters which many commands see as input and output. A stream is simply a sequence of bytes.
Standard Input:
- The file (or stream) representing input, which is connected to the keyboard.
- When commands are used without filename arguments, they read the file representing the standard input. This file is indeed special; it can represent three input sources:
- The keyboard, the default source.
- A file using redirection with the < symbol ( a metacharacter ).
- Another program using a pipeline.
Redirection ( < ) :
- The shell can reassign the standard input file to a disk file. This means it can redirect the standard input to originate from a file on disk. This reassignment or redirection requires the < symbol:
$ wc < sample.txt
3 14 71
- The wc command didn’t open sample.txt. It read the standard input file as a stream, but only after the shell reassigned this stream to a disk file (sample.txt).
$ bc < math.txt
- The bc command reads its input from the standard input, which the shell has assigned to math.txt using the < symbol.
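A quick way to confirm that the command never sees the filename under redirection is to compare the two forms side by side. This is only a sketch: the contents of sample.txt below are made up, and tr is used to strip the padding that some wc implementations add.

```shell
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf 'one two\nthree four five\nsix\n' > "$sample"

# wc opens the file itself, so it knows the name and prints it.
with_name=$(wc -l "$sample")

# The shell opens the file and attaches it to standard input;
# wc reads a nameless stream and prints only the count.
from_stdin=$(wc -l < "$sample")
lines=$(printf '%s\n' "$from_stdin" | tr -d ' ')

printf '%s\n' "as argument: $with_name"
printf '%s\n' "redirected:  $lines"

rm -rf "$workdir"
```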
Standard output:
- All commands displaying output on the terminal actually write to the standard output file as a stream of characters. There are three possible destinations for this stream:
- The terminal, the default destination.
- A file, using redirection with the > and >> symbols (metacharacters).
- Another program, as its input, using a pipeline.
Redirection ( > and >>) :
$ wc sample.txt > newfile
$ cat newfile
3 14 71 sample.txt
- The first command sends the word count of sample.txt to newfile; nothing appears on the terminal screen. If the output file doesn’t exist, the shell creates it before executing the command.
- If it exists, the shell overwrites it.
- The shell also provides >> symbol to append to a file ( prevents overwriting ).
$ wc sample.txt >> newfile
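The overwrite and append behaviour can be checked with a minimal sketch. The file name newfile follows the example above; the three words written to it are arbitrary.

```shell
workdir=$(mktemp -d)
out="$workdir/newfile"

echo "first"  > "$out"     # the shell creates the file
echo "second" > "$out"     # the shell overwrites it: "first" is gone
echo "third" >> "$out"     # appended after "second"

contents=$(cat "$out")
line_count=$(wc -l < "$out" | tr -d ' ')

printf '%s\n' "$contents"
printf '%s\n' "lines in newfile: $line_count"

rm -rf "$workdir"
```

Only two lines remain, because the second > replaced the file's earlier contents while >> added to them.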
Standard error :
- Each of the three standard files is represented by a number, called a file descriptor.
- A file is opened by referring to its pathname, but subsequent read and write operations identify the file by this file descriptor.
0 - Standard input
1 - Standard output
2 - Standard error
- These descriptors are implicitly prefixed to the redirection symbols. For instance > and 1> mean the same thing to the shell, while < and 0< are identical.
- We need to explicitly use one of these descriptors for handling the standard error stream.
- Redirecting the standard error requires the use of the 2> symbol:
$ cat foo 2> errorfile
$ cat errorfile
cat: cannot open foo
- You can append standard error to a file :
$ cat foo 2>> errorfile
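The following sketch separates the two output streams of a failing command. The path stored in missing is a hypothetical file that is deliberately never created, and the exact wording of the error message differs between systems, so only the number of lines is examined.

```shell
workdir=$(mktemp -d)
errfile="$workdir/errorfile"
missing="$workdir/foo"     # hypothetical file, deliberately never created

# $( ) captures standard output (descriptor 1); 2> sends
# standard error (descriptor 2) to errorfile instead.
stdout_text=$(cat "$missing" 2> "$errfile") || true   # cat fails by design

# A second failure is appended with 2>> rather than overwriting.
cat "$missing" 2>> "$errfile" || true

error_lines=$(wc -l < "$errfile" | tr -d ' ')

printf '%s\n' "captured standard output: '$stdout_text'"
printf '%s\n' "lines in errorfile: $error_lines"

rm -rf "$workdir"
```

Nothing reaches standard output, while both diagnostics end up in errorfile.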
Connecting commands: Pipe ( | ) :
- A pipe connects the standard output stream of one command to the standard input stream of another, so that one command can take its input from another.
$ who | wc -l
5
(counts the number of online users)
- Here the shell connects the who’s standard output to wc’s standard input using a special operator called pipe ( | ).
- The output of who has been passed directly to the input of wc, and who is said to be piped to wc.
- When multiple commands are connected this way, a pipeline is said to be formed.
- It’s the shell that sets up the connection; the commands have no knowledge of it.
- One can count the number of files in a directory by combining the ls and wc -w commands using a pipe.
$ ls | wc -w
15
(counts the number of files and subdirectories)
- There is no restriction on the number of commands you can use in a pipeline.
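As a sketch of a longer pipeline, the example below chains three and then four commands in a scratch directory whose contents are assumed for the illustration. It uses wc -l rather than wc -w, a deliberate substitution: ls writes one name per line when its output is a pipe, so a filename containing a space is still counted once.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt 'c d.txt'
mkdir sub

# ls writes the names, wc -l counts the lines, tr strips any padding.
entries=$(ls | wc -l | tr -d ' ')

# One more stage: keep only the names ending in txt before counting.
txt_files=$(ls | grep 'txt$' | wc -l | tr -d ' ')

printf '%s\n' "entries: $entries"
printf '%s\n' "txt files: $txt_files"

cd / && rm -rf "$workdir"
```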
Example Database:
- Several UNIX commands for text editing and shell programming are illustrated with a sample file, emp.lst. Each line of this file has six fields separated by five delimiters.
- The details of an employee are stored in one single line. This text file is designed in a fixed format and contains a personnel database. There are 15 lines, where each field is separated by the delimiter |.
$ cat emp.lst
2233 | a.k.shukla | g.m | sales | 12/12/52 | 6000
9876 | jai Sharma | director | production | 12/03/50 | 7000
5678 | sumit chakrobarty | d.g.m. | marketing | 19/04/43 | 6000
2365 | barun sengupta | director | personnel | 11/05/47 | 7800
5423 | n.k.gupta | chairman | admin | 30/08/56 | 5400
1006 | chanchal singhvi | director | sales | 03/09/38 | 6700
6213 | karuna ganguly | g.m. | accounts | 05/06/62 | 6300
1265 | s.n. dasgupta | manager | sales | 12/09/63 | 5600
4290 | jayant choudhury | executive | production | 07/09/50 | 6000
2476 | anil aggarwal | manager | sales | 01/05/59 | 5000
6521 | lalit chowdury | director | marketing | 26/09/45 | 8200
3212 | shyam saksena | d.g.m. | accounts | 12/12/55 | 6000
3564 | sudhir agarwal | executive | personnel | 06/07/47 | 7500
2345 | j. b. sexena | g.m. | marketing | 12/03/45 | 8000
0110 | v.k.agrawal | g.m. | marketing | 31/12/40 | 9000
grep – searching for a pattern :
- It scans the file / input for a pattern and displays lines containing the pattern, the line numbers or filenames where the pattern occurs.
- It’s a command from a special family in UNIX for handling search requirements.
grep options pattern filename(s)
grep "sales" emp.lst
- This displays the lines containing sales from the file emp.lst. The pattern can be given with or without quotes, but it’s generally safe to quote it.
- Quoting is mandatory when the pattern involves more than one word. If the pattern can’t be located, grep simply returns the prompt.
grep president emp.lst
- When grep is used with multiple filenames, it displays the filenames along with the output.
grep "director" emp1.lst emp2.lst
- Each line of output shows the filename followed by the matching line.
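The basic behaviour described above can be tried on a small sample. In this sketch the contents of emp1.lst and emp2.lst are assumptions: four records borrowed from emp.lst and split across two files. The exit status of grep is used to detect a pattern that isn't found.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative split of a few emp.lst records into two files.
cat > emp1.lst <<'EOF'
2233 | a.k.shukla | g.m | sales | 12/12/52 | 6000
9876 | jai Sharma | director | production | 12/03/50 | 7000
EOF
cat > emp2.lst <<'EOF'
2365 | barun sengupta | director | personnel | 11/05/47 | 7800
1265 | s.n. dasgupta | manager | sales | 12/09/63 | 5600
EOF

one_file=$(grep "sales" emp1.lst)                 # matching line only
two_words=$(grep "jai Sharma" emp1.lst)           # quotes keep the pattern whole
two_files=$(grep "director" emp1.lst emp2.lst)    # each line prefixed by filename

# No output for a missing pattern; the exit status tells the story.
if grep "president" emp1.lst; then found=yes; else found=no; fi

printf '%s\n' "$one_file" "$two_words" "$two_files"
printf '%s\n' "president found: $found"

cd / && rm -rf "$workdir"
```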
grep options :
- grep is one of the most important UNIX commands, and we must know the options that POSIX requires grep to support. Linux supports all of these options.
-i       ignores case for matching
-v       doesn’t display lines matching expression
-n       displays line numbers along with lines
-c       displays count of lines containing the pattern
-l       displays list of filenames only
-e exp   specifies expression with this option
-x       matches pattern with entire line
-f file  takes patterns from file, one per line
-E       treats pattern as an extended RE
-F       matches multiple fixed strings
EXAMPLE :
grep -n 'marketing' emp.lst
grep -c 'director' emp.lst
grep -c 'director' emp*.lst
- will print each filename prefixed to its line count
grep -l 'manager' *.lst
- will display filenames only
grep -e 'Agarwal' -e 'aggarwal' -e 'agrawal' emp.lst
- will print lines matching any of the multiple patterns
grep -f pattern.lst emp.lst
- all the above three patterns are stored in a separate file, pattern.lst
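The options can be exercised on a reduced copy of the database. The sketch below assumes a five-line emp.lst taken from the listing earlier; the patterns are written in lowercase to match that sample data, which is a choice made for this illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > emp.lst <<'EOF'
9876 | jai Sharma | director | production | 12/03/50 | 7000
2365 | barun sengupta | director | personnel | 11/05/47 | 7800
2476 | anil aggarwal | manager | sales | 01/05/59 | 5000
3564 | sudhir agarwal | executive | personnel | 06/07/47 | 7500
0110 | v.k.agrawal | g.m. | marketing | 31/12/40 | 9000
EOF

directors=$(grep -c 'director' emp.lst)     # number of matching lines
others=$(grep -vc 'director' emp.lst)       # lines that do NOT match
numbered=$(grep -n 'marketing' emp.lst)     # line number, colon, line
nocase=$(grep -ic 'SHARMA' emp.lst)         # case ignored
holders=$(grep -l 'manager' *.lst)          # filenames only

# Three patterns given with -e, then the same three read from a file.
by_e=$(grep -c -e 'agarwal' -e 'aggarwal' -e 'agrawal' emp.lst)
printf 'agarwal\naggarwal\nagrawal\n' > pattern.lst
by_f=$(grep -c -f pattern.lst emp.lst)

printf '%s\n' "directors: $directors, others: $others, sharma: $nocase"
printf '%s\n' "$numbered"
printf '%s\n' "files with manager: $holders"
printf '%s\n' "with -e: $by_e, with -f: $by_f"

cd / && rm -rf "$workdir"
```

Both the -e and -f forms report the same count, since pattern.lst holds the same three patterns, one per line.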