L5_regular expression command for linux unix
Regular expressions
 Used by several different UNIX commands, including
ed, sed, awk, grep
 A period ‘.’ matches any single character
 .X. matches any X that is surrounded by any two
characters
 Caret character ^ matches the beginning of the line
 ^Bridgeport matches the characters Bridgeport only if
they occur at the beginning of the line
Regular expressions (continued)
 A dollar sign ‘$’ is used to match the end of the line
 Bridgeport$ will match the characters Bridgeport only if
they are the very last characters on the line
 .$ matches any single character at the end of the line
 To match a special character such as the period literally,
precede it with a backslash ‘\’ to remove its special
meaning
 \.$ matches any line that ends with a period
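The anchors and the period can be tried out with grep. This is a minimal sketch on made-up sample lines (the file contents are an assumption, not data from the slides); `grep -c` prints the number of matching lines.

```shell
# Hypothetical sample input, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'Bridgeport is a city' 'I live in Bridgeport' 'The end.' 'aXb' > "$sample"

at_start=$(grep -c '^Bridgeport' "$sample")   # lines that begin with Bridgeport
at_end=$(grep -c 'Bridgeport$' "$sample")     # lines that end with Bridgeport
any_last=$(grep -c '.$' "$sample")            # lines that end with any character
period_last=$(grep -c '\.$' "$sample")        # lines that end with a literal period
x_between=$(grep -c '.X.' "$sample")          # an X with one character on each side

echo "$at_start $at_end $any_last $period_last $x_between"
rm -f "$sample"
```

Only the escaped form `\.$` singles out the line that really ends with a period; the unescaped `.$` matches every non-empty line.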
Regular expressions (continued)
 ^$ matches any line that contains no characters
 […] is used to match any single character enclosed in […]
 [tT]he matches a lowercase or uppercase t followed
immediately by the characters he
 [A-Z] matches any uppercase letter
 [A-Za-z] matches any uppercase or lowercase letter
 [^A-Z] matches any character except an uppercase letter
 [^A-Za-z] matches any nonalphabetic character
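A short sketch of the bracket expressions above, using a hypothetical word list. `LC_ALL=C` is set as an assumption so that ranges such as `[A-Z]` follow plain ASCII order; in other locales a range may match more than expected.

```shell
LC_ALL=C
export LC_ALL

# Hypothetical word list, one word per line.
words=$(printf '%s\n' 'the' 'The' 'tHe' '42' 'abc')

the_count=$(printf '%s\n' "$words" | grep -c '[tT]he')        # the, The
upper_count=$(printf '%s\n' "$words" | grep -c '[A-Z]')       # The, tHe
no_letter=$(printf '%s\n' "$words" | grep -c '^[^A-Za-z]*$')  # lines without any letter: 42

echo "$the_count $upper_count $no_letter"
```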
Regular expressions (continued)
 An asterisk ‘*’ matches zero or more occurrences of the
preceding character
 X* matches zero, one, two, three, … capital X’s
 XX* matches one or more capital X’s
 .* matches zero or more occurrences of any character
 e.*e matches all the characters from the first e in the
line to the last one
 [A-Za-z][A-Za-z]* matches any alphabetic character
followed by zero or more alphabetic characters
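To see exactly which characters an asterisk pattern covers, one option is to let sed mark or replace the match. The input strings below are made up for illustration; `&` in the replacement stands for the matched text.

```shell
# X* can match zero X's, so it matches the empty string at the start of the line.
zero_ok=$(echo 'abc' | sed 's/X*/-/')

# XX* needs at least one X and then takes all the X's that follow.
one_or_more=$(echo 'aXXXb' | sed 's/XX*/-/')

# e.*e runs from the first e on the line to the last one.
greedy=$(echo 'see the tree now' | sed 's/e.*e/[&]/')

# One letter followed by zero or more letters: the first whole word.
first_word=$(echo '123 hello 456' | sed 's/[A-Za-z][A-Za-z]*/WORD/')

printf '%s\n' "$zero_ok" "$one_or_more" "$greedy" "$first_word"
```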
Regular expressions (continued)
 [-0-9] matches a single dash or digit character
(ORDER IS IMPORTANT: to be taken literally, the dash must come first or last)
 [0-9-] same as [-0-9]
 [^-0-9] matches any character except digits and dash
 []a-z] matches a right bracket or lowercase letter
(ORDER IS IMPORTANT: the right bracket must come first)
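The placement rules for the dash and the right bracket can be checked with a few made-up lines. This is only a sketch; `LC_ALL=C` is assumed so that `a-z` means the ASCII lowercase letters.

```shell
LC_ALL=C
export LC_ALL

# A dash placed first or last in the brackets is an ordinary character.
dash_first=$(printf '%s\n' 'a-b' 'a7b' 'abc' | grep -c '[-0-9]')
dash_last=$(printf '%s\n' 'a-b' 'a7b' 'abc' | grep -c '[0-9-]')

# A right bracket placed first in the brackets is an ordinary character.
bracket=$(printf '%s\n' 'x]' 'abc' 'XYZ' | grep -c '[]a-z]')

echo "$dash_first $dash_last $bracket"
```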
Regular expressions (continued)
 \{min,max\} matches a precise number of occurrences
 min specifies the minimum number of occurrences of the
preceding regular expression to be matched, and max
specifies the maximum
 w\{1,10\} matches from 1 to 10 consecutive w’s
 [a-zA-Z]\{7\} matches exactly seven alphabetic characters
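A brief sketch of the interval notation, with bounds and input chosen only for illustration. In basic regular expressions, as used by grep and sed without extra options, the braces are written with backslashes.

```shell
# With at most 3 repetitions allowed, only the first three w's are matched.
capped=$(echo 'wwwww' | sed 's/w\{1,3\}/[&]/')

# Anchored, \{7\} accepts only lines of exactly seven letters.
seven=$(printf '%s\n' 'abcdefg' 'abc' 'abcdefgh' | grep -c '^[a-zA-Z]\{7\}$')

# \{5,\} has no upper bound: five or more X's.
five_plus=$(printf '%s\n' 'XXXX' 'XXXXXX' | grep -c 'X\{5,\}')

echo "$capped $seven $five_plus"
```

Without the `^` and `$` anchors, `[a-zA-Z]\{7\}` would also match inside any longer run of letters.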
Regular expressions (continued)
 X\{5,\} matches at least five consecutive X’s
 \(…\) is used to save matched characters
 ^\(.\) matches the first character on the line and stores it
in register one
 There are nine registers, numbered 1 through 9
 To retrieve what is stored in register n, \n is used
 Example: ^\(.\)\1 matches the first two characters on a
line if they are both the same character
Regular expressions (continued)
 ^\(.\).*\1$ matches all lines in which the first
character on the line is the same as the last.
Note that .* matches all the characters in between
 ^\(...\)\(...\) stores the first three characters on the line
in register 1 and the next three characters in register 2
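The saved registers can be exercised with grep and sed. The input lines here are hypothetical; `\1` and `\2` refer back to the first and second `\(…\)` group.

```shell
# Lines whose first two characters are the same.
same_first_two=$(printf '%s\n' 'aab' 'abc' 'abca' | grep '^\(.\)\1')

# Lines whose first and last characters are the same.
same_ends=$(printf '%s\n' 'aab' 'abc' 'abca' | grep '^\(.\).*\1$')

# Save two groups of three characters and write them back in swapped order.
swapped=$(echo 'abcdef' | sed 's/^\(...\)\(...\)/\2\1/')

printf '%s\n' "$same_first_two" "$same_ends" "$swapped"
```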
cut
$ who
bgeorge pts/16 Oct 5 15:01 (216.87.102.204)
abakshi pts/13 Oct 6 19:48 (216.87.102.220)
tphilip pts/11 Oct 2 14:10 (AC8C6085.ipt.aol.com)
$ who | cut -c1-8,18-
bgeorge Oct 5 15:01 (216.87.102.204)
abakshi Oct 6 19:48 (216.87.102.220)
tphilip Oct 2 14:10 (AC8C6085.ipt.aol.com)
$
 cut extracts selected fields of data from a data file or the
output of a command
 Format: cut -cchars file
 chars specifies which character positions to extract from each line of file
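The character lists accepted by -c can be sketched on a short made-up string, where position 1 is the letter a and position 10 is the letter j.

```shell
letters='abcdefghij'

single=$(echo "$letters" | cut -c5)          # one position
listed=$(echo "$letters" | cut -c1,3,4)      # several positions
to_end=$(echo "$letters" | cut -c5-)         # position 5 through end of line
mixed=$(echo "$letters" | cut -c2-4,8-)      # a range plus an open-ended range

printf '%s\n' "$single" "$listed" "$to_end" "$mixed"
```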
cut (continued)
 Examples: -c5, -c1,3,4, -c10-15, -c5-
 The -d and -f options are used with cut when you
have data that is delimited by a particular
character
 Format: cut -ddchar -ffields file
 dchar: the character that delimits the fields (default: tab
character)
 fields: the fields to be extracted from file
cut (continued)
$ cat phonebook
Edward 336-145
Alice 334-121
Sony 332-336
Robert 326-056
$ cut -f1 phonebook
Edward
Alice
Sony
Robert
$
cut (continued)
$ cat /etc/passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
sys:x:3:3::/:
adm:x:4:4:Admin:/var/adm:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
listen:x:37:4:Network Admin:/usr/net/nls:
nobody:x:60001:60001:Nobody:/:
noaccess:x:60002:60002:No Access User:/:
oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh
webuser:*:102:102:Web User:/export/home/webuser:/bin/csh
abuzneid:x:103:100:Abdelshakour Abuzneid:/home/abuzneid:/sbin/csh
$
cut (continued)
$ cut -d: -f1 /etc/passwd
root
daemon
bin
sys
adm
lp
uucp
nuucp
listen
nobody
noaccess
oracle
webuser
abuzneid
$
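Field extraction with -d and -f can also be tried without touching a real file. The sketch below feeds cut one line in /etc/passwd format and one tab-separated line in the style of the phonebook file.

```shell
passwd_line='root:x:0:1:Super-User:/:/sbin/sh'

login=$(echo "$passwd_line" | cut -d: -f1)            # first colon-delimited field
login_shell=$(echo "$passwd_line" | cut -d: -f1,7)    # fields 1 and 7, still joined by :

# With no -d option the delimiter is the tab character.
number=$(printf 'Edward\t336-145\n' | cut -f2)

printf '%s\n' "$login" "$login_shell" "$number"
```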
paste
 paste joins the corresponding lines of the given files side by side
 Format: paste files
 The tab character is the default delimiter
paste (continued)
 Example:
$ cat students
Sue
Vara
Elvis
Luis
Eliza
$ cat sid
578426
452869
354896
455468
335123
$ paste students sid
Sue 578426
Vara 452869
Elvis 354896
Luis 455468
Eliza 335123
$
paste (continued)
 The -s option tells paste to paste together
lines from the same file, not from alternate
files
 To change the delimiter, the -d option is used
paste (continued)
 Examples:
$ paste -d '+' students sid
Sue+578426
Vara+452869
Elvis+354896
Luis+455468
Eliza+335123
$ paste -s students
Sue Vara Elvis Luis Eliza
$ ls | paste -d ' ' -s -
addr args list mail memo name nsmail phonebook programs roster sid
students test tp twice user
$
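The paste examples can be reproduced in a scratch directory. This sketch recreates shortened versions of the students and sid files under a temporary path (the path is an assumption made so that no existing file is overwritten).

```shell
dir=$(mktemp -d)
printf '%s\n' Sue Vara Elvis > "$dir/students"
printf '%s\n' 578426 452869 354896 > "$dir/sid"

# Join line 1 with line 1, line 2 with line 2, and so on, using + as delimiter.
plus=$(paste -d '+' "$dir/students" "$dir/sid")

# -s joins the lines of a single file into one line.
serial=$(paste -s -d ' ' "$dir/students")

printf '%s\n' "$plus" "$serial"
rm -rf "$dir"
```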
sed
 sed (stream editor) is a program used for editing
data
 Unlike ed, sed cannot be used interactively
 Format: sed command file
 command: applied to each line of the specified file
 file: if no file is specified, then standard input is
assumed
 sed writes the output to the standard output
 The s/Unix/UNIX/ command is applied to every line in
the file; on each line it replaces the first Unix with UNIX
sed (continued)
 sed makes no changes to the original input file
 The ‘s/Unix/UNIX/g’ command is applied to every line in the
file. It replaces every Unix with UNIX. “g” means global
 With the -n option, sed prints only the lines selected by the p command
 Example: sed -n '1,2p' file prints the first two
lines
 Example: sed -n '/UNIX/p' file prints any line
containing UNIX
sed (continued)
 Example: sed '1,2d' file deletes lines 1 and 2
 Example: sed -n 'l' text prints all lines from
text, showing nonprinting characters as \nn (where nn is the
octal value of the character) and tab
characters as “>” (current POSIX sed implementations show a tab as \t)
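The substitute, print and delete commands can be compared side by side. The input text below is invented for the purpose; sed reads it from standard input, so no file is changed.

```shell
text='Unix and Unix again'

first_only=$(echo "$text" | sed 's/Unix/UNIX/')     # first occurrence on the line
every_one=$(echo "$text" | sed 's/Unix/UNIX/g')     # g: all occurrences on the line

first_two=$(printf '%s\n' one two three | sed -n '1,2p')   # print only lines 1-2
rest=$(printf '%s\n' one two three | sed '1,2d')           # delete lines 1-2, print the rest
matching=$(printf '%s\n' 'UNIX rules' 'other' | sed -n '/UNIX/p')

printf '%s\n' "$first_only" "$every_one" "$first_two" "$rest" "$matching"
```

Note that the delete example runs without -n: with -n and no p command, sed would print nothing at all.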
tr
 The tr filter is used to translate characters from standard
input
 Format: tr from-chars to-chars
 The result is written to standard output
 Example: tr e x < file translates every “e” in file to “x” and
prints the output to the standard output
 The octal representation of a character can be given to tr
in the format \nnn
 Example: tr : '\11' will translate every : to a tab
tr (continued)
Character         Octal value
Bell              7
Backspace         10
Tab               11
Newline           12
Linefeed          12
Form feed         14
Carriage return   15
Escape            33
tr (continued)
 Example: tr '[a-z]' '[A-Z]' < file translates all lowercase
letters in file to their uppercase equivalents.
The character ranges [a-z] and [A-Z] are
enclosed in quotes to keep the shell from replacing
them with the names of matching single-character files
(a through z and A through Z)
 To “squeeze” out multiple occurrences of
characters, the -s option is used
tr (continued)
 Example: tr -s ' ' ' ' < file will squeeze multiple spaces
to one space
 The -d option is used to delete characters from a
stream of input
 Format: tr -d from-chars
 Example: tr -d ' ' < file will delete all spaces from the
input stream
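The four uses of tr covered here (translate, octal escape, squeeze, delete) can be sketched on short made-up strings piped through standard input.

```shell
upper=$(echo 'hello world' | tr '[a-z]' '[A-Z]')   # lowercase to uppercase
tabbed=$(echo 'a:b:c' | tr ':' '\11')              # \11 is the octal code for tab
squeezed=$(echo 'a    b   c' | tr -s ' ' ' ')      # runs of spaces become one space
deleted=$(echo 'a b c' | tr -d ' ')                # remove every space

printf '%s\n' "$upper" "$tabbed" "$squeezed" "$deleted"
```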
grep
 Searches one or more files for a particular
character pattern
 Format: grep pattern files
 Example: grep path .cshrc will print every line
in the .cshrc file that contains the pattern ‘path’
 Example: grep bin .cshrc .login .profile will print
every line from any of the three files .cshrc, .login
and .profile that contains the pattern “bin”
grep (continued)
 Example: grep * smarts will not work as intended
because the shell substitutes * with the names of
all files in the current directory
 Example: grep '*' smarts passes exactly two
arguments to grep: * and smarts
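The effect of quoting can be made visible by letting echo print the arguments that grep would receive. This is a sketch in a scratch directory; the file other_file and the contents of smarts are hypothetical.

```shell
dir=$(mktemp -d)
printf '%s\n' 'a * b' 'plain' > "$dir/smarts"
: > "$dir/other_file"

# Unquoted: the shell replaces * with every file name in the directory
# before the command runs. echo shows the resulting argument list.
expanded=$(cd "$dir" && echo grep * smarts)

# Quoted: grep receives the * itself and searches smarts for an asterisk.
quoted=$(grep '*' "$dir/smarts")

printf '%s\n' "$expanded" "$quoted"
rm -rf "$dir"
```

In the unquoted case grep would treat the first expanded file name as its pattern and the remaining names as files to search.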
sort
 By default, sort sorts the lines of the specified input file
into ascending order
$ cat students
Sue
Vara
Elvis
Luis
Eliza
$ sort students
Eliza
Elvis
Luis
Sue
Vara
$
sort (continued)
 The -u option tells sort to eliminate
duplicate lines from the output
sort (continued)
$ echo Ash >> students
$ cat students
Sue
Vara
Elvis
Luis
Eliza
Ash
Ash
$ sort students
Ash
Ash
Eliza
Elvis
Luis
Sue
Vara
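For comparison with the plain sort output above, where Ash appears twice, here is a small sketch of sort -u on the same names. The temporary file stands in for students, and `LC_ALL=C` is an assumption to keep the ordering predictable.

```shell
LC_ALL=C
export LC_ALL

students=$(mktemp)
printf '%s\n' Sue Vara Elvis Luis Eliza Ash Ash > "$students"

# -u keeps one copy of each line in the sorted output.
# paste -s joins the sorted lines into one line for display.
unique_sorted=$(sort -u "$students" | paste -s -d ' ' -)

echo "$unique_sorted"
rm -f "$students"
```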
sort (continued)
 The -r option reverses the order of the sort
 The -o option is used to direct the output to a
file instead of standard output
 sort students > sorted_students works the same as sort
students -o sorted_students
 The -o option also allows sort to save the output
to the same file it read as input
 Example:
sort students -o students correct
sort students > students incorrect (the shell empties students before sort reads it)
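The difference between -o and output redirection can be demonstrated on throwaway copies. In this sketch the temporary files are stand-ins for students, and -o is written before the file name, a form every sort implementation should accept.

```shell
LC_ALL=C
export LC_ALL

students=$(mktemp)
copy=$(mktemp)
printf '%s\n' Sue Vara Elvis > "$students"
cp "$students" "$copy"

# Reverse order, written to standard output.
reversed=$(sort -r "$students" | paste -s -d ' ' -)

# Safe: sort reads all of its input before it opens the -o file for writing.
sort -o "$students" "$students"
in_place=$(paste -s -d ' ' "$students")

# Unsafe: the shell truncates the file before sort starts reading it.
sort "$copy" > "$copy"
if [ -s "$copy" ]; then clobbered=no; else clobbered=yes; fi

printf '%s\n' "$reversed" "$in_place" "$clobbered"
rm -f "$students" "$copy"
```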
sort (continued)
• The -n option tells sort to treat the first field on each line
as a number and to sort the data arithmetically
sort (continued)
$ cat data
-10 11
15 2
-9 -3
2 13
20 22
3 1
$ sort data
-10 11
-9 -3
15 2
2 13
20 22
3 1
$
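The listing above shows the default, character-by-character ordering. A sketch of the same data sorted with -n follows; the temporary file stands in for data.

```shell
data=$(mktemp)
printf '%s\n' '-10 11' '15 2' '-9 -3' '2 13' '20 22' '3 1' > "$data"

# -n compares the leading number on each line arithmetically.
numeric=$(sort -n "$data")

# Show just the first column of the result on one line.
first_column=$(printf '%s\n' "$numeric" | cut -d' ' -f1 | paste -s -d ' ' -)

echo "$first_column"
rm -f "$data"
```

With -n, 2 and 3 now come before 15 and 20, which the default ordering placed the other way round.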
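sort (continued)
 To sort by the second field, +1n should be used
instead of -n. +1 says to skip the first field
 +5n would mean to skip the first five fields on
each line and then sort the data numerically
 The +pos form is obsolete; current POSIX sort selects
the field with -k instead, for example -k2n for the second field
sort (continued)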
 Example
$ sort -t: +2n /etc/passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
sys:x:3:3::/:
adm:x:4:4:Admin:/var/adm:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
listen:x:37:4:Network Admin:/usr/net/nls:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh
webuser:*:102:102:Web User:/export/home/webuser:/bin/csh
nobody:x:60001:60001:Nobody:/:
$
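On current systems the same kind of sort is usually written with -k, since many sort implementations no longer accept the +2n form. A minimal sketch with four shortened passwd-style lines (shortened here as an assumption, to keep the example small), sorted numerically on the third colon-delimited field:

```shell
sorted=$(printf '%s\n' 'lp:x:71:8' 'root:x:0:1' 'uucp:x:5:5' 'listen:x:37:4' |
    sort -t: -k3,3n)     # -t: sets the delimiter, -k3,3n uses field 3 as a number

# Show just the login names in their sorted order.
names=$(printf '%s\n' "$sorted" | cut -d: -f1 | paste -s -d ' ' -)

echo "$names"
```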
uniq
 Used to find duplicate lines in a file
 Format: uniq in_file out_file
 uniq will copy in_file to out_file, removing
any duplicate lines in the process
 uniq treats lines as duplicates only when they are
consecutive and match exactly
uniq (continued)
$ cat students
Sue
Vara
Elvis
Luis
Eliza
Ash
Ash
$ uniq students
Sue
Vara
Elvis
Luis
Eliza
Ash
$
 The -d option is used to list only the duplicated lines
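A short sketch of uniq -d, and of why input is usually sorted first. It uses the same names as the students file in a temporary file; the three-line Ash/Sue/Ash input is invented to show duplicates that are not consecutive.

```shell
students=$(mktemp)
printf '%s\n' Sue Vara Elvis Luis Eliza Ash Ash > "$students"

# -d prints one copy of each line that occurs more than once in a row.
duplicates=$(uniq -d "$students")

# Duplicates that are not adjacent are not detected ...
not_adjacent=$(printf '%s\n' Ash Sue Ash | uniq | paste -s -d ' ' -)

# ... unless the input is sorted first, which brings them together.
sorted_first=$(printf '%s\n' Ash Sue Ash | sort | uniq | paste -s -d ' ' -)

printf '%s\n' "$duplicates" "$not_adjacent" "$sorted_first"
rm -f "$students"
```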