
awk examples

Note: an updated version of this page is available under "Awk useful examples".

Awk is a full-featured pattern-scanning and text-processing language with a syntax reminiscent of C. It has an extensive set of operators and capabilities, but we will cover only a few of them here: the ones most useful in shell scripts.

Awk breaks each line of input passed to it into fields. By default, a field is a string of consecutive characters delimited by whitespace, though there are options for changing this. Awk parses and operates on each separate field. This makes it ideal for handling structured text files -- especially tables -- data organized into consistent chunks, such as rows and columns.
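As a small illustration of field splitting, the sketch below feeds awk one made-up line (the sample text is an assumption, not taken from any real file) and prints the field count NF together with the first and last fields.

```shell
# Sample input is made up for illustration: three whitespace-separated fields.
line='alice   42    /home/alice'

# NF holds the number of fields; $1 is the first field and $NF the last one.
fields=$(printf '%s\n' "$line" | awk '{ print NF ": first=" $1 ", last=" $NF }')
echo "$fields"
```

Runs of spaces count as a single delimiter, so the line splits into exactly three fields.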

SYNOPSIS

       gawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
       gawk [ POSIX or GNU style options ] [ -- ] program-text file ...

       pgawk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
       pgawk [ POSIX or GNU style options ] [ -- ] program-text file ...

The simple print

Let's see how it works. At the command line, enter the following command:

$ awk '{ print }' /etc/fstab

or, equivalently:

$ awk '{ print $0 }' /etc/fstab

You should see the contents of your /etc/fstab file as output, the same as with cat /etc/fstab. When we executed awk, it evaluated the print command for each line in /etc/fstab, in order. As for the { print } code block: in awk, curly braces group blocks of code together, as in C. Inside our block we have a single print command. When print appears by itself, awk prints the full contents of the current line. The $0 variable represents the entire current line, so print and print $0 do exactly the same thing.

$ awk '{ print "#" }' /etc/fstab

Unlike the first example, this does not print the contents of /etc/fstab. Instead, it prints one "#" for every line in /etc/fstab.
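To try these commands without depending on the contents of a real file, the following sketch pipes two made-up lines into awk. The variable names (input, plain, dollar0, hashes) are only illustrative.

```shell
# Two made-up input lines stand in for a real file.
input=$(printf 'first line\nsecond line\n')

plain=$(printf '%s\n' "$input" | awk '{ print }')       # whole line, implicit
dollar0=$(printf '%s\n' "$input" | awk '{ print $0 }')  # whole line, explicit
hashes=$(printf '%s\n' "$input" | awk '{ print "#" }')  # one "#" per input line

[ "$plain" = "$dollar0" ] && echo "print and print \$0 agree"
echo "$hashes"
```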

Deal with multiple fields

Awk can work like cut, but it is more powerful: cut can only use a single character as the separator, whereas awk's field separator can be a regular expression. By default, awk uses whitespace as the separator.

The following script will print out a list of all user accounts on your system:

$ awk -F":" '{ print $1 }' /etc/passwd

Above, when we called awk, we used the -F option to specify ":" as the field separator. When awk processes the print $1 command, it prints the first field of each line in the input file. Here's another example:

# awk '{print $1 $3}' /etc/fstab
LABEL=/1ext3
LABEL=/tmpext3
LABEL=/homeext3
LABEL=/usrext3
...

Awk prints the first and third fields of /etc/fstab, but with no space between them. This is because when two strings appear next to each other in an awk program, awk concatenates them without adding an intermediate space. To separate the fields without setting OFS, insert a literal space between them:

$ awk -F":" '{ print $1 " " $3 }' /etc/passwd 

This way, awk concatenates $1, " ", and $3, creating readable output.
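The difference between plain concatenation and an explicit space can be seen side by side in the sketch below. The two records are made up in /etc/passwd layout so the result does not depend on the local system.

```shell
# Made-up records in /etc/passwd layout (name:password:uid:gid:...).
records='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# Adjacent expressions are concatenated with nothing in between.
joined=$(printf '%s\n' "$records" | awk -F":" '{ print $1 $3 }')

# A literal " " in the middle yields readable output.
spaced=$(printf '%s\n' "$records" | awk -F":" '{ print $1 " " $3 }')

echo "$joined"
echo "$spaced"
```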

Output separator

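$ awk -F":" '{ print $1,$3}' /etc/passwd

With a comma between the arguments, print separates them with the output field separator OFS, which defaults to a single space.

To use a different separator, for example a tab, assign OFS on the command line (--assign is the GNU long form of -v):

$ awk -F":" --assign OFS="\t" '{print "user:"$1,"uid:"$3}' /etc/passwd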
user:root uid:0
user:bin uid:1
user:daemon uid:2
user:adm uid:3
...
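A minimal sketch of the same idea on made-up records, using the POSIX -v spelling instead of --assign so that it should also work with awk implementations other than gawk:

```shell
# Made-up records in /etc/passwd layout.
records='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# A comma between print arguments inserts OFS (a single space by default).
default_ofs=$(printf '%s\n' "$records" | awk -F":" '{ print $1, $3 }')

# -v assigns a variable before the program starts; here OFS becomes a tab.
tab_ofs=$(printf '%s\n' "$records" | awk -F":" -v OFS="\t" '{ print "user:" $1, "uid:" $3 }')

echo "$default_ofs"
echo "$tab_ofs"
```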

Another way to separate output fields with a different separator, for example a tab, is to embed it in the printed string:

$ awk -F":" '{ print "username: " $1 "\t\tuid:" $3 }' /etc/passwd

This will cause the output to be:

$ awk -F":" '{ print "username: " $1 "\t\tuid:" $3 }' /etc/passwd
username: root uid:0
username: bin uid:1
username: daemon uid:2
username: adm uid:3
username: lp uid:4

...

Search pattern

An awk search pattern is a regular expression. For example, to search for and print the lines containing the string ext:

# awk '/ext/  {print }' /etc/fstab
LABEL=/1                /                       ext3    defaults        1 1
LABEL=/tmp              /tmp                    ext3    defaults        1 2
LABEL=/home             /home                   ext3    defaults        1 2
LABEL=/usr              /usr                    ext3    defaults        1 2

Print the lines in /etc/fstab that are not commented out:

# awk '$0 !~ "^#" {print}' /etc/fstab
LABEL=/1                /                       ext3    defaults        1 1
LABEL=/tmp              /tmp                    ext3    defaults        1 2
LABEL=/home             /home                   ext3    defaults        1 2
LABEL=/usr              /usr                    ext3    defaults        1 2
LABEL=/opt              /opt                    ext3    defaults        1 2
...

Print the file systems that are mounted with the defaults option (the fourth field of /etc/fstab holds the mount options):

# awk '$4 == "defaults" && $1 !~ "^#"  {print}' /etc/fstab
LABEL=/1                /                       ext3    defaults        1 1
LABEL=/tmp              /tmp                    ext3    defaults        1 2
LABEL=/home             /home                   ext3    defaults        1 2
LABEL=/usr              /usr                    ext3    defaults        1 2
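The three kinds of pattern shown above can be tried on a small made-up sample. The fstab-like text below is an assumption for illustration only; it includes a comment line and a swap entry so that each pattern selects a different set of lines.

```shell
# A made-up fstab-like sample, including a comment line and a swap entry.
sample='# static file system information
LABEL=/1    /      ext3    defaults    1 1
LABEL=/tmp  /tmp   ext3    noexec      1 2
LABEL=SWAP  swap   swap    defaults    0 0'

# Regular-expression pattern: mount points of lines containing "ext".
with_ext=$(printf '%s\n' "$sample" | awk '/ext/ { print $2 }')

# Negated match: devices of lines that do not start with "#".
uncommented=$(printf '%s\n' "$sample" | awk '$0 !~ /^#/ { print $1 }')

# String comparison combined with a regex test using &&.
defaults=$(printf '%s\n' "$sample" | awk '$4 == "defaults" && $1 !~ /^#/ { print $1 }')

echo "$with_ext"
echo "$uncommented"
echo "$defaults"
```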

The BEGIN and END blocks

Normally, awk executes each block of your script's code once for each input line. However, there are many situations where you need to run initialization code before awk begins processing the input. For these, awk lets you define a BEGIN block. Because the BEGIN block is evaluated before awk starts processing the input file, it's an excellent place to initialize the FS (field separator) variable, print a heading, or initialize other global variables that you'll reference later in the program.

Awk also provides another special block, called the END block. Awk executes this block after all lines in the input file have been processed. Typically, the END block is used to perform final calculations or print summaries that should appear at the end of the output stream.

# awk 'BEGIN{FS=":";OFS="\t\t"; print "username\tuid"}  {print $1,$3}' /etc/passwd
username    uid
root        0
bin        1
daemon        2
adm        3
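Since the END block is described above but not shown in use, here is a minimal sketch that combines BEGIN, a per-line block, and END. The records and the count variable are made up for illustration.

```shell
# Made-up records in /etc/passwd layout.
records='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

report=$(printf '%s\n' "$records" | awk '
  BEGIN { FS = ":"; OFS = "\t"; print "username", "uid" }  # runs once, before any input
        { print $1, $3; count++ }                           # runs for every input line
  END   { print count " accounts listed" }                  # runs once, after the last line
')
echo "$report"
```

The heading comes from BEGIN, the summary line from END, and everything in between from the per-line block.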

For finer control over the output format, use printf:

# awk 'BEGIN{FS=":";OFS="\t\t"; print "username\tuid"}  {printf "%8s\t%d\n", $1,$3}' /etc/passwd
username    uid
    root    0
     bin    1
  daemon    2
     adm    3
      lp    4
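Note: in the example above, OFS is ignored because printf takes its layout entirely from the format string; OFS applies only to print with comma-separated arguments.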
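As a further sketch of printf formatting on made-up records, %-8s left-aligns a string in an 8-character column and %3d right-aligns an integer in a 3-character column. The "|" is only there to make the column boundary visible.

```shell
# Made-up records in /etc/passwd layout.
records='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# printf does not append a newline by itself, so the format ends in \n.
table=$(printf '%s\n' "$records" | awk -F":" '{ printf "%-8s|%3d\n", $1, $3 }')
echo "$table"
```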

Conditional statements

Awk also offers very nice C-like if statements.

{ if ( $5 ~ /root/ ) { print $3 } }

In this example, the outer block has no pattern, so it is executed for every input line; print $3 runs only when the fifth field matches root.
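The text does not say which input this if statement is meant for. Assuming /etc/passwd-style records split on ":", field 5 is the comment field and field 3 the UID, which gives the sketch below on made-up data. The second command is the equivalent pattern-action form.

```shell
# Made-up records in /etc/passwd layout (an assumption; the original names no input file).
records='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# if statement inside a block that runs for every line.
uid_if=$(printf '%s\n' "$records" | awk -F":" '{ if ($5 ~ /root/) { print $3 } }')

# Same test written as a pattern in front of the block.
uid_pattern=$(printf '%s\n' "$records" | awk -F":" '$5 ~ /root/ { print $3 }')

echo "$uid_if $uid_pattern"
```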

Here's a more complicated example of an awk if statement. As you can see, even with complex, nested conditionals, if statements look identical to their C counterparts:

{ if ( $1 == "foo" ) { if ( $2 == "foo" ) { print "uno" } else { print "one" } } else if ($1 == "bar" ) { print "two" } else { print "three" } }
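The same nested statement is easier to follow when spread over several lines. The sketch below runs it on four made-up input lines, one for each branch.

```shell
# Four made-up lines, chosen so that each branch is taken once.
labels=$(printf 'foo foo\nfoo bar\nbar x\nbaz y\n' | awk '
{
  if ($1 == "foo") {
    if ($2 == "foo") {
      print "uno"
    } else {
      print "one"
    }
  } else if ($1 == "bar") {
    print "two"
  } else {
    print "three"
  }
}')
echo "$labels"
```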

Numeric variables

So far, we've either printed strings, the entire line, or specific fields. However, awk also allows us to perform both integer and floating point math. Using mathematical expressions, it's very easy to write a script that counts the number of blank lines in a file. Here's one that does just that:

BEGIN { x=0 } 
/^$/  { x=x+1 } 
END   { print "I found " x " blank lines. :)" }

In the BEGIN block, we initialize our integer variable x to zero. Then, each time awk encounters a blank line, awk will execute the x=x+1 statement, incrementing x. After all the lines have been processed, the END block will execute, and awk will print out a final summary, specifying the number of blank lines it found.
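The blank-line counter can be tried directly from the shell on made-up input, as sketched below. The second command is an added illustration of floating-point math (an average of three made-up numbers), not part of the original script.

```shell
# Made-up text with three blank lines.
blank=$(printf 'one\n\ntwo\n\n\nthree\n' | awk '
  BEGIN { x = 0 }
  /^$/  { x = x + 1 }
  END   { print "I found " x " blank lines. :)" }
')
echo "$blank"

# Floating-point math: NR is the number of lines read, so sum / NR is the mean.
avg=$(printf '1\n2\n4\n' | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }')
echo "$avg"
```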


