How to Use the awk Command on Linux

On Linux, awk is a command line text manipulation dynamo, as well as a powerful scripting language. Here is an introduction to some of its most interesting features.

How awk got its name

The awk command was named using the initials of the three people who wrote the original version in 1977: Alfred Aho, Peter Weinberger, and Brian Kernighan. These three men were from the legendary AT&T Bell Laboratories Unix pantheon. With contributions from many others since then, awk has continued to evolve.

It’s a full scripting language, as well as a complete text manipulation toolkit for the command line. If this article whets your appetite, you can look up every detail about awk and its functionality.

Rules, patterns, and actions

awk works on programs that contain rules made up of patterns and actions. The action is executed on the text that matches the pattern. Actions are enclosed in curly braces ({}). Together, a pattern and an action form a rule. The entire awk program is enclosed in single quotes (').

Let’s look at the simplest type of awk program. It has no pattern, so it matches every line of text fed into it. This means the action is executed on every line. We’ll use it on the output from the who command.

Here’s the standard output from who:

who

Perhaps we don’t need all of that information and just want to see the names on the accounts. We can pipe the output from who into awk, and then tell awk to print only the first field.

By default, awk considers a field to be a string of characters surrounded by whitespace, the start of a line, or the end of a line. Fields are identified by a dollar sign ($) and a number. So, $1 represents the first field, which we’ll use with the print action to print the first field.

We type the following:

who | awk '{print $1}'

awk prints the first field and discards the rest of the line.

We can print as many fields as we want. If we add a comma as a separator, awk displays a space between each field.

We type the following to also print the time the person logged in (field four):

who | awk '{print $1, $4}'
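The output of who depends on who is logged in, so here is a small sketch that feeds awk a hand-made line instead. The sample line is invented for illustration and only imitates the layout of who output:

```shell
# A made-up line laid out like one line of who output (not real session data).
sample='dave     pts/0        2020-02-14 10:15 (192.168.4.25)'

# Print the first field (the account name).
first=$(printf '%s\n' "$sample" | awk '{print $1}')

# Print the first and fourth fields; the comma puts a space between them.
both=$(printf '%s\n' "$sample" | awk '{print $1, $4}')

echo "$first"
echo "$both"
```

The runs of spaces between the columns count as a single separator, which is why the login time lands in field four.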

There are some special field identifiers. These represent the entire line of text and the last field in the text line:

$0: Represents the entire line of text.
$1: Represents the first field.
$2: Represents the second field.
$7: Represents the seventh field.
$45: Represents the 45th field.
$NF: NF stands for “number of fields,” so $NF represents the last field.

We’ll type the following to display a small text file that contains a short quote attributed to Dennis Ritchie:

cat dennis_ritchie.txt

We want awk to print the first, second and last field of the quote. Note that although it is wrapped in the terminal window, it is only a single line of text.

We type the following command:

awk '{print $1, $2, $NF}' dennis_ritchie.txt

We don’t know that “simplicity.” is the 18th field in the line of text, and we don’t care. What we do know is that it’s the last field, and we can use $NF to get its value. The period is just treated as another character in the body of the field.
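The difference between NF (a count) and $NF (the contents of the last field) is easy to check on a short sentence. The sentence below is a placeholder, not the Dennis Ritchie quote, and the $(NF-1) form is shown only as a further illustration of the same idea:

```shell
# A made-up one-line input with six whitespace-separated fields.
line='awk splits every line into fields.'

# NF on its own is the number of fields in the line.
count=$(printf '%s\n' "$line" | awk '{print NF}')

# $NF is the content of the last field, trailing period included.
last=$(printf '%s\n' "$line" | awk '{print $NF}')

# $(NF-1) is the field just before the last one.
penultimate=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')

echo "$count" "$last" "$penultimate"
```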

Adding output field separators

You can also tell awk to print a particular character between fields instead of the default space character. The default output from the date command is slightly peculiar because the time is plonked right in the middle of it. However, we can type the following and use awk to extract the fields we want:

date
date | awk '{print $2, $3, $6}'

We’ll use the OFS (output field separator) variable to put a separator between the month, day, and year. Note that below, the OFS assignment sits inside the single quotes (') but outside the curly braces ({}):

date | awk 'OFS="/" {print $2, $3, $6}'
date | awk 'OFS="-" {print $2, $3, $6}'
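Because date output changes every second, and its layout can vary with locale, a fixed sample string makes the effect of OFS easier to verify. The date string below is made up and assumes the weekday, month, day, time, zone, year layout described above:

```shell
# A made-up line in the layout: weekday month day time zone year.
sample='Tue Feb  4 10:15:00 EST 2020'

# OFS is assigned outside the braces; the commas in print are replaced by it.
slashes=$(printf '%s\n' "$sample" | awk 'OFS="/" {print $2, $3, $6}')
dashes=$(printf '%s\n' "$sample" | awk 'OFS="-" {print $2, $3, $6}')

echo "$slashes"
echo "$dashes"
```

The separator appears only where the print action uses commas; fields joined without a comma would be printed with nothing between them.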

BEGIN and END rules

A BEGIN rule is executed once before any text processing starts. In fact, it’s executed before awk even reads any text. An END rule is executed after all processing has completed. You can have multiple BEGIN and END rules, and they’ll execute in order.
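The execution order can be sketched with three made-up input lines. NR, used in the END rule, is awk's built-in count of the lines read so far; it does not appear elsewhere in this article and is included here only to show that END runs after all input has been consumed:

```shell
# Two BEGIN rules, one unnamed rule, and one END rule, fed three sample lines.
out=$(printf 'one\ntwo\nthree\n' | awk '
BEGIN {print "first BEGIN"}
BEGIN {print "second BEGIN"}
{print "line:", $0}
END {print "END after", NR, "lines"}
')

echo "$out"
```

Both BEGIN rules print before any input line is shown, in the order they were written, and the END rule prints last.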

For our example BEGIN rule, we will print the entire quote from the dennis_ritchie.txt file we used previously with a title above.

To do this, we type this command:

awk 'BEGIN {print "Dennis Ritchie"} {print $0}' dennis_ritchie.txt

Note that the BEGIN rule has its own set of actions enclosed in its own set of braces ({}).

We can use this same technique with the command we used previously to pipe output from who into awk. To do so, we type the following:

who | awk 'BEGIN {print "Active Sessions"} {print $1, $4}'

Input field separators

If you want awk to work with text that doesn’t use whitespace to separate fields, you have to tell it which character the text uses as the field separator. For example, the /etc/passwd file uses a colon (:) to separate fields.

We’ll use that file and the -F (separator string) option to tell awk to use the colon (:) as the separator. We type the following to tell awk to print the name of the user account and the home folder:

awk -F: '{print $1, $6}' /etc/passwd

The output contains the name of the user account (or the name of the application or daemon) and the home folder (or the location of the application).
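To try the -F option without reading the real /etc/passwd file, the same command can be run on a few lines typed in by hand. The three entries below are invented and only follow the colon-separated passwd layout:

```shell
# Three invented lines in /etc/passwd format (not copied from a real system).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
dave:x:1000:1000:Dave,,,:/home/dave:/bin/bash'

# -F: makes the colon the input field separator; field 1 is the account name
# and field 6 is the home folder.
pairs=$(printf '%s\n' "$sample" | awk -F: '{print $1, $6}')

echo "$pairs"
```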

Adding patterns

If all we care about is regular user accounts, we can include a pattern with our print action to filter out all the other entries. On most current Linux distributions, regular user accounts have user ID numbers of 1000 or greater, so we can base our filter on that information.

We type the following to execute our print action only when the third field ($3) contains a value of 1000 or greater:

awk -F: '$3 >= 1000 {print $1, $6}' /etc/passwd

The pattern should immediately precede the action it’s associated with.
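To see the filter work without relying on the accounts present on any particular machine, here is a sketch that applies the same pattern to invented passwd-style entries. The account names and IDs are hypothetical:

```shell
# Invented passwd-style lines: two system accounts and two regular users.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
dave:x:1000:1000:Dave,,,:/home/dave:/bin/bash
mary:x:1001:1001:Mary,,,:/home/mary:/bin/bash'

# The pattern $3 >= 1000 is tested on every line; the action in braces runs
# only for the lines where the comparison is true.
regular=$(printf '%s\n' "$sample" | awk -F: '$3 >= 1000 {print $1, $6}')

echo "$regular"
```

Only the entries whose third field is 1000 or more reach the print action; the root and daemon lines are skipped.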

We can use the BEGIN rule to provide a title for our little report. We type the following, using the (\n) notation to insert a newline character into the title string:

awk -F: 'BEGIN {print "User Accounts\n-------------"} $3 >= 1000 {print $1, $6}' /etc/passwd

Patterns are full-fledged regular expressions, and they’re one of the glories of awk.

Let’s say we want to see the universally unique identifiers (UUIDs) of the mounted file systems. If we search through the /etc/fstab file for occurrences of the string “UUID,” it ought to return that information for us.

We use the search pattern “/UUID/” in our command:

awk '/UUID/ {print $0}' /etc/fstab

It finds all occurrences of “UUID” and prints these lines. In fact, we would have achieved the same result without the print action, because the default action prints the entire line of text. For clarity, however, it is often helpful to be explicit. When browsing through a script or your history file, you’ll be glad you left clues for yourself.

The first line found was a comment line, and although the “UUID” string is in the middle of it, awk still found it. We can tweak the regular expression and tell awk to process only lines that start with “UUID.” To do so, we type the following, which includes the start-of-line token (^):

awk '/^UUID/ {print $0}' /etc/fstab

That’s better! Now we see only genuine mount instructions. To refine the output even further, we type the following and restrict the display to the first field:

awk '/^UUID/ {print $1}' /etc/fstab

If we had multiple file systems mounted on this machine, we’d get a neat table of their UUIDs.
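The variations in this section can be compared side by side on a small fstab-style sample. The lines and the UUID value below are placeholders made up for the sketch, not taken from a real system:

```shell
# Invented fstab-style lines: a comment, a UUID mount line, and a swap file.
sample='# Use blkid to print the UUID of a device
UUID=1111-2222 / ext4 errors=remount-ro 0 1
/swapfile none swap sw 0 0'

# Explicit print action: every line containing "UUID", comment included.
explicit=$(printf '%s\n' "$sample" | awk '/UUID/ {print $0}')

# No action at all: awk falls back to printing the whole matching line.
implicit=$(printf '%s\n' "$sample" | awk '/UUID/')

# Anchored pattern plus first field only: just the UUID of real mount lines.
anchored=$(printf '%s\n' "$sample" | awk '/^UUID/ {print $1}')

echo "$anchored"
```

Here explicit and implicit hold the same two lines, which shows the default action at work, while anchored holds only the identifier.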

Built-in functions

awk has many functions you can call and use in your own programs, both from the command line and in scripts. If you do some digging, you’ll find it very fruitful.

To demonstrate the general technique for calling a function, we’ll look at some numeric ones. For example, the following prints the square root of 625:

awk 'BEGIN {print sqrt(625)}'

This command displays the arctangent of 0 (zero) and -1 (which happens to be the mathematical constant, pi):

awk 'BEGIN {print atan2(0, -1)}'

In the following command, we modify the result of the atan2() function before we print it:

awk 'BEGIN {print atan2(0, -1) * 100}'

Functions can accept expressions as parameters. For example, here is a complicated way to ask for the square root of 25:

awk 'BEGIN {print sqrt((2 + 3) * 5)}'
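For reference, here is a sketch that runs the four function calls from this section and captures what each one prints. The values in the comments assume awk's default number formatting, which shows six significant digits for non-integer results:

```shell
# Each program has only a BEGIN rule, so awk runs it without reading any input.
root=$(awk 'BEGIN {print sqrt(625)}')             # 25
pi=$(awk 'BEGIN {print atan2(0, -1)}')            # 3.14159
scaled=$(awk 'BEGIN {print atan2(0, -1) * 100}')  # 314.159
nested=$(awk 'BEGIN {print sqrt((2 + 3) * 5)}')   # 5

echo "$root" "$pi" "$scaled" "$nested"
```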

awk Scripts

If your command line gets complicated, or you develop a routine you know you’ll want to use again, you can transfer your awk command into a script.

In our sample script, we will do all of the following:

Tell the shell which executable to use to run the script.
Prepare awk to use the FS field separator variable to read input text with fields separated by colons (:).
Use the OFS output field separator to tell awk to use a colon (:) to separate fields in the output.
Set a counter to 0 (zero).
Set the second field of each line of text to an empty value (it is always an “x”, so we don’t need to see it).
Print the line with the second field changed.
Increment the counter.
Print the value of the counter.

Our script is illustrated below.

The BEGIN rule carries out the preparatory steps, while the END rule displays the counter value. The middle rule (which has no name and no pattern, so it matches every line) modifies the second field, prints the line, and increments the counter.

The first line of the script tells the shell which executable to use (awk, in our example) to run the script. It also passes the -f (filename) option to awk, which tells it to read its program from a file (here, the script itself). We’ll pass the name of the file to be processed to the script when we run it.

We have included the script below as text so you can cut and paste:

#!/usr/bin/awk -f

BEGIN {
    # set the input and output field separators
    FS = ":"
    OFS = ":"
    # zero the accounts counter
    accounts = 0
}
{
    # set field 2 to nothing
    $2 = ""
    # print the entire line
    print $0
    # count another account
    accounts++
}
END {
    # print the results
    print accounts " accounts.\n"
}

Save it to a file called omit.awk. To make the script executable, we type the following using chmod:

chmod +x omit.awk

Now, let’s run it and pass the /etc/passwd file to the script. This is the file awk will process for us, using the rules within the script:

./omit.awk /etc/passwd

The file is processed and each line is displayed.

The “x” entries in the second field have been removed, but note that the field separators are still present. The lines are counted and the total is given at the bottom of the output.
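The whole workflow can be rehearsed on sample data. This sketch writes the same rules to a temporary copy of omit.awk and runs it with awk -f on two invented passwd-style lines, which is equivalent to what the #! line arranges when the script is run directly. The temporary directory and the sample entries are assumptions made for the illustration:

```shell
# Work in a throwaway directory so nothing in the current folder is touched.
workdir=$(mktemp -d)

# The same rules as omit.awk; the #! line is left out because awk -f is
# invoked explicitly below.
cat > "$workdir/omit.awk" <<'EOF'
BEGIN {
    FS = ":"
    OFS = ":"
    accounts = 0
}
{
    $2 = ""
    print $0
    accounts++
}
END {
    print accounts " accounts.\n"
}
EOF

# Two invented lines in /etc/passwd format.
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'dave:x:1000:1000:Dave,,,:/home/dave:/bin/bash' > "$workdir/passwd.sample"

# The program comes from the -f file; the data comes from the sample file.
out=$(awk -f "$workdir/omit.awk" "$workdir/passwd.sample")
echo "$out"

rm -r "$workdir"
```

Each output line keeps its two adjacent colons where the “x” used to be, because assigning to a field makes awk rebuild the line using OFS.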

awk doesn’t mean awkward

awk doesn’t stand for awkward; it stands for elegance. It has been described as a processing filter and a report writer. More accurately, it’s both of these, or rather a tool you can use for both of these tasks. In just a few lines, awk achieves what requires extensive coding in a traditional language.

That power is harnessed by the simple concept of rules that contain patterns, which select the text to process, and actions, which define the processing.
