The Linux grep command is a string and pattern matching utility that displays matching lines from multiple files. It also works with piped output from other commands. We show you how.
The story behind grep
The grep command is famous in Linux and Unix circles for three reasons. First, it is extremely useful. Second, the wealth of options can be overwhelming. Third, it was written overnight to meet a particular need. The first two are spot on; the third is slightly off.
Ken Thompson had extracted the regular expression search capabilities of the ed editor (pronounced ee-dee) and created a small program, for his own use, to search text files. His department head at Bell Labs, Doug McIlroy, approached Thompson and described the problem one of his colleagues, Lee McMahon, was facing.
McMahon was trying to identify the authors of the Federalist Papers through textual analysis. He needed a tool that could search for phrases and strings within text files. Thompson spent about an hour that evening turning his tool into a general utility that could be used by others and renamed it grep. The name comes from the ed command string g/re/p, which translates as “global regular expression print.”
Simple searches with grep
To search for a string in a file, pass the search term and filename on the command line:
The matching lines are displayed. In this case, it is a single line. The matching text is highlighted. This is because, on most distributions, grep is aliased to:
alias grep='grep --colour=auto'
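As a minimal sketch of a simple search, the commands below build a throwaway file and search it. The file name users.txt and its contents are made up for illustration, and the --color=always option assumes GNU grep.

```shell
# Hypothetical sample file; the name and contents are illustrative only.
workdir=$(mktemp -d)
printf 'root:x:0:0\ndave:x:1000:1000\nmary:x:1001:1001\n' > "$workdir/users.txt"

# Search term first, then the file to search.
found=$(grep dave "$workdir/users.txt")
echo "$found"

# --color=always forces highlighting even when the output is not a terminal,
# which is what the alias does automatically for interactive use.
grep --color=always dave "$workdir/users.txt"

rm -rf "$workdir"
```

Only the line containing the search term is printed; the second command prints the same line with the match wrapped in colour escape sequences.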
Let’s examine the results when multiple lines match. We are looking for the word “Average” in an application log file. Because we cannot remember if the word is lowercase in the log file, we will use the -i option (ignore case):
grep -i Average geek-1.log
Each matching line is displayed, with the matching text highlighted in each one.
We can display the non-matching lines by using the -v (invert match) option.
grep -v Mem geek-1.log
There is no highlighting because these are the lines that do not match.
We can make grep completely silent with the -q (quiet) option. The result is passed to the shell as a return value from grep. A result of zero means the string was found, and a result of one means it was not found. We can check the return code using the $? special parameter:
grep -q average geek-1.log
echo $?
grep -q howtogeek geek-1.log
echo $?
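The return value is most useful inside a script. Here is a small sketch, using a made-up sample.log file, that captures the exit status of two quiet searches and then uses grep -q directly in an if statement.

```shell
# Hypothetical log file used only for this illustration.
workdir=$(mktemp -d)
printf 'Average: 12\nMemFree: 4096 kB\n' > "$workdir/sample.log"

# A term that is present: exit status 0.
grep -q Average "$workdir/sample.log"
found_status=$?

# A term that is absent: exit status 1.
grep -q howtogeek "$workdir/sample.log"
missing_status=$?

echo "found: $found_status, missing: $missing_status"

# The exit status drives shell conditionals, so grep -q fits neatly in an if.
if grep -q MemFree "$workdir/sample.log"; then
    verdict="MemFree is present"
else
    verdict="MemFree is absent"
fi
echo "$verdict"

rm -rf "$workdir"
```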
Recursive searches with grep
To search through nested directories and subdirectories, use the -r (recursive) option. Note that you do not provide a filename on the command line; you must provide a path. Here we are searching in the current directory “.” and in all of its subdirectories:
grep -r -i memfree .
The output includes the directory and filename of each matching line.
We can make grep follow symbolic links by using the -R (recursive dereference) option. We have a symbolic link in this directory, called logs-folder. It points to /home/dave/logs.
ls -l logs-folder
Let’s repeat our last search with the -R (recursive dereference) option:
grep -R -i memfree .
The symbolic link is followed, and the directory it points to is searched by grep too.
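The difference between the two options can be sketched with two temporary directories. The directory layout and the logs-link name are invented for this example, and the behaviour described in the comments assumes GNU grep.

```shell
tree=$(mktemp -d)      # the directory we search
outside=$(mktemp -d)   # a directory reachable only through a symbolic link
mkdir -p "$tree/sub/deeper"
printf 'MemFree: 100 kB\n' > "$tree/sub/deeper/a.log"
printf 'memfree: 200 kB\n' > "$outside/b.log"
ln -s "$outside" "$tree/logs-link"

# -r descends into real subdirectories.
plain=$(cd "$tree" && grep -r -i memfree .)

# -R additionally follows the symbolic link into the other directory.
deref=$(cd "$tree" && grep -R -i memfree .)

echo "with -r:"; echo "$plain"
echo "with -R:"; echo "$deref"

rm -rf "$tree" "$outside"
```

Each result line is prefixed with the path of the file it came from, so matches found through the link show up under logs-link.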
Whole word search
By default, grep will match a line if the search target appears anywhere in that line, including inside another string. Look at this example. We are going to search for the word “free”.
grep -i free geek-1.log
The results are lines that contain the string “free”, but not as a separate word. It is part of the string “MemFree”.
To force grep to match separate “words” only, use the -w (word regexp) option.
grep -w -i free geek-1.log
This time there is no result because the search term “free” does not appear in the file as a separate word.
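To see the effect side by side, this sketch counts matching lines with and without -w. The sample file is hypothetical and, unlike the log above, deliberately contains “free” as a standalone word on one line.

```shell
workdir=$(mktemp -d)
printf 'MemFree: 4096 kB\nfree space: 10 GB\n' > "$workdir/sample.log"

# Without -w, "free" matches inside "MemFree" as well.
substring_hits=$(grep -c -i free "$workdir/sample.log")

# With -w, only the line where "free" stands alone as a word matches.
word_hits=$(grep -c -w -i free "$workdir/sample.log")

echo "substring: $substring_hits, whole word: $word_hits"

rm -rf "$workdir"
```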
Using multiple search terms
The -E (extended regexp) option allows you to search for multiple words. (The -E option replaces the deprecated egrep version of grep.)
This command searches for two search terms, “average” and “memfree.”
grep -E -w -i "average|memfree" geek-1.log
All of the matching lines are displayed for each of the search terms.
You can also search for multiple terms that are not necessarily whole words, although they can be whole words too.
The -e (patterns) option allows you to use multiple search terms on the command line. We are making use of the regular expression bracket feature to create a search pattern. It tells grep to match any one of the characters contained within the brackets “[]”. This means grep will match either “kB” or “KB” as it searches.
Both strings are matched, and, in fact, some lines contain both strings.
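Both approaches can be sketched against a small made-up file. The file contents are assumptions chosen so that one line contains both of the -e patterns.

```shell
workdir=$(mktemp -d)
printf 'Average: 12\nMemFree: 4096 kB\nCached: 512 KB\nSwapTotal: 0\n' > "$workdir/sample.log"

# -E with | matches lines containing either alternative.
either=$(grep -E -i "average|memfree" "$workdir/sample.log")
echo "$either"

# Two -e patterns; the bracket expression [kK] matches "kB" or "KB".
# The pattern is quoted so the shell does not treat the brackets as a glob.
multi=$(grep -c -e MemFree -e "[kK]B" "$workdir/sample.log")
echo "lines matching MemFree or kB/KB: $multi"

rm -rf "$workdir"
```

The MemFree line satisfies both -e patterns but is still counted once, because grep reports lines, not individual pattern hits.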
Exact line matching
The -x (line regexp) option will only match lines where the entire line matches the search term. Let’s search for a date and time stamp that we know appears only once in the log file:
grep -x "20-Jan-06 15:24:35" geek-1.log
The single line that matches is found and displayed.
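A quick sketch shows what -x excludes. In this invented file the timestamp appears on two lines, but only one of them consists of the timestamp and nothing else.

```shell
workdir=$(mktemp -d)
printf '20-Jan-06 15:24:35\n20-Jan-06 15:24:35 extra text\n' > "$workdir/sample.log"

# An ordinary search matches both lines.
anywhere=$(grep -c "20-Jan-06 15:24:35" "$workdir/sample.log")

# With -x the pattern must account for the whole line.
whole_line=$(grep -c -x "20-Jan-06 15:24:35" "$workdir/sample.log")

echo "anywhere: $anywhere, whole line: $whole_line"

rm -rf "$workdir"
```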
The opposite of that is showing only the lines that do not match. This can be useful when you are looking at configuration files. Comments are great, but sometimes it is hard to spot the actual settings in amongst them all. Here is the /etc/sudoers file:
We can effectively filter out the comment lines like this:
sudo grep -v "#" /etc/sudoers
That is much easier to read.
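One thing to be aware of is that -v "#" drops every line containing a “#”, including settings that carry a trailing comment. The sketch below compares it with a stricter pattern on a made-up sample.conf file; the pattern is one possible choice, not the only one.

```shell
workdir=$(mktemp -d)
printf '# top comment\nDefaults env_reset\n\n  # indented comment\nroot ALL=(ALL:ALL) ALL # trailing note\n' > "$workdir/sample.conf"

# Drops any line that contains a "#" anywhere, but keeps blank lines.
simple=$(grep -v "#" "$workdir/sample.conf")

# Drops only lines that are blank or whose first non-space character is "#".
strict=$(grep -v -E '^[[:space:]]*(#|$)' "$workdir/sample.conf")

echo "simple filter:"; echo "$simple"
echo "strict filter:"; echo "$strict"

rm -rf "$workdir"
```

The stricter pattern keeps the root line even though it ends with a comment, which the simple filter throws away.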
Show only matching text
There may be occasions when you do not want to see the entire matching line, just the matching text. The -o (only matching) option does just that.
grep -o MemFree geek-1.log
The display is reduced to showing only the text that matches the search term, instead of the entire matching line.
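Because -o prints every match on its own line, it can also be used to count individual occurrences. This is a sketch with an invented file in which one line contains the search term twice.

```shell
workdir=$(mktemp -d)
printf 'MemFree: 10 kB, MemFree: 12 kB\nMemTotal: 100 kB\n' > "$workdir/sample.log"

# Each match is printed on its own line.
only=$(grep -o MemFree "$workdir/sample.log")
echo "$only"

# One line matches, but it holds two occurrences.
matching_lines=$(grep -c MemFree "$workdir/sample.log")
occurrences=$(grep -o MemFree "$workdir/sample.log" | wc -l | tr -d ' ')
echo "lines: $matching_lines, occurrences: $occurrences"

rm -rf "$workdir"
```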
Count with grep
grep is not just about text; it can provide numerical information too. We can make grep count for us in different ways. If we want to know how many lines in a file contain a search term, we can use the -c (count) option.
grep -c average geek-1.log
grep reports that the search term appears on 240 lines in this file.
You can make grep display the line number for each matching line by using the -n (line number) option.
grep -n Jan geek-1.log
The line number for each matching line is displayed at the start of the line.
To reduce the number of results that are displayed, use the -m (max count) option. We are going to limit the output to five matching lines:
grep -m5 -n Jan geek-1.log
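The three options can be tried together on a tiny file. The dates and messages below are invented for this sketch.

```shell
workdir=$(mktemp -d)
printf '20-Jan-06 start\n21-Feb-06 idle\n22-Jan-06 stop\n23-Jan-06 start\n' > "$workdir/sample.log"

# -c prints the number of matching lines.
count=$(grep -c Jan "$workdir/sample.log")

# -n prefixes each matching line with its line number in the file.
numbered=$(grep -n Jan "$workdir/sample.log")

# -m2 stops after the first two matching lines.
limited=$(grep -m2 -n Jan "$workdir/sample.log")

echo "count: $count"
echo "$numbered"
echo "first two:"; echo "$limited"

rm -rf "$workdir"
```

The line numbers refer to positions in the original file, so the Feb line is skipped and the numbering jumps from 1 to 3.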
It is often useful to see some additional lines, possibly non-matching lines, for each matching line. This can help you distinguish which of the matching lines are the ones you are interested in.
To show some lines after the matching line, use the -A (after context) option. We are asking for three lines in this example:
grep -A 3 -x "20-Jan-06 15:24:35" geek-1.log
To see some lines from before the matching line, use the -B (before context) option.
grep -B 3 -x "20-Jan-06 15:24:35" geek-1.log
And to include lines from before and after the matching line, use the -C (context) option.
grep -C 3 -x "20-Jan-06 15:24:35" geek-1.log
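The context options are easiest to follow on a file where every line is distinct. This sketch uses a made-up seven-line file with the match in the middle.

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\nTARGET\nfive\nsix\nseven\n' > "$workdir/sample.log"

# Two lines after the match.
after=$(grep -A 2 TARGET "$workdir/sample.log")

# Two lines before the match.
before=$(grep -B 2 TARGET "$workdir/sample.log")

# One line on each side of the match.
around=$(grep -C 1 TARGET "$workdir/sample.log")

echo "after:";  echo "$after"
echo "before:"; echo "$before"
echo "around:"; echo "$around"

rm -rf "$workdir"
```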
Showing matching files
To see the names of the files that contain the search term, use the -l (files with match) option. To find out which C source code files contain references to the sl.h header file, use this command:
grep -l "sl.h" *.c
The filenames are listed, not the matching lines.
And of course, we can look for files that do not contain the search term. The -L (files without match) option does just that.
grep -L "sl.h" *.c
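Here is a sketch of both options using two throwaway source files; the file names uses.c and plain.c are invented. The dot in the pattern is escaped here so that it matches a literal “.” instead of any character, which is a slightly stricter search than the commands above.

```shell
workdir=$(mktemp -d)
printf '#include "sl.h"\nint main(void) { return 0; }\n' > "$workdir/uses.c"
printf 'int helper(void) { return 1; }\n' > "$workdir/plain.c"

# -l lists the files that contain a match.
with=$(cd "$workdir" && grep -l "sl\.h" *.c)

# -L lists the files that contain no match.
without=$(cd "$workdir" && grep -L "sl\.h" *.c)

echo "with: $with"
echo "without: $without"

rm -rf "$workdir"
```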
Start and end of lines
We can force grep to display only the matches that are at the start or the end of a line. The regular expression operator “^” matches the start of a line. Practically all of the lines in the log file will contain spaces, but we are going to search for lines that have a space as their first character:
grep "^ " geek-1.log
The lines that have a space as the first character, at the start of the line, are displayed.
To match the end of the line, use the regular expression operator “$”. We are going to search for lines that end with “00”.
grep "00$" geek-1.log
The display shows the lines that have “00” as their final characters.
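Both anchors can be sketched on a small invented file. Single quotes are used around the patterns here so that the shell leaves the “$” completely alone.

```shell
workdir=$(mktemp -d)
printf ' leading space\nno leading space\nvalue 100\nvalue 1001\n' > "$workdir/sample.log"

# ^ anchors the pattern to the start of the line.
leading=$(grep '^ ' "$workdir/sample.log")

# $ anchors the pattern to the end of the line, so "1001" does not match.
trailing=$(grep '00$' "$workdir/sample.log")

echo "starts with a space: [$leading]"
echo "ends with 00: [$trailing]"

rm -rf "$workdir"
```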
Using pipes with grep
Of course, you can pipe input into grep, pipe the output from grep into another program, and have grep nestled in the middle of a pipe chain.
Let’s say we want to see all occurrences of the string “ExtractParameters” in our C source code files. We know there are going to be quite a few, so we pipe the output into less:
grep "ExtractParameters" *.c | less
The output is presented in less.
This lets you page through the file listing and use less’s search facility.
If we pipe the output from grep into wc and use the -l (lines) option, we can count the number of lines in the source code files that contain “ExtractParameters”. (We could achieve this using the grep -c (count) option, but this is a neat way to demonstrate piping out of grep.)
grep "ExtractParameters" *.c | wc -l
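When several files are searched, the two counting approaches report differently: wc -l gives one total, while grep -c gives a count per file. The sketch below shows this with two invented source files.

```shell
workdir=$(mktemp -d)
printf 'ExtractParameters(a);\nother();\nExtractParameters(b);\n' > "$workdir/one.c"
printf 'ExtractParameters(c);\n' > "$workdir/two.c"

# Piping into wc -l yields a single total across all files.
# tr strips the padding that some versions of wc add.
total=$(cd "$workdir" && grep "ExtractParameters" *.c | wc -l | tr -d ' ')

# grep -c reports one count per file, prefixed with the file name.
per_file=$(cd "$workdir" && grep -c "ExtractParameters" *.c)

echo "total: $total"
echo "$per_file"

rm -rf "$workdir"
```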
With the next command, we pipe the output from ls into grep and pipe the output from grep into sort. We list the files in the current directory, select those with the string “Aug” in them, and sort them by file size:
ls -l | grep "Aug" | sort +4n
Let’s break that down:
ls -l: Perform a long format listing of the files using ls.
grep "Aug": Select the lines from the ls listing that have “Aug” in them. Note that this would also find files that have “Aug” in their names.
sort +4n: Sort the output from grep on the fifth column (file size). The older +4n key syntax skips the first four fields; sort -k5n is the modern equivalent.
We get a sorted list of all the files modified in August (regardless of year), in ascending order of file size.
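The pipeline can be reproduced in a scratch directory. In this sketch the file names are invented, the modification dates are forced to January so that only the file names can match “Aug”, and the -k5n form of the sort key is used. The awk step is an addition that trims each line down to the file name.

```shell
workdir=$(mktemp -d)
printf 'x' > "$workdir/small-Aug.txt"          # 1 byte
printf '%0100d' 0 > "$workdir/large-Aug.txt"   # 100 bytes
printf 'xxxxxxxxxx' > "$workdir/other.txt"     # 10 bytes, filtered out by grep

# Set every modification time to 15 January 2020 so the date column
# cannot contain "Aug", whatever month the example is run in.
touch -t 202001150000 "$workdir"/*.txt

# LC_ALL=C keeps the ls output format predictable.
# grep keeps the lines containing "Aug"; sort orders them by the size column.
order=$(LC_ALL=C ls -l "$workdir" | grep "Aug" | sort -k5n | awk '{print $NF}')
echo "$order"

rm -rf "$workdir"
```

The smaller file is listed first, and other.txt never reaches sort because grep has already discarded its line.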
grep: less a command, more of an ally
grep is a terrific tool to have at your disposal. It dates from 1974 and is still going strong because we need what it does, and nothing does it better.
Coupling grep with some regular expressions-fu really takes it to the next level.