4.2.1. What is grep?

grep searches the input files for lines containing a match to a given pattern list. When it finds a match in a line, it copies the line to standard output (by default), or whatever other sort of output you have requested with options.

Though grep expects to do the matching on text, it has no limits on input line length other than available memory, and it can match arbitrary characters within a line. If the final byte of an input file is not a newline, grep silently supplies one. Since newline is also a separator for the list of patterns, there is no way to match newline characters in a text.
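The remark about the missing final newline can be checked with a small sketch. The input string here is made up for illustration; `wc -c` counts bytes and `tr` only strips padding that some `wc` implementations add.

```shell
# "last line" is 9 bytes and has no trailing newline.
raw_bytes=$(printf 'last line' | wc -c | tr -d ' ')

# grep prints the matching line followed by a newline, so one byte is added.
grep_bytes=$(printf 'last line' | grep 'last' | wc -c | tr -d ' ')

echo "without grep: $raw_bytes bytes, through grep: $grep_bytes bytes"
```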

Some examples:

    cathy ~> grep root /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    operator:x:11:0:operator:/root:/sbin/nologin

    cathy ~> grep -n root /etc/passwd
    1:root:x:0:0:root:/root:/bin/bash
    12:operator:x:11:0:operator:/root:/sbin/nologin

    cathy ~> grep -v bash /etc/passwd | grep -v nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    news:x:9:13:news:/var/spool/news:
    mailnull:x:47:47::/var/spool/mqueue:/dev/null
    xfs:x:43:43:X Font Server:/etc/X11/fs:/bin/false
    rpc:x:32:32:Portmapper RPC user:/:/bin/false
    nscd:x:28:28:NSCD Daemon:/:/bin/false
    named:x:25:25:Named:/var/named:/bin/false
    squid:x:23:23::/var/spool/squid:/dev/null
    ldap:x:55:55:LDAP User:/var/lib/ldap:/bin/false
    apache:x:48:48:Apache:/var/www:/bin/false

    cathy ~> grep -c false /etc/passwd
    7

    cathy ~> grep -i ps ~/.bash* | grep -v history
    /home/cathy/.bashrc:PS1="\[\033[1;44m\]$USER is in \w\[\033[0m\] "

With the first command, user cathy displays the lines from /etc/passwd containing the string root.

Then she displays the same lines again, each prefixed with its line number in the file.

With the third command she checks which users are not using bash, but accounts with the nologin shell are not displayed.

Then she counts the number of accounts that have /bin/false as the shell.

The last command searches all the files in her home directory whose names start with .bash for the string “ps”, ignoring case. Matches containing the string history are excluded, so that hits from ~/.bash_history, which might contain the same string in upper or lower case, are not shown. Note that the search is for the string “ps”, and not for the command ps.
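The options used above can be tried without touching the real /etc/passwd. The following is a minimal sketch that assumes a small, made-up sample file; the account lines and variable names are illustrative only.

```shell
# Hypothetical sample data standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
operator:x:11:0:operator:/root:/sbin/nologin
rpc:x:32:32:Portmapper RPC user:/:/bin/false
named:x:25:25:Named:/var/named:/bin/false
EOF

# -n: lines containing "root", prefixed with their line numbers.
numbered=$(grep -n root "$sample")

# -v twice: lines containing neither "bash" nor "nologin".
others=$(grep -v bash "$sample" | grep -v nologin)

# -c: number of lines containing "false".
false_count=$(grep -c false "$sample")

# -i: case-insensitive search, so "RPC" and "rpc" both count as matches.
rpc_lines=$(grep -i RPC "$sample")

printf '%s\n' "$numbered" "$others" "$false_count" "$rpc_lines"
rm -f "$sample"
```

Note that -c counts matching lines, not individual occurrences of the string.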

Now let’s see what else we can do with grep, using regular expressions.

4.2.2. Grep and regular expressions

Note (if you are not on Linux): We use GNU grep in these examples, which supports extended regular expressions. GNU grep is the default on Linux systems. If you are working on proprietary systems, check with the -V option which version you are using. GNU grep can be downloaded from http://gnu.org/directory/.

4.2.2.1. Line and word anchors

Continuing the previous example, we now want to display only the lines starting with the string “root”:

    cathy ~> grep ^root /etc/passwd
    root:x:0:0:root:/root:/bin/bash

If we want to see which accounts have no shell assigned whatsoever, we search for lines ending in “:”:

    cathy ~> grep :$ /etc/passwd
    news:x:9:13:news:/var/spool/news:
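Both line anchors can be tried on a throwaway file. This sketch assumes three made-up account lines; the patterns are quoted here so the shell passes them to grep unchanged.

```shell
# Hypothetical sample data standing in for /etc/passwd.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
news:x:9:13:news:/var/spool/news:
EOF

# ^ anchors the match to the start of the line.
starts_with_root=$(grep '^root' "$sample")

# $ anchors the match to the end of the line.
ends_with_colon=$(grep ':$' "$sample")

printf '%s\n' "$starts_with_root" "$ends_with_colon"
rm -f "$sample"
```

The operator line contains “root” but does not start with it, so only the first line is selected by the first pattern.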

To check that PATH is exported in ~/.bashrc, first select the “export” lines and then search for the string “PATH” at the start of a word, so as not to display MANPATH and other possible paths:

    cathy ~> grep export ~/.bashrc | grep '\<PATH'
    export PATH="/bin:/usr/lib/mh:/lib:/usr/bin:/usr/local/bin:/usr/ucb:/usr/dbin:$PATH"

Similarly, \> matches the end of a word.
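The two word anchors can be compared side by side. This is a sketch on made-up export lines; the variable PATHEXT is invented purely to show a word that starts with PATH but does not end there.

```shell
# Hypothetical lines such as might appear in a ~/.bashrc.
sample=$(mktemp)
cat > "$sample" <<'EOF'
export PATH="/bin:/usr/bin"
export MANPATH="/usr/share/man"
export PATHEXT=".sh"
EOF

# \< : "PATH" must begin a word (PATH and PATHEXT, not MANPATH).
word_start=$(grep '\<PATH' "$sample")

# \> : "PATH" must end a word (PATH and MANPATH, not PATHEXT).
word_end=$(grep 'PATH\>' "$sample")

# Both anchors: "PATH" must be a complete word.
whole_word=$(grep '\<PATH\>' "$sample")

printf '%s\n' "$word_start" "$word_end" "$whole_word"
rm -f "$sample"
```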

If you want to find a string that is a separate word (delimited by spaces or other non-word characters), it is better to use the -w option, as in this example where we are displaying information for the root partition:

    cathy ~> grep -w / /etc/fstab
    LABEL=/ / ext3 defaults 1 1

If this option is not used, all the lines from the file system table will be displayed.
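The effect of -w can be seen by comparing match counts on a small stand-in for /etc/fstab. The entries below are made up for illustration.

```shell
# Hypothetical sample data standing in for /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
/dev/hda5 swap swap defaults 0 0
EOF

# Without -w, every line containing a slash matches.
plain_count=$(grep -c / "$sample")

# With -w, the slash must not be directly next to a letter, digit or underscore.
whole_word=$(grep -w / "$sample")

echo "plain matches: $plain_count"
echo "with -w: $whole_word"
rm -f "$sample"
```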

4.2.2.2. Character classes

A bracket expression is a list of characters enclosed by “[” and “]”. It matches any single character in that list; if the first character of the list is the caret, “^”, then it matches any character NOT in the list. For example, the regular expression “[0123456789]” matches any single digit.

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale’s collating sequence and character set. For example, in the default C locale, “[a-d]” is equivalent to “[abcd]”. Many locales sort characters in dictionary order, and in these locales “[a-d]” is typically not equivalent to “[abcd]”; it might be equivalent to “[aBbCcDd]”, for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value “C”.
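A short sketch of a range and a negated list, with LC_ALL set to “C” for the grep command only so that the range has its traditional meaning. The input lines are made up for illustration.

```shell
# In the C locale, [a-d] matches only the lowercase letters a, b, c and d.
in_range=$(printf '%s\n' a B c D e | LC_ALL=C grep '[a-d]')

# A leading caret negates the list: select lines holding at least one non-digit.
not_only_digits=$(printf '%s\n' 42 4x2 7 | LC_ALL=C grep '[^0123456789]')

printf '%s\n' "$in_range" "$not_only_digits"
```

Setting the variable in front of the command leaves the locale of the surrounding shell untouched.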

Finally, certain named classes of characters are predefined within bracket expressions. See the grep man or info pages for more information about these predefined expressions.
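As an illustration of the named classes, here is a sketch using three of the POSIX classes, [:digit:], [:upper:] and [:alpha:]. Note that the class name, brackets included, goes inside the brackets of the bracket expression. The input lines are made up.

```shell
lines=$(printf '%s\n' 'tty:x:5:' 'Mail' 'no digits here')

# Lines containing at least one digit.
with_digit=$(printf '%s\n' "$lines" | LC_ALL=C grep '[[:digit:]]')

# Lines containing at least one uppercase letter.
with_upper=$(printf '%s\n' "$lines" | LC_ALL=C grep '[[:upper:]]')

# Lines consisting of letters only, from start to end.
letters_only=$(printf '%s\n' "$lines" | LC_ALL=C grep '^[[:alpha:]]*$')

printf '%s\n' "$with_digit" "$with_upper" "$letters_only"
```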

    cathy ~> grep [yf] /etc/group
    sys:x:3:root,bin,adm
    tty:x:5:
    mail:x:12:mail,postfix
    ftp:x:50:
    nobody:x:99:
    floppy:x:19:
    xfs:x:43:
    nfsnobody:x:65534:
    postfix:x:89:

In the example, all the lines containing either a “y” or “f” character are displayed.

4.2.2.3. Wildcards

Use “.” to match any single character. For example, to get a list of all five-character English dictionary words starting with “c” and ending in “h” (handy for solving crosswords):

    cathy ~> grep '\<c...h\>' /usr/share/dict/words
    catch
    clash
    cloth
    coach
    couch
    cough
    crash
    crush

If you want to display lines containing the literal dot character, use the -F option to grep.
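The difference between the dot as a wildcard and the dot as a literal character can be sketched as follows. The word list and the two text lines are made up, so no dictionary file is needed.

```shell
# "." as a wildcard: exactly three characters between "c" and "h".
five_letters=$(printf '%s\n' catch coach church cash | grep '\<c...h\>')

lines=$(printf '%s\n' 'version 1.2' 'no dots here')

# As a pattern, "." matches any character, so both lines are counted.
any_char_count=$(printf '%s\n' "$lines" | grep -c '.')

# With -F the pattern is a fixed string, so only a real dot matches.
literal_dot=$(printf '%s\n' "$lines" | grep -F '.')

printf '%s\n' "$five_letters" "$any_char_count" "$literal_dot"
```

Escaping the dot as '\.' in an ordinary pattern is another way to match it literally.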

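To match multiple characters, use the asterisk, which matches zero or more occurrences of the preceding item; “.*” therefore matches any sequence of characters. This example selects all words starting with “c” and ending in “h” from the system’s dictionary: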

    cathy ~> grep '\<c.*h\>' /usr/share/dict/words
    caliph
    cash
    catch
    cheesecloth
    cheetah
    --output omitted--
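The same pattern can be tried on a short, made-up word list instead of the dictionary. The entry “ch” is included to show that “.*” may also match nothing at all.

```shell
words=$(printf '%s\n' cash catch cheetah chair each ch)

# A word starting with "c" and ending in "h", with anything (or nothing) in between.
c_to_h=$(printf '%s\n' "$words" | grep '\<c.*h\>')

printf '%s\n' "$c_to_h"
```

Here “chair” is rejected because it does not end in “h”, and “each” because it does not start with “c”.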

If you want to find the literal asterisk character in a file or output, use single quotes. In the example below, Cathy first tries to find the asterisk character in /etc/profile without using quotes; the shell then expands the asterisk to the names of the files in the current directory before grep sees it, and no lines are returned. Using quotes, output is generated:

    cathy ~> grep * /etc/profile
    cathy ~> grep '*' /etc/profile
    for i in /etc/profile.d/*.sh ; do
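What the shell does with an unquoted asterisk can be made visible with echo. This sketch works in a temporary directory holding a single made-up file, profile.sample, so the expansion is predictable.

```shell
# Hypothetical working directory containing exactly one file.
workdir=$(mktemp -d)
printf '%s\n' 'for i in /etc/profile.d/*.sh ; do' 'echo done' > "$workdir/profile.sample"

# Unquoted: the shell replaces * with the file names in the current directory,
# so grep would receive "profile.sample" as its pattern.
expanded=$(cd "$workdir" && echo grep * profile.sample)

# Quoted: grep receives the asterisk itself and looks for it literally.
quoted=$(grep '*' "$workdir/profile.sample")

# -F treats the whole pattern as a fixed string and gives the same result.
fixed=$(grep -F '*' "$workdir/profile.sample")

printf '%s\n' "$expanded" "$quoted" "$fixed"
rm -rf "$workdir"
```

An asterisk at the very start of a basic regular expression has nothing to repeat, which is why grep treats it as an ordinary character once the quotes protect it from the shell.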

https://tldp.org/LDP/Bash-Beginners-Guide/html/sect_04_02.html#/