Linux text search: grep and regular expressions

2025-04-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

There are three kinds of text processing tools commonly used in GNU/Linux:

grep, egrep, fgrep: text search tools

sed: stream editor, a text editing tool

awk: text report generator

Today, let's mainly talk about grep and regular expressions.

I. Overview

grep (Global search REgular expression and Print out the line) is a text search tool. It checks the target text line by line against a specified pattern (the filter criterion) and prints the lines that match.

What is a regular expression?

Regular expression: a pattern written with special characters and ordinary text characters. Some of these characters do not stand for themselves literally; they express control or wildcard functions instead.

Regular expressions fall into two categories:

Basic regular expression: BRE

Extended regular expression: ERE

Regular expression engine: a program that parses a given text using regular expression patterns.

The grep family:

grep: supports basic regular expressions (the "-E" option switches it to extended regular expressions; this is covered later)

egrep: supports extended regular expressions

fgrep: does not support regular expressions; the pattern is treated as a fixed string
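The difference between the three modes can be seen by running one pattern through each of them. The following is only a minimal sketch: it assumes GNU grep, and the two-line sample file is made up for illustration.

```shell
# Hypothetical sample file: one line of repeated letters, one with a literal plus sign.
sample=$(mktemp)
printf '%s\n' 'aab' 'a+b' > "$sample"

# BRE (plain grep): an unescaped + is an ordinary character, so only "a+b" matches.
bre_match=$(grep 'a+b' "$sample")

# ERE (grep -E): + means "one or more of the preceding item", so only "aab" matches.
ere_match=$(grep -E 'a+b' "$sample")

# Fixed string (grep -F): nothing is a metacharacter, so only "a+b" matches.
fixed_match=$(grep -F 'a+b' "$sample")

printf 'BRE: %s\nERE: %s\nfixed: %s\n' "$bre_match" "$ere_match" "$fixed_match"
rm -f "$sample"
```

The same pattern text selects different lines depending on how grep is told to read it, which is why the mode matters before any metacharacter is discussed.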

The grep command

grep [OPTIONS] PATTERN [FILE...]

Examples of common options:

-i: ignore case

    [root@localhost ~]# cat test.txt
    He love his lover.
    he like his lover.
    he love his liker.
    He like his liker.
    [root@localhost ~]# grep -i "he" test.txt
    He love his lover.
    he like his lover.
    he love his liker.
    He like his liker.

-n: display the line number of each matching line

    [root@localhost ~]# grep -ni "he" test.txt
    1:He love his lover.
    2:he like his lover.
    3:he love his liker.
    4:He like his liker.

-o: display only the matched text itself, not the whole line

    [root@localhost ~]# grep "bash$" /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    mageedu:x:1000:1000:mageedu:/home/mageedu:/bin/bash
    gentoo:x:3003:3003::/home/gentoo:/bin/bash
    ubuntu:x:3004:4004::/home/ubuntu:/bin/bash
    mint:x:3100:3100::/home/mint:/bin/bash
    hadoop:x:3102:3102::/home/hadoop:/bin/bash
    hive:x:991:986::/home/hive:/bin/bash
    linxuejing:x:3103:3103::/home/linxuejing:/bin/bash
    [root@localhost ~]# grep -o "bash$" /etc/passwd
    bash

Note: the first command displays the lines that end with "bash". Adding the "-o" option displays only the matched string itself; it is printed once for each match.

-v, --invert-match: invert the match; display the lines that do not match the pattern

    [root@localhost ~]# grep -v "bash$" /etc/passwd
    bin:x:1:1:bin:/bin:/sbin/nologin
    daemon:x:2:2:daemon:/sbin:/sbin/nologin
    adm:x:3:4:adm:/var/adm:/sbin/nologin
    lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
    sync:x:5:0:sync:/sbin:/bin/sync
    shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
    halt:x:7:0:halt:/sbin:/sbin/halt
    mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
    ...

Note: this displays the lines that do not end with "bash".

--color=auto: highlight the matched text in color (the example below is from CentOS 7, where it is enabled through an alias)

    # alias
    alias cp='cp -i'
    alias egrep='egrep --color=auto'
    alias fgrep='fgrep --color=auto'
    alias grep='grep --color=auto'
    alias l.='ls -d .* --color=auto'
    alias ll='ls -l --color=auto'
    alias ls='ls --color=auto'
    alias mv='mv -i'
    alias rm='rm -i'
    alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'

Note: on CentOS 5 or 6, you need to define it manually with alias grep='grep --color=auto'.

-q, --quiet: silent mode; does not output anything

    [root@localhost ~]# grep -q "root" /etc/passwd
    [root@localhost ~]# echo $?
    0

You can use "$?" to display the exit status of the previous command. "0" indicates success, meaning the string was matched.
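Because "-q" reports its result only through the exit status, it is mostly useful inside conditionals. A small sketch follows; the passwd-style sample file is hypothetical, so the example does not depend on the real /etc/passwd.

```shell
# Hypothetical passwd-style sample file.
sample=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'bin:x:1:1:bin:/bin:/sbin/nologin' > "$sample"

# grep -q prints nothing; the if statement tests its exit status directly.
if grep -q '^root:' "$sample"; then
    root_status="found"
else
    root_status="missing"
fi

# When no line matches, grep exits with status 1; capture it without printing anything.
nobody_exit=0
grep -q '^nobody:' "$sample" || nobody_exit=$?

echo "root: $root_status, exit status for nobody: $nobody_exit"
rm -f "$sample"
```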

-e PATTERN: match against multiple patterns; a line is displayed if it matches any of them

    [root@localhost ~]# grep -e "r..t" -e "b..h" /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    operator:x:11:0:operator:/root:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
    chrony:x:993:990::/var/lib/chrony:/sbin/nologin
    mageedu:x:1000:1000:mageedu:/home/mageedu:/bin/bash
    gentoo:x:3003:3003::/home/gentoo:/bin/bash
    ubuntu:x:3004:4004::/home/ubuntu:/bin/bash
    mint:x:3100:3100::/home/mint:/bin/bash
    hadoop:x:3102:3102::/home/hadoop:/bin/bash
    hive:x:991:986::/home/hive:/bin/bash
    linxuejing:x:3103:3103::/home/linxuejing:/bin/bash

-f FILE: FILE is a text file that contains one pattern per line; grep reads its patterns from that file

    [root@localhost ~]# vim grep.txt
    root
    bash
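A pattern file like the one above would then be passed to grep with "-f". The sketch below uses temporary files and a hypothetical three-line sample in place of grep.txt and /etc/passwd.

```shell
# Hypothetical sample data and a pattern file with one pattern per line.
sample=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'bin:x:1:1:bin:/bin:/sbin/nologin' \
              'hive:x:991:986::/home/hive:/bin/bash' > "$sample"
printf '%s\n' 'root' 'bash' > "$patterns"

# A line is selected if it matches ANY pattern in the file (same effect as -e root -e bash).
matched=$(grep -f "$patterns" "$sample")
matched_count=$(grep -c -f "$patterns" "$sample")

printf '%s\n' "$matched"
rm -f "$sample" "$patterns"
```

Here the "bin" line is skipped because it contains neither "root" nor "bash".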

-A num: also display the num lines after each matching line

    [root@localhost ~]# grep -A 1 "hadoop" /etc/passwd
    hadoop:x:3102:3102::/home/hadoop:/bin/bash
    hive:x:991:986::/home/hive:/bin/bash

-B num: also display the num lines before each matching line

    [root@localhost ~]# grep -B 1 "hadoop" /etc/passwd
    fedora:x:3101:3101::/home/fedora:/bin/csh
    hadoop:x:3102:3102::/home/hadoop:/bin/bash

-C num: also display the num lines before and after each matching line

    [root@localhost ~]# grep -C 2 "hadoop" /etc/passwd
    mint:x:3100:3100::/home/mint:/bin/bash
    fedora:x:3101:3101::/home/fedora:/bin/csh
    hadoop:x:3102:3102::/home/hadoop:/bin/bash
    hive:x:991:986::/home/hive:/bin/bash
    linxuejing:x:3103:3103::/home/linxuejing:/bin/bash
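The three context options can be compared side by side on a small file. This is a sketch only: the sample reuses five of the passwd entries shown above, written to a temporary file.

```shell
# Sample file built from the passwd entries used in the examples above.
sample=$(mktemp)
printf '%s\n' 'mint:x:3100:3100::/home/mint:/bin/bash' \
              'fedora:x:3101:3101::/home/fedora:/bin/csh' \
              'hadoop:x:3102:3102::/home/hadoop:/bin/bash' \
              'hive:x:991:986::/home/hive:/bin/bash' \
              'linxuejing:x:3103:3103::/home/linxuejing:/bin/bash' > "$sample"

# -A 1: the match plus one line after it; keep only the user name of the last line.
after_last=$(grep -A 1 'hadoop' "$sample" | tail -n 1 | cut -d: -f1)

# -B 1: the match plus one line before it; keep only the user name of the first line.
before_first=$(grep -B 1 'hadoop' "$sample" | head -n 1 | cut -d: -f1)

# -C 1: one line of context on both sides; join the user names with commas.
context_names=$(grep -C 1 'hadoop' "$sample" | cut -d: -f1 | paste -s -d, -)

echo "after: $after_last, before: $before_first, context: $context_names"
rm -f "$sample"
```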

-E: equivalent to "egrep"; enables extended regular expressions (described below)

-F: equivalent to "fgrep"; the pattern is treated as a fixed string and regular expressions are not supported

Basic regular expression metacharacters:

(1) character matching

.: match any single character

[]: match any single character in the range

[^]: matches any single character outside the range

[a-z]: all lowercase letters

[A-Z]: all uppercase letters

[0-9]: all digits

[:digit:]: all digits

[:lower:]: all lowercase letters

[:upper:]: all uppercase letters

[:alpha:]: any letter (upper and lower case)

[:alnum:]: all letters and digits

[:space:]: whitespace characters

[:punct:]: punctuation characters

[:blank:]: space and Tab

[:cntrl:]: all control characters (Ctrl+#)

[:print:]: all printable characters

These class names are used inside a bracket expression, for example [[:digit:]]. They are described in "man 7 glob".
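A short sketch of how the character classes behave in practice, assuming a made-up four-line sample file. Note the double brackets: the outer pair is the bracket expression and the inner [:name:] is the class.

```shell
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'abc' 'ABC' '123' 'a1 b2' > "$sample"

# Lines that contain at least one digit: "123" and "a1 b2".
with_digit=$(grep -c '[[:digit:]]' "$sample")

# [^...] negates the class: lines made up entirely of non-digits: "abc" and "ABC".
without_digit=$(grep -c '^[^[:digit:]]*$' "$sample")

# Lines consisting only of uppercase letters.
upper_only=$(grep '^[[:upper:]]*$' "$sample")

# Lines that contain a space or a tab.
with_blank=$(grep -c '[[:blank:]]' "$sample")

echo "digit: $with_digit, no digit: $without_digit, upper: $upper_only, blank: $with_blank"
rm -f "$sample"
```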

Example:

.: find the lines in the /etc/passwd file that contain "r", then any two characters, then "t"

    [root@localhost ~]# grep "r..t" /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    operator:x:11:0:operator:/root:/sbin/nologin
    ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

[]: find the lines in the /tmp/meminfo file that begin with S or s

    [root@localhost ~]# grep "^[Ss]" /tmp/meminfo
    SwapCached: 0 kB
    SwapTotal: 2098172 kB
    SwapFree: 2098172 kB
    Shmem: 13128 kB
    Slab: 234936 kB
    SReclaimable: 193688 kB
    SUnreclaim: 41248 kB
    swap

[a-z]: find the lines that contain "i" followed by two letters, ignoring case

    [root@localhost ~]# ifconfig | grep -i "i[a-z][a-z]"
    eno16777736: flags=4163 mtu 1500
    inet 192.168.254.130 netmask 255.255.255.0 broadcast 192.168.254.255
    inet6 fe80::20c:29ff:fe8e:eb7c prefixlen 64 scopeid 0x20
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    lo: flags=73 mtu 65536
    inet 127.0.0.1 netmask 255.0.0.0
    inet6 ::1 prefixlen 128 scopeid 0x10
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

[[:alpha:]]: find the lines in the file that begin with a letter

    [root@localhost ~]# grep "^[[:alpha:]]" /tmp/grub2.cfg
    set pager=1
    LH
    if [ -s $prefix/grubenv ]; then
    fi
    if [ "${next_entry}" ]; then
    else
    fi
    if [ x"${feature_menuentry_id}" = xy ]; then
    else
    fi
    ...

(2) number of matches:

*: match the preceding character any number of times (including 0 times)

    [root@localhost ~]# grep "x*y" test2.txt
    abcy
    yabcd
    xxxyabc
    abgdsfy
    asdy

\+: match the preceding character at least once

    [root@localhost ~]# grep "x\+y" test2.txt
    xxxyabc

.*: any characters of any length

    [root@localhost ~]# grep "x.*y" test2.txt
    xxxyabc
    xjgbdfg,n,gnjgy9

\?: match the preceding character 0 or 1 times; that is, the preceding character is optional

    [root@localhost ~]# grep "x\?y" test2.txt
    abcy
    yabcd
    xxxyabc
    abgdsfy
    asdy
    xjgbdfg,n,gnjgy

\{m\}: the preceding character appears exactly m times; m is a non-negative integer

    [root@localhost ~]# grep "x\{2\}" test2.txt
    xxxyabc

\{m,n\}: the preceding character appears at least m times and at most n times

\{m,\}: at least m times, with no upper limit

\{0,n\}: at most n times

    [root@localhost ~]# grep "x\{1,3\}y" test2.txt
    xxxyabc
    [root@localhost ~]# grep "x\{3,\}y" test2.txt
    xxxyabc
    # display the lines that contain "i" followed by 3 letters
    [root@localhost ~]# ifconfig | grep "i[[:alpha:]]\{3\}"
    inet 192.168.254.130 netmask 255.255.255.0 broadcast 192.168.254.255
    inet6 fe80::20c:29ff:fe8e:eb7c prefixlen 64 scopeid 0x20
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
    inet 127.0.0.1 netmask 255.0.0.0
    inet6 ::1 prefixlen 128 scopeid 0x10
    TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
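The quantifiers are easiest to compare when they are all run against the same input. The sketch below assumes GNU grep (in basic regular expressions, "\+" and "\?" are GNU extensions) and uses a hypothetical four-line file in the spirit of test2.txt.

```shell
# Hypothetical sample file with 0, 1, 3 and 5 leading x characters before a y.
sample=$(mktemp)
printf '%s\n' 'abcy' 'xy' 'xxxyabc' 'xxxxxy' > "$sample"

# x*y: zero or more x, so every line that contains a y matches.
star=$(grep -c 'x*y' "$sample")

# x\+y: at least one x directly before the y.
plus=$(grep -c 'x\+y' "$sample")

# x\{3\}y unanchored: any three x before y, so "xxxxxy" matches too.
exact_unanchored=$(grep -c 'x\{3\}y' "$sample")

# ^x\{3\}y: anchoring the start makes "exactly three" apply to the whole run.
exact_anchored=$(grep -c '^x\{3\}y' "$sample")

# x\{4,\}y: four or more x before the y.
at_least_four=$(grep -c 'x\{4,\}y' "$sample")

echo "$star $plus $exact_unanchored $exact_anchored $at_least_four"
rm -f "$sample"
```

The unanchored and anchored counts differ because a quantifier only constrains the part of the line that the pattern consumes; extra x characters to the left are simply outside the match.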

(3) position anchoring

Position anchors restrict where in the target text the pattern is allowed to match.

^: anchors the start of the line; placed at the far left of the pattern, ^PATTERN

    [root@localhost ~]# grep "^root" /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    # displays the lines that begin with the string "root"

$: anchors the end of the line; placed at the far right of the pattern, PATTERN$

^PATTERN$: the pattern must match the entire line

^$: empty line; it contains no characters at all, not even spaces or tabs

^[[:space:]]*$: a line that is empty or contains only whitespace characters; the example counts how many such lines there are

    [root@localhost ~]# grep "^[[:space:]]*$" /etc/grub2.cfg | wc -l
    17
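As a rough illustration of the line anchors, the following sketch counts matches in a hypothetical four-line file that contains one truly empty line and one line made of spaces only.

```shell
# Hypothetical sample: "root", an empty line, a line of three spaces, "root here".
sample=$(mktemp)
printf '%s\n' 'root' '' '   ' 'root here' > "$sample"

# ^root: lines that begin with "root".
starts_with_root=$(grep -c '^root' "$sample")

# ^root$: lines that consist of "root" and nothing else.
only_root=$(grep -c '^root$' "$sample")

# ^$: truly empty lines; the line of spaces does not count.
empty_lines=$(grep -c '^$' "$sample")

# ^[[:space:]]*$: empty lines plus lines that hold only whitespace.
blank_lines=$(grep -c '^[[:space:]]*$' "$sample")

echo "$starts_with_root $only_root $empty_lines $blank_lines"
rm -f "$sample"
```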

Word anchors:

Word: a run of consecutive non-special characters (letters, digits and underscores) is called a word

\< or \b: anchors the start of a word; placed on the left side of the word pattern, \<PATTERN

\> or \b: anchors the end of a word; placed on the right side of the word pattern, PATTERN\>

\<PATTERN\>: matches a whole word

    [root@localhost ~]# cat test3.txt
    root
    rootls
    rootnkll
    openroot
    [root@localhost ~]# grep "\<root\>" test3.txt
    root

    [root@localhost ~]# grep "bash\>" /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    mageedu:x:1000:1000:mageedu:/home/mageedu:/bin/bash
    gentoo:x:3003:3003::/home/gentoo:/bin/bash
    ubuntu:x:3004:4004::/home/ubuntu:/bin/bash
    mint:x:3100:3100::/home/mint:/bin/bash
    hadoop:x:3102:3102::/home/hadoop:/bin/bash
    hive:x:991:986::/home/hive:/bin/bash
    linxuejing:x:3103:3103::/home/linxuejing:/bin/bash
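The effect of each word anchor can be checked separately. This sketch assumes GNU grep, where "\<" and "\>" are available, and uses four made-up words as sample input; the "-w" option shown at the end is GNU grep's shorthand for matching whole words.

```shell
# Hypothetical sample words, one per line.
sample=$(mktemp)
printf '%s\n' 'root' 'rootls' 'rootnkll' 'openroot' > "$sample"

# \<root: "root" must begin a word.
word_start=$(grep -c '\<root' "$sample")

# root\>: "root" must end a word.
word_end=$(grep -c 'root\>' "$sample")

# \<root\>: "root" must be a complete word.
whole_word=$(grep '\<root\>' "$sample")

# -w applies the same whole-word restriction without writing the anchors.
whole_word_w=$(grep -w 'root' "$sample")

echo "start: $word_start, end: $word_end, whole: $whole_word, -w: $whole_word_w"
rm -f "$sample"
```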

Exercises:

Find the lines in the output of the "netstat -tan" command that end with "LISTEN" followed by 0 or more whitespace characters

Find the file paths in the output of the "ldd /usr/bin/ls" command

Find the lines in the /etc/grub2.cfg file that start with at least one whitespace character followed by a non-whitespace character
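One possible way to approach these exercises is sketched below. These are not the only answers. To keep the sketch self-contained, the real commands and files are replaced by short hypothetical sample lines fed in through printf; on a real system the same patterns would be applied to "netstat -tan", "ldd /usr/bin/ls" and /etc/grub2.cfg.

```shell
# Exercise 1: lines ending in LISTEN plus optional trailing whitespace (sample netstat-like lines).
listen_count=$(printf '%s\n' 'tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN   ' \
                             'tcp 0 0 10.0.0.5:22 10.0.0.1:52000 ESTABLISHED' \
               | grep -c 'LISTEN[[:space:]]*$')

# Exercise 2: print only the path, taken here as a slash followed by non-whitespace characters.
ldd_path=$(printf '%s\n' '        libc.so.6 => /lib64/libc.so.6 (0x00007f0000000000)' \
           | grep -o '/[^[:space:]]*')

# Exercise 3: at least one leading whitespace character, then a non-whitespace character.
indented=$(printf '%s\n' 'menuentry "Linux" {' '    linux /vmlinuz' '}' \
           | grep '^[[:space:]]\{1,\}[^[:space:]]')

echo "listen lines: $listen_count"
echo "path: $ldd_path"
echo "indented: $indented"
```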

(4) grouping and back-references

\(PATTERN\): treats the characters matched by PATTERN as a single unit

Note: the text matched by the pattern inside each group is automatically recorded by the regular expression engine in internal variables named \1, \2, \3, and so on.

\n: the text matched by the pattern between the nth left parenthesis and its matching right parenthesis

Example:

Back-reference: refers to the string that the pattern in an earlier pair of parentheses matched

\1 followed by "r": the text matched by the first group must appear again, immediately followed by "r", for the line to match
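A back-reference can be tried on the four lines of test.txt shown earlier. The pattern below is an illustrative choice, not necessarily the one the original example used: the group captures a four-letter string of the form "l..e", and "\1r" requires the same string again followed by "r".

```shell
# The contents of test.txt from the -i example, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'He love his lover.' \
              'he like his lover.' \
              'he love his liker.' \
              'He like his liker.' > "$sample"

# \(l..e\) captures "love" or "like"; \1r demands that same text again, followed by "r".
matches=$(grep '\(l..e\).*\1r' "$sample")
match_count=$(grep -c '\(l..e\).*\1r' "$sample")

printf '%s\n' "$matches"
rm -f "$sample"
```

Only the lines where the verb and the noun share the same stem ("love ... lover", "like ... liker") are selected; "like ... lover" fails because \1 must repeat exactly what the group matched.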

II. Extended regular expressions

egrep: supports extended regular expressions; equivalent to "grep -E"

Metacharacters of extended regular expressions: most are the same as the basic ones above; the examples below illustrate the differences

Character matching:

.: any single character

[]: any single character in the range

[^]: any single character outside the range

Number of matches:

*: any number of times

?: 0 or 1 time

    [root@localhost ~]# egrep "x?y" test2.txt
    abcy
    yabcd
    xxxxxxxxyabc
    abgdsfy
    asdy
    xjgbdfg,n,gnjgy

+: match the preceding character one or more times

    [root@localhost ~]# egrep "x+y" test2.txt
    xxxxxxxxyabc

{m}: match exactly m times

    [root@localhost ~]# egrep "x{5}y" test2.txt
    xxxxxxxxyabc
    # matches the preceding character exactly 5 times

{m,n}: at least m times, at most n times

    [root@localhost ~]# egrep "x{3,6}y" test2.txt
    xxxxxxxxyabc
    # matches the preceding character at least 3 times and at most 6 times
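The extended quantifiers can be checked against their basic counterparts with a sketch like the following. It assumes GNU grep and a hypothetical sample file; it also shows why a line with eight x characters still matches "x{5}y": the pattern is not anchored, so it only needs to find five x characters directly before the y.

```shell
# Hypothetical sample file with 0, 2, 5 and 8 x characters before a y.
sample=$(mktemp)
printf '%s\n' 'abcy' 'xxy' 'xxxxxy' 'xxxxxxxxyabc' > "$sample"

# ERE quantifiers need no backslashes.
optional=$(grep -E -c 'x?y' "$sample")       # every line with a y
one_or_more=$(grep -E -c 'x+y' "$sample")    # at least one x before the y
five=$(grep -E -c 'x{5}y' "$sample")         # any five x before the y
five_anchored=$(grep -E -c '^x{5}y' "$sample")  # exactly five x from the line start

# The same interval written in BRE and in ERE selects the same lines.
range_ere=$(grep -E -c 'x{3,6}y' "$sample")
range_bre=$(grep -c 'x\{3,6\}y' "$sample")

echo "$optional $one_or_more $five $five_anchored $range_ere $range_bre"
rm -f "$sample"
```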

Position anchors:

^: start of the line

$: end of the line

\<, \b: start of a word

\>, \b: end of a word

Grouping and back-references

(pattern): grouping; the text matched by the pattern in parentheses is recorded in internal variables of the regular expression engine

|: or. For example, "C|cat" means C or cat; "ls|LS" means ls or LS
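The interaction between alternation and grouping is worth a quick sketch, because "|" applies to everything on each side of it unless parentheses limit its scope. The sample words below are made up for illustration.

```shell
# Hypothetical sample words, one per line.
sample=$(mktemp)
printf '%s\n' 'C' 'cat' 'Cat' 'ls' 'LS' 'Ls' > "$sample"

# C|cat: either the single letter C or the string cat, anywhere in the line.
c_or_cat=$(grep -E -c 'C|cat' "$sample")

# ^(C|c)at$: the group limits the choice to the first letter only.
cat_any_case=$(grep -E -c '^(C|c)at$' "$sample")

# ls|LS: all lowercase or all uppercase; the mixed-case "Ls" is not matched.
ls_or_LS=$(grep -E -c 'ls|LS' "$sample")

# ^(l|L)s$: only the first letter may vary, so "ls" and "Ls" match but "LS" does not.
first_letter_varies=$(grep -E -c '^(l|L)s$' "$sample")

echo "$c_or_cat $cat_any_case $ls_or_LS $first_letter_varies"
rm -f "$sample"
```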

This is my first blog post and the writing is not very polished; advice and corrections are welcome.
