Shell script regular expressions, one of the three Musketeers (grep, egrep)


2025-07-06 Update. From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)06/02 Report--

Part one. Regular expressions, one of the three Musketeers: grep

1. Before learning regular expressions, create a throwaway configuration file to practice on

[root@localhost ~]# vim chen.txt
#version=DEVEL
# System authorization information
auth --enableshadow --passalgo=sha512
# Use CDROM installation media
cdrom
thethethe
THE
THEASDHAS
# Use graphical install
graphical
# Run the Setup Agent on first boot
firstboot --enable
ignoredisk --only-use=sda
wood
wd
wod
woooooooood
124153
3234
342222222
faasd11
2
ZASASDNA
short
shirt

2. Find specific characters

"-v" inverts the selection. To find the lines that do not contain the string "the", use grep's "-v" option, written here as "-vn" together with "-n".

"-n" displays the line number of each matching line.

"-i" makes the match case-insensitive.

After the command runs, the characters that match are highlighted in red.

[root@localhost ~]# grep -n 'the' chen.txt
6:thethethe
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -in 'the' chen.txt
6:thethethe
7:THE
8:THEASDHAS
11:# Run the Setup Agent on first boot
[root@localhost ~]# grep -vn 'the' chen.txt
1:#version=DEVEL
2:# System authorization information
3:auth --enableshadow --passalgo=sha512
4:# Use CDROM installation media
5:cdrom
7:THE
8:THEASDHAS
9:# Use graphical install
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:124153
19:3234
20:342222222
21:faasd11
22:2
23:ZASASDNA
24:short
25:shirt
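The three options can also be tried on a tiny throwaway input. This is only a sketch: the sample_options function and its three lines are made up for illustration and are not the article's chen.txt.

```shell
# Hypothetical three-line input, printed by a helper function.
sample_options() {
  printf '%s\n' 'thethethe' 'THE' 'wood'
}

sample_options | grep -n 'the'    # line numbers of lines containing "the"
sample_options | grep -in 'the'   # the same search, ignoring case
sample_options | grep -vn 'the'   # lines that do NOT contain "the"
```

With this input, the first command should report only line 1, the second lines 1 and 2, and the third lines 2 and 3.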

3. Use square brackets "[]" to match a set of characters

The strings "shirt" and "short" both begin with "sh" and end with "rt", so a single command can find both. No matter how many characters appear inside "[]", the brackets stand for exactly one character: "[io]" matches either "i" or "o".

[root@localhost ~]# grep -n 'sh[io]rt' chen.txt    # match "short" or "shirt" via the [io] set
24:short
25:shirt
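A quick way to check that a bracket set stands for exactly one character is to feed grep a few near misses. The word list below is hypothetical and chosen only for this sketch.

```shell
# Hypothetical word list: two real matches and two near misses.
sample_words() {
  printf '%s\n' 'short' 'shirt' 'shart' 'shiort'
}

# [io] stands for a single character, so "shiort" (two characters
# between "sh" and "rt") and "shart" (wrong character) do not match.
sample_words | grep -n 'sh[io]rt'
```

Only lines 1 and 2 should be printed.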

To find the repeated character "oo", run the following command.

[root@localhost ~]# grep -n 'oo' chen.txt
11:# Run the Setup Agent on first boot
12:firstboot --enable
14:wood
17:woooooooood

To find "oo" that is not preceded by "w", use the negated set "[^]". For example, the command "grep -n '[^w]oo' test.txt" finds the strings in test.txt in which "oo" is not preceded by "w".

[root@localhost ~]# grep -n '[^w]oo' chen.txt    # match "oo" not preceded by "w"
11:# Run the Setup Agent on first boot
12:firstboot --enable
17:woooooooood

The output shows that strings such as "woood" and "woooooooood" also match, even though they contain "w". The highlighting explains why: in "woood" the highlighted part is "ooo", because the "o" in front of the final "oo" is itself a character other than "w", so the rule is satisfied. "woooooooood" matches for the same reason.
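To see exactly which characters the pattern matched without relying on colored output, grep's "-o" option prints only the matched part of each line. The option is not used in the article; it is available in GNU and BSD grep, and the word list here is made up for this sketch.

```shell
# Hypothetical inputs: "wood" has no non-"w" character in front of "oo",
# while "woood" and "boot" do.
sample_negated() {
  printf '%s\n' 'wood' 'woood' 'boot'
}

# -o prints only the text that matched '[^w]oo'.
sample_negated | grep -o '[^w]oo'
```

This should print "ooo" (from "woood") and "boo" (from "boot"), and nothing for "wood".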

If "oo" must not be preceded by a lowercase letter, use the command "grep -n '[^a-z]oo' test.txt", where "a-z" stands for the lowercase letters; the uppercase letters are written "A-Z".

[root@localhost ~]# grep -n '[^a-z]oo' chen.txt
19:Foofddd

To find lines that contain digits, use the command "grep -n '[0-9]' test.txt".

[root@localhost ~]# grep -n '[0-9]' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2
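Range expressions such as "a-z" and "0-9" can be tried in the same way. One caveat that the article does not mention: the meaning of a range can depend on the locale, so this sketch pins the C locale with LC_ALL=C. The sample lines are borrowed from the article's output, but the helper function is hypothetical.

```shell
# Hypothetical helper printing four lines that appear in the article's file.
sample_ranges() {
  printf '%s\n' 'Foofddd' 'wood' 'faasd11' 'ZASASDNA'
}

# "oo" preceded by something other than a lowercase letter.
sample_ranges | LC_ALL=C grep -n '[^a-z]oo'

# Lines containing at least one digit.
sample_ranges | LC_ALL=C grep -n '[0-9]'
```

The first command should print only "1:Foofddd" and the second only "3:faasd11".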

Find the start of a line "^" and the end of a line "$"

[root@localhost ~]# grep -n '^the' chen.txt
6:thethethe

Lines that start with a lowercase letter can be filtered with the "^[a-z]" rule.

[root@localhost ~]# grep -n '^[a-z]' chen.txt
3:auth --enableshadow --passalgo=sha512
5:cdrom
6:thethethe
10:graphical
12:firstboot --enable
13:ignoredisk --only-use=sda
14:wood
15:wd
16:wod
17:woooooooood
18:dfsjdjoooooof
23:faasd11
26:short
27:shirt

Find lines that start with an uppercase letter

[root@localhost ~]# grep -n '^[A-Z]' chen.txt
7:THE
8:THEASDHAS
19:Foofddd
25:ZASASDNA

To find lines that do not start with a letter, use the "^[^a-zA-Z]" rule.

[root@localhost ~]# grep -n '^[^a-zA-Z]' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
9:# Use graphical install
11:# Run the Setup Agent on first boot
20:124153
21:3234
22:342222222
24:2

The "^" symbol means different things inside and outside the "[]" set: inside "[]" it negates the set, and outside "[]" it anchors the match to the start of the line. Conversely, to find lines that end with a particular character, use the "$" anchor. For example, the following command finds lines that end with a period (.). Because the period is itself a metacharacter in regular expressions (discussed below), the escape character "\" is needed to turn it into an ordinary character.

[root@localhost ~]# grep -n '\.$' chen.txt
5:cdrom.
6:thethethe.
9:# Use graphical install.
10:graphical.
11:# Run the Setup Agent on first boot.

To find blank lines, run "grep -n '^$' chen.txt".
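The anchors and the escaped period can be checked together on a small input. The five lines below (one of them empty) are hypothetical and exist only for this sketch; LC_ALL=C is an added precaution so that the letter ranges behave the same in every locale.

```shell
# Hypothetical input; the third line is intentionally empty.
sample_anchors() {
  printf '%s\n' 'the end.' 'not the start' '' '# comment' '3.14'
}

sample_anchors | grep -n '^the'                  # "the" at the start of a line
sample_anchors | grep -n '\.$'                   # a literal period at the end of a line
sample_anchors | LC_ALL=C grep -n '^[^a-zA-Z]'   # first character is not a letter
sample_anchors | grep -n '^$'                    # blank lines
```

The first two commands should both print line 1, the third lines 4 and 5, and the last one "3:" for the empty line. Note that the empty line is not reported by '^[^a-zA-Z]', because a bracket set always has to match one character.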

Find any character "." and the repetition character "*"

The period (.) in a regular expression is a metacharacter that stands for any single character. For example, the following command finds four-character strings that begin with w and end with d.

[root@localhost ~]# grep -n 'w..d' chen.txt
14:wood

In the result above, the string "wood" matches the rule "w..d". To find oo, ooo, ooooo, and so on, use the asterisk (*) metacharacter. Note, however, that "*" means zero or more repetitions of the preceding single character. "o*" matches zero "o" characters (that is, the empty string) or one or more "o". Because the empty string is allowed, the command "grep -n 'o*' test.txt" prints every line of the file. With "oo*", the first o must be present and the second o may appear zero or more times, so any line that contains o, oo, ooo, and so on matches. Similarly, to find strings that contain at least two o, run the command "grep -n 'ooo*' test.txt".

[root@localhost ~]# grep -n 'ooo*' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd
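Counting matching lines with "-c" makes the difference between "o*", "oo*" and "ooo*" easy to see. The four-word input is hypothetical and used only for this sketch.

```shell
# Hypothetical input with zero, one, two and three "o" characters.
sample_star() {
  printf '%s\n' 'wd' 'wod' 'wood' 'woood'
}

sample_star | grep -c 'o*'     # every line counts: "o*" also matches the empty string
sample_star | grep -c 'oo*'    # lines with at least one "o"
sample_star | grep -c 'ooo*'   # lines with at least two consecutive "o"
```

The three counts should be 4, 3 and 2.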

To find strings that begin with w, end with d, and have at least one o in between, run the following command.

[root@localhost ~]# grep -n 'woo*d' chen.txt
14:wood
16:wod
17:woooooooood

To find strings that begin with w and end with d, with any characters (or none at all) in between, run the following command.

[root@localhost ~]# grep -n 'w.*d' chen.txt
14:wood
15:wd
16:wod
17:woooooooood
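The two patterns differ only in what may stand between "w" and "d", which a side-by-side run makes visible. The word list is hypothetical and chosen for this sketch.

```shell
# Hypothetical input: "word" has a non-"o" character between w and d.
sample_between() {
  printf '%s\n' 'wd' 'wod' 'wood' 'word' 'bird'
}

sample_between | grep -n 'woo*d'   # w, one or more o, then d
sample_between | grep -n 'w.*d'    # w, any characters (or none), then d
```

'woo*d' should print lines 2 and 3 only, while 'w.*d' should print lines 1 through 4; "bird" matches neither because it contains no "w".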

Find lines that contain a number of any length.

[root@localhost ~]# grep -n '[0-9][0-9]*' chen.txt
3:auth --enableshadow --passalgo=sha512
20:124153
21:3234
22:342222222
23:faasd11
24:2

Find a range of consecutive characters "{}"

Using "." together with "*" allows anywhere from zero to an unlimited number of repeated characters. What if the number of repetitions should be limited to a range? For example, to find three to five consecutive o characters, use the interval characters "{}" of basic regular expressions. In a basic regular expression the braces must be written with the escape character "\", as "\{" and "\}"; an unescaped "{" is treated there as an ordinary character.

Find lines with two consecutive o characters

[root@localhost ~]# grep -n 'o\{2\}' chen.txt
11:# Run the Setup Agent on first boot.
12:firstboot --enable
14:wood
17:woooooooood
18:dfsjdjoooooof
19:Foofddd

Find strings that begin with w and end with d, with 2 to 5 o in between.

[root@localhost ~]# grep -n 'wo\{2,5\}d' chen.txt
14:wood

Find strings that begin with w and end with d, with 2 or more o in between.

[root@localhost ~]# grep -n 'wo\{2,\}d' chen.txt
14:wood
17:woooooooood
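The three interval forms can be compared on one small input. The helper function is hypothetical; "woooooooood" is the long string from the article's file, and the other words are made up so that each count falls on a different side of the limits.

```shell
# Hypothetical input with one, two, four and many "o" characters.
sample_interval() {
  printf '%s\n' 'wod' 'wood' 'wooood' 'woooooooood'
}

sample_interval | grep -c 'o\{2\}'       # lines containing two consecutive o
sample_interval | grep -n 'wo\{2,5\}d'   # w, two to five o, then d
sample_interval | grep -n 'wo\{2,\}d'    # w, two or more o, then d
```

The first command should count 3 lines. 'wo\{2,5\}d' should print lines 2 and 3 but not the long string, because "w" and "d" pin both ends of the run of o. 'wo\{2,\}d' should print lines 2, 3 and 4.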

Part two. Extended regular expressions

To simplify a command, you sometimes need the wider syntax of extended regular expressions. For example, to use basic regular expressions to list the lines of a file other than blank lines and lines beginning with "#" (commonly done to view the settings that are in effect in a configuration file), you run "grep -v '^$' test.txt | grep -v '^#'". This needs a pipe so that the text is filtered twice. With an extended regular expression, the command can be shortened to "egrep -v '^$|^#' test.txt", where the pipe symbol inside the single quotation marks means "or".
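The two approaches can be compared on a made-up configuration fragment. The settings below are hypothetical, and the sketch uses "grep -E", which is equivalent to egrep; recent GNU grep releases print an obsolescence warning when egrep is invoked, so "grep -E" is the safer spelling in scripts.

```shell
# Hypothetical configuration text with comments and blank lines.
sample_conf() {
  printf '%s\n' '# a comment' '' 'auth=on' '' '# another comment' 'port=22'
}

# Basic regular expressions: two filters joined by a pipe.
sample_conf | grep -v '^$' | grep -v '^#'

# Extended regular expression: one filter using "|" as "or".
sample_conf | grep -Ev '^$|^#'
```

Both commands should print the same two lines, "auth=on" and "port=22".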

In addition, grep uses basic regular expressions by default; to use extended regular expressions, use the egrep command (equivalent to "grep -E") or the awk command. The awk command is explained in a later section, so egrep is used here. egrep is used in much the same way as grep. It searches files for a pattern: it can look for any string or symbol in one file or in several. The search pattern can be a single character, a string, a word, or a sentence.

Common extended regular expression metacharacters include the following:

"+" matches the preceding character one or more times. Example: the command "egrep -n 'wo+d' test.txt" finds strings such as "wood", "woood" and "woooooood".

[root@localhost ~]# egrep -n 'wo+d' chen.txt
14:wood
16:wod
17:woooooooood

"?" matches the preceding character zero or one time. Example: the command "egrep -n 'bes?t' test.txt" finds the two strings "bet" and "best".

[root@localhost ~]# egrep -n 'bes?t' chen.txt
11:best
12:bet

"|" matches any one of several alternatives. Example: the command "egrep -n 'of|is|on' test.txt" finds the strings "of", "is", or "on".

[root@localhost ~]# egrep -n 'of|is|on' chen.txt
1:#version=DEVEL
2:# System authorization information
4:# Use CDROM installation media
13:# Run the Setup Agent on first boot.
15:ignoredisk --only-use=sda
20:dfsjdjoooooof
21:Foofddd

"()" groups characters. Example: "egrep -n 't(a|e)st' test.txt". The words "tast" and "test" share "t" and "st", so "a" and "e" are listed inside "()" and separated by "|" to find either "tast" or "test".

[root@localhost ~]# egrep -n 't(a|e)st' chen.txt
12:test
13:tast

"()+" matches a group one or more times. Example: "egrep -n 'A(xyz)+C' test.txt" finds strings that begin with "A" and end with "C", with one or more "xyz" in between.

[root@localhost ~]# egrep -n 'A(xyz)+C' chen.txt
14:AxyzxyzxyzC
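All five metacharacters can be exercised on one small word list. The list is hypothetical and includes a few near misses; the sketch uses "grep -E", which is equivalent to egrep.

```shell
# Hypothetical word list with matches and near misses for each pattern.
sample_ere() {
  printf '%s\n' 'wd' 'wod' 'wood' 'bet' 'best' 'besst' \
                'test' 'tast' 'tist' 'AC' 'AxyzC' 'AxyzxyzC'
}

sample_ere | grep -En 'wo+d'       # one or more o between w and d
sample_ere | grep -En 'bes?t'      # optional s between "be" and "t"
sample_ere | grep -En 'of|is|on'   # any of the three alternatives
sample_ere | grep -En 't(a|e)st'   # "tast" or "test"
sample_ere | grep -En 'A(xyz)+C'   # A, one or more "xyz", then C
```

Near misses such as "wd", "besst" and "AC" should not appear in the output: "+" needs at least one occurrence, "?" allows at most one, and "(xyz)+" needs the whole group at least once. The alternation 'of|is|on' should match only "tist", which contains "is".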
