
How to use regular expressions under Linux

2025-02-22 Update From: SLTechnology News&Howtos


This article introduces how regular expressions are used under Linux. It is meant as a practical reference for interested readers; the sections below walk through the details step by step.

Preface

Regular expressions are widely used: most programming languages support them, and they are just as useful on Linux.

With a regular expression you can filter out exactly the text you need, then combine it with a tool or language that supports regular expressions to complete the task.

In this post we use grep/egrep to apply regular expressions. Tools such as sed work too, but using sed already requires knowing regular expressions, so regular expressions are covered here first and sed in a following article. Readers who need both can read the two articles together.

Types of regular expressions

Regular expressions are implemented by a regular expression engine, the underlying software that interprets regular expression patterns and uses them to match text.

On Linux, the two common regular expression engines are:

- POSIX basic regular expression (BRE) engine

- POSIX extended regular expression (ERE) engine
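The difference between the two engines shows up as soon as a pattern contains a character such as "+". The following is a minimal sketch, assuming a grep that follows POSIX (such as GNU grep); the two input lines are made up for illustration.

```shell
# In a basic regular expression (BRE), "+" is an ordinary character,
# so 'a+' matches only the literal text "a+".
bre_count=$(printf 'aaa\na+b\n' | grep -c 'a+')

# In an extended regular expression (ERE, grep -E), "+" means
# "one or more of the preceding character", so both lines match.
ere_count=$(printf 'aaa\na+b\n' | grep -Ec 'a+')

echo "BRE matches: $bre_count, ERE matches: $ere_count"
```

The same pattern text therefore selects different lines depending on which engine interprets it, which is why the sections below state whether grep or egrep is being used.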

Basic use of basic regular expressions

Preparing the sample text

[root@service99 ~]# mkdir /opt/regular
[root@service99 ~]# cd /opt/regular
[root@service99 regular]# pwd
/opt/regular
[root@service99 regular]# cp /etc/passwd temp_passwd

Plain text

Plain text matches the corresponding characters exactly. Note that regular expression patterns are strictly case-sensitive.

# grep --color highlights the matched text, which makes the effect easy to see
[root@service99 regular]# grep --color "root" temp_passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

A pattern is not limited to complete words: if the defined text appears anywhere in the data stream, the regular expression matches.

[root@service99 regular]# ifconfig eth2 | grep --color "add"
eth2      Link encap:Ethernet  HWaddr 54:52:01:01:99:02
          inet addr:192.168.2.99  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::5652:1ff:fe01:9902/64 Scope:Link

Nor is a pattern limited to a single word: the text string may also contain spaces and digits.

[root@service99 regular]# echo "This is line number 1" | grep --color "ber 1"
This is line number 1
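The two properties of plain-text patterns described above, substring matching and case sensitivity, can be checked with a short sketch. It reuses the sample sentence from the example; the variable names are only illustrative.

```shell
line="This is line number 1"

# A pattern may match anywhere in the line, across spaces and digits.
substring_hits=$(echo "$line" | grep -c "ber 1")

# Matching is case-sensitive: lowercase "this" does not match "This".
# grep exits non-zero when nothing matches, hence the "|| true".
wrong_case_hits=$(echo "$line" | grep -c "this" || true)

echo "substring: $substring_hits, wrong case: $wrong_case_hits"
```

Adding -i to the second command would make it match, since -i tells grep to ignore case.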

Special characters

There is one thing to watch out for when using text strings in regular expression patterns.

Several characters have a special meaning inside a regular expression. If you use them in a text string as if they were ordinary characters, you may not get the result you expect.

Special characters recognized by regular expressions:

. * [ ] ^ $ { } + ? | ( )

To use one of these special characters as an ordinary text character, you must escape it: put a special character in front of it that tells the regular expression engine to interpret the next character literally.

The character that does this is the backslash "\".

[root@service99 regular]# echo "This cat is $4.99"
This cat is .99
# Double quotes do not protect $: the shell expands $4 (a positional parameter that is unset here) to nothing, so only ".99" is left
[root@service99 regular]# echo "This cat is \$4.99"     # escape the $
This cat is $4.99
[root@service99 regular]# echo 'This cat is \$4.99'     # single quotes already protect $, so the backslash is printed as well
This cat is \$4.99
[root@service99 regular]# echo 'This cat is $4.99'
This cat is $4.99
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
[root@service99 regular]# grep --color '\\' price.txt
This is "\".
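To see why escaping matters inside the pattern itself, here is a minimal sketch. The two input lines are hypothetical; the second one exists only to show what an unescaped dot lets through.

```shell
prices=$(printf 'This price is $4.99\nItem 4x99\n')

# Unescaped: "." matches any single character, so '4.99' also matches "4x99".
loose=$(printf '%s\n' "$prices" | grep -c '4.99')

# Escaped: '\$' and '\.' are literal characters, so only the real price matches.
strict=$(printf '%s\n' "$prices" | grep -c '\$4\.99')

echo "loose: $loose, strict: $strict"
```

Single quotes around the pattern keep the shell from touching "$" and "\", so the backslashes reach grep unchanged.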

Anchors

Matching at the beginning

The caret (^) anchors a pattern to the beginning of a text line in the data stream.

[root@service99 regular]# grep --color '^h' price.txt     # lines beginning with the letter h
hello,world!
[root@service99 regular]# grep --color '^$' price.txt     # no output: the $ is not escaped, so it keeps its special meaning
[root@service99 regular]# grep --color '^\$' price.txt    # lines beginning with the $ sign
$5.00
[root@service99 regular]# echo "This is ^ test. " >> price.txt
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color '^' price.txt      # used on its own, ^ matches every line
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color '\^' price.txt     # to match a literal ^ at the front of a pattern, escape it
This is ^ test. 
[root@service99 regular]# grep --color 'is ^' price.txt   # when ^ is not at the front of the pattern, it can be used without escaping
This is ^ test. 

Matching at the end

The dollar sign ($) anchors a pattern to the end of a line: placing it after a text pattern means the data line must end with that text.

[root@service99 regular]# grep --color '\.$' price.txt    # "." also has a special meaning in regular expressions and must be escaped; see the dot character section below
This is "\".
[root@service99 regular]# grep --color '\. $' price.txt   # an extra space was typed when this line was appended, so the pattern must include it
This is ^ test. 
# In regular expressions, a space counts as a character like any other.
[root@service99 regular]# grep --color '0$' price.txt
$5.00
[root@service99 regular]# grep --color '9$' price.txt
This price is $4.99
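The two anchors can be compared side by side with a small sketch. The three input lines are invented for the illustration and are not the contents of price.txt.

```shell
sample=$(printf 'hello,world!\n$5.00\nsay hello\n')

# ^ anchors to the start of the line: only "hello,world!" begins with "hello".
starts=$(printf '%s\n' "$sample" | grep -c '^hello')

# $ anchors to the end of the line: only "say hello" ends with "hello".
ends=$(printf '%s\n' "$sample" | grep -c 'hello$')

# A literal dollar sign at the start of a line needs escaping: '^\$'.
dollar_line=$(printf '%s\n' "$sample" | grep '^\$')

echo "starts: $starts, ends: $ends, dollar line: $dollar_line"
```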

Combining anchors

A commonly used combination is "^$", which matches a blank line.

It is often paired with "^#", because # introduces a comment in most Linux configuration files.

Together they print only the effective configuration lines of a file:

[root@service99 regular]# cat -n /etc/vsftpd/vsftpd.conf | wc -l
121
[root@service99 regular]# grep -vE '^#|^$' /etc/vsftpd/vsftpd.conf
# -v inverts the selection; -E enables extended regular expressions; "|" is an extended regular expression operator
anonymous_enable=YES
local_enable=YES
write_enable=YES
local_umask=022
anon_upload_enable=YES
anon_mkdir_write_enable=YES
anon_other_write_enable=YES
anon_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
listen=YES
pam_service_name=vsftpd
userlist_enable=YES
tcp_wrappers=YES
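The same filter can be tried without vsftpd installed. The sketch below writes a small, made-up configuration file to a temporary path and strips comments and blank lines from it; the file contents are assumptions chosen for illustration.

```shell
# Create a throwaway config file with comments and a blank line.
conf=$(mktemp)
printf '# sample config\n\nlisten=YES\n# another comment\nlocal_enable=YES\n' > "$conf"

# -v inverts the match, -E enables "|": drop lines that start with "#"
# and lines that are empty.
active=$(grep -vE '^#|^$' "$conf")

printf '%s\n' "$active"
rm -f "$conf"
```

Note that '^$' only matches truly empty lines; a line containing just spaces would need something like '^[[:space:]]*$' instead.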

Repetition ranges

{n,m}    the preceding character appears n to m times

{n,}     the preceding character appears at least n times

{n}      the preceding character appears exactly n times

In basic regular expressions the braces must be escaped with backslashes, as in the commands below.

[root@service99 regular]# grep --color "12345\{0,1\}" price.txt
1234556
[root@service99 regular]# grep --color "12345\{0,2\}" price.txt
1234556
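Because a lower bound of 0 makes the preceding character optional, the effect of an interval is easiest to see on inputs that differ only in how often that character repeats. A minimal sketch with made-up numbers:

```shell
nums=$(printf '1234\n12345\n1234556\n')

# '5\{0,1\}' allows zero or one "5" after "1234", so every line matches.
optional=$(printf '%s\n' "$nums" | grep -c '12345\{0,1\}')

# '5\{2\}' requires exactly two consecutive "5" after "1234".
double=$(printf '%s\n' "$nums" | grep '12345\{2\}')

echo "optional: $optional, double: $double"
```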

Dot character

The dot matches any single character except a newline. It must match exactly one character: if there is no character at the dot's position, the match fails.

[root@service99 regular]# grep --color ".s" price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color ".or" price.txt
hello,world!
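The requirement that the dot consume one character can be sketched as follows; the word list is hypothetical and chosen so that some entries have nothing in the dot's position.

```shell
words=$(printf 'is\nThis\nworld\nor\ns\n')

# '.s' needs some character before the "s": "is" and "This" match,
# a bare "s" does not.
dot_s=$(printf '%s\n' "$words" | grep -c '.s')

# '.or' needs a character before "or": "world" matches, a bare "or" does not.
dot_or=$(printf '%s\n' "$words" | grep '.or')

echo "dot_s: $dot_s, dot_or: $dot_or"
```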

Character class

A character class defines a set of characters that may appear at one position in a pattern. If any character of the class appears at that position in the data stream, the pattern matches.

Character classes are written with square brackets. Put every character that belongs to the class inside the brackets, then use the whole class in the pattern just like any other wildcard.

[root@service99 regular]# grep --color "[abcdsxyz]" price.txt
This price is $4.99
hello,world!
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[sxyz]" price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[abcd]" price.txt
This price is $4.99
hello,world!
[root@service99 regular]# grep --color "Th[ais]" price.txt       # the character right after Th must be one of a, i, s
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep -i --color "th[ais]" price.txt    # -i means case-insensitive
This price is $4.99
This is "\".
This is ^ test. 

If you are not sure about the case of a character, you can use this pattern:

[root@service99 regular]# echo "Yes" | grep --color "[yY]es"     # the order of characters inside [] does not matter
Yes
[root@service99 regular]# echo "yes" | grep --color "[Yy]es"
yes

You can use several character classes in a single expression:

[root@service99 regular]# echo "Yes/no" | grep "[Yy][Ee]"
Yes/no
[root@service99 regular]# echo "Yes/no" | grep "[Yy].*[Nn]"      # for the use of * in regular expressions, see the asterisk section below
Yes/no

Character classes also work with digits:

[root@service99 regular]# echo "My phone number is 123456987" | grep --color "is [1234]"
My phone number is 123456987
[root@service99 regular]# echo "This is Phone1" | grep --color "e[1234]"
This is Phone1
[root@service99 regular]# echo "This is Phone1" | grep --color "[1]"
This is Phone1

Another very common use of character classes is to catch words that may be misspelled:

[root@service99 regular]# echo "regular" | grep --color "r[ea]g[ua]l[ao]"
regular
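Both uses of character classes, tolerating either case and tolerating likely misspellings, can be checked against inputs that should and should not match. This is a minimal sketch; the word lists, including the misspellings, are made up.

```shell
answers=$(printf 'Yes\nyes\nYES\nno\n')

# '[Yy]es' accepts an upper- or lowercase first letter only,
# so "YES" (uppercase E and S) is not matched.
either_case=$(printf '%s\n' "$answers" | grep -c '[Yy]es')

# One class per uncertain letter: matches "regular" and "regalar",
# but not "rugular", whose second letter is not in [ea].
spelled=$(printf 'regular\nregalar\nrugular\n' | grep -c 'r[ea]g[ua]l[ao]r')

echo "either case: $either_case, spelling variants: $spelled"
```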

Negated character class

To match characters that are not in a character class, add a caret (^) at the beginning of the class.

Even when negated, the character class must still match one character.

[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
This is ^ test. 
cat
car
[root@service99 regular]# sed -n '/[^t]his/p' price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[^t]his" price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "ca[tr]" price.txt
cat
car
[root@service99 regular]# grep --color "ca[^r]" price.txt
cat
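The point that a negated class still has to consume a character is easy to miss. In the sketch below the word list is hypothetical; the bare "ca" entry is there to show that "not r" does not mean "nothing".

```shell
words=$(printf 'cat\ncar\nca\n')

# 'ca[^r]' needs "ca" followed by one character that is not "r".
# "car" fails because of the r; "ca" fails because nothing follows.
not_r=$(printf '%s\n' "$words" | grep 'ca[^r]')

echo "matched: $not_r"
```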

Using ranges

When you need to match many characters that follow a regular sequence, you can specify a range:

[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
This is ^ test. 
cat
car
1234556
91111806
[root@service99 regular]# egrep --color '[a-z]' price.txt
This price is $4.99
hello,world!
This is "\".
This is ^ test. 
cat
car
[root@service99 regular]# egrep --color '[A-Z]' price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[0-9]" price.txt
This price is $4.99
$5.00
1234556
91111806
[root@service99 regular]# sed -n '/^[^A-Z]/p' price.txt
$5.00
#$#$
1234556
91111806
[root@service99 regular]# grep --color "^[^a-z]" price.txt
$5.00
#$#$
1234556
91111806
# When using ranges such as [A-Z], pay attention to the LANG environment variable: the locale's collation order
# decides which characters fall inside the range. If you change LANG, make sure the new value is a valid locale.
[root@service99 regular]# echo $LANG
zh_CN.UTF-8
[root@service99 regular]# LANG=en_US.UTF-8
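One way to make ranges behave predictably regardless of the session's LANG is to force the C locale for a single command. The sketch below assumes that LC_ALL=C is acceptable for the data being searched (plain ASCII here); the sample lines are made up.

```shell
sample=$(printf 'This\nhello\n$5.00\n1234556\n')

# In the C locale, [a-z] is exactly the 26 lowercase ASCII letters,
# so only "hello" starts with one.
lower_start=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[a-z]')

# Likewise [A-Z] is exactly the uppercase ASCII letters, so three
# lines ("hello", "$5.00", "1234556") do not start with one.
non_upper=$(printf '%s\n' "$sample" | LC_ALL=C grep -c '^[^A-Z]')

echo "lowercase start: $lower_start, non-uppercase starts: $non_upper"
```

Setting the variable in front of the command affects only that one grep invocation and leaves the rest of the session untouched.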

Special character classes

These match specific types of characters.

[[:blank:]]     space and tab characters

[[:cntrl:]]     control characters

[[:graph:]]     visible characters (printable characters other than space)

[[:space:]]     all whitespace characters

[[:print:]]     printable characters, including space

[[:xdigit:]]    hexadecimal digits

[[:punct:]]     all punctuation marks

[[:lower:]]     lowercase letters

[[:upper:]]     uppercase letters

[[:alpha:]]     uppercase and lowercase letters

[[:digit:]]     digits

[[:alnum:]]     digits and uppercase and lowercase letters
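A few of these classes in use, as a minimal sketch over made-up input lines. Note the doubled brackets: the outer pair is the bracket expression and the inner [:name:] is the class.

```shell
lines=$(printf 'abc\n123\na1 b2\n!?\n')

# Lines consisting only of digits (one digit, then zero or more digits).
digits_only=$(printf '%s\n' "$lines" | grep '^[[:digit:]][[:digit:]]*$')

# Count of lines containing any whitespace character.
with_space=$(printf '%s\n' "$lines" | grep -c '[[:space:]]')

# Lines consisting only of punctuation marks.
punct_only=$(printf '%s\n' "$lines" | grep '^[[:punct:]][[:punct:]]*$')

echo "digits: $digits_only, with space: $with_space, punctuation: $punct_only"
```

Unlike ranges such as [a-z], these named classes do not depend on the locale's collation order, only on its character classification.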

Asterisk

An asterisk after a character means that the character may appear zero or more times in the text that matches the pattern.

[root@service99 regular]# cat test.info
goole
go go go
come on
goooooooooo
[root@service99 regular]# grep --color "o*" test.info
goole
go go go
come on
goooooooooo
[root@service99 regular]# grep --color "go*" test.info
goole
go go go
goooooooooo
[root@service99 regular]# grep --color "w.*d" price.txt     # * is often used together with "."
hello,world!
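Because "zero or more" includes zero, an unanchored pattern such as "go*" matches any line that merely contains a g. Anchoring both ends makes the effect of the asterisk visible. A minimal sketch with a made-up word list:

```shell
words=$(printf 'g\ngo\ngoooo\ndog\n')

# '^go*$': a "g" followed by zero or more "o" and nothing else.
# Matches "g", "go" and "goooo", but not "dog".
zero_or_more=$(printf '%s\n' "$words" | grep -c '^go*$')

# '^goo*$': writing the character twice turns it into "one or more".
# Matches "go" and "goooo" only.
one_or_more=$(printf '%s\n' "$words" | grep -c '^goo*$')

echo "zero or more: $zero_or_more, one or more: $one_or_more"
```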

Extended regular expression

Question mark

The question mark means the preceding character may appear zero times or once. It does not match repeated occurrences of the character.

[root@service99 regular]# egrep --color "91?" price.txt
This price is $4.99
91111806

Plus sign

The plus sign means the preceding character must appear at least once and may appear more often. If the character is absent, the pattern does not match.

[root@service99 regular]# egrep --color "9+" price.txt
This price is $4.99
91111806
[root@service99 regular]# egrep --color "1+" price.txt
1234556
91111806

Use curly braces

Curly braces specify how many times the preceding regular expression may repeat. This is often called an interval.

- m: the regular expression appears exactly m times

- m,n: the regular expression appears at least m times and at most n times

[root@service99 regular]# echo "This is test,test is file." | egrep --color "test{0,1}"
This is test,test is file.
[root@service99 regular]# echo "This is test,test is file." | egrep --color "is{1,2}"
This is test,test is file.
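The three extended quantifiers can be compared on one word list. This is a minimal sketch using grep -E (equivalent to egrep); the words are made up and differ only in the number of "o" characters.

```shell
words=$(printf 'gd\ngod\ngood\ngoood\n')

# "?": zero or one "o"  -> gd, god
optional=$(printf '%s\n' "$words" | grep -Ec '^go?d$')

# "+": one or more "o"  -> god, good, goood
one_plus=$(printf '%s\n' "$words" | grep -Ec '^go+d$')

# "{2,3}": two to three "o" -> good, goood
interval=$(printf '%s\n' "$words" | grep -E '^go{2,3}d$')

echo "optional: $optional, one or more: $one_plus"
printf '%s\n' "$interval"
```

In extended regular expressions the braces are written without backslashes, in contrast to the "\{" and "\}" form needed with plain grep.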

Regular expression exercises

Here is a set of exercises for practicing basic regular expressions.

As concept and theory, regular expressions look simple enough, but they are not so easy to apply in practice. Once you are used to them, though, the gain in efficiency is considerable.

1. Find the lines in the downloaded file that contain the keyword "the"

grep --color "the" regular_express.txt

2. Find the lines in the downloaded file that do not contain the keyword "the"

grep --color -vn "the" regular_express.txt

3. Find the keyword "the" in the downloaded file regardless of case

grep --color -in "the" regular_express.txt

4. Find the words test or taste

grep --color -En 'test|taste' regular_express.txt
grep --color -i "t[ae]ste\{0,1\}" 1.txt

5. Find strings containing oo

grep --color "oo" regular_express.txt

6. Find oo that is not preceded by g

grep --color '[^g]oo' regular_express.txt
grep --color "[^g]oo" regular_express.txt

7. Find oo that is not preceded by a lowercase letter

egrep --color "[^a-z]oo" regular_express.txt

8. Find lines that contain a digit

egrep --color '[0-9]' regular_express.txt

9. Find lines that begin with the

egrep --color '^the' regular_express.txt

10. Find lines that begin with a lowercase letter

egrep --color '^[a-z]' regular_express.txt

11. Find lines that do not begin with an English letter

egrep --color '^[^a-zA-Z]' regular_express.txt

12. Find lines that end with a period (.)

egrep --color '\.$' regular_express.txt

13. Find blank lines

egrep --color "^$" regular_express.txt

14. Find strings of the form g??d (g, any two characters, then d)

egrep --color "g..d" regular_express.txt

15. Find strings with two or more consecutive o

egrep --color "ooo*" regular_express.txt
egrep --color 'o{2,}' regular_express.txt

16. Find strings that begin and end with g, with at least one o between the two g

egrep --color 'go{1,}g' regular_express.txt

17. Find lines that contain any digit

egrep --color '[0-9]' regular_express.txt

18. Find strings with two o

egrep --color "oo" regular_express.txt

19. Find g followed by 2 to 5 o and then another g

egrep --color 'go{2,5}g' regular_express.txt

20. Find g followed by 2 or more o

egrep --color 'go{2,}' regular_express.txt
