2025-02-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces the use of regular expressions under Linux. It should be a useful reference for interested readers, and we hope you learn something from it.
Preface
Regular expressions are widely used: most programming languages support them, and they are just as useful on Linux.
With regular expressions you can efficiently filter out the text you need, then combine them with the tools or languages that support them to complete the task at hand.
In this post we use grep/egrep to apply regular expressions. Tools such as sed would work as well, but sed itself relies on regular expressions, so regular expressions are covered first and sed is left to a follow-up article. Readers who need both can read the two articles together.
Types of regular expressions
Regular expressions are implemented by a regular expression engine, the underlying software that interprets regular expression patterns and uses them to match text.
In Linux, the common regular expression engines are:
-POSIX basic regular expression (BRE) engine
-POSIX extended regular expression (ERE) engine
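As a rough illustration of how the two engines differ, the sketch below runs the same input through grep in basic mode and in extended mode (-E). It assumes GNU grep (the -o option, which prints only the matched part, is not part of POSIX), and the input string is made up for this example.

```shell
# Hypothetical input line, not taken from the article's files
line='aaa+b'

# BRE: interval braces need backslashes, and a bare + is an ordinary character
bre_interval=$(printf '%s\n' "$line" | grep -o 'a\{3\}')
bre_plus=$(printf '%s\n' "$line" | grep -o 'a+b')

# ERE: braces and + are operators without any backslash
ere_interval=$(printf '%s\n' "$line" | grep -E -o 'a{3}')
ere_plus=$(printf '%s\n' "$line" | grep -E -o 'a+')

echo "BRE interval: $bre_interval, BRE plus: $bre_plus"
echo "ERE interval: $ere_interval, ERE plus: $ere_plus"
```

In basic mode the pattern a+b matches the literal text "a+b", while in extended mode a+ matches the run of a characters.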
Basic use of basic regular expressions
Environmental text preparation
[root@service99 ~]# mkdir /opt/regular
[root@service99 ~]# cd /opt/regular
[root@service99 regular]# pwd
/opt/regular
[root@service99 regular]# cp /etc/passwd temp_passwd
Plain text
Plain text matches the corresponding words exactly. Note that regular expression patterns are strictly case-sensitive.
// grep --color highlights the matched text, which makes the effect easy to see
[root@service99 regular]# grep --color "root" temp_passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
A regular expression is not limited to whole words: if the defined text appears anywhere in the data stream, the pattern matches.
[root@service99 regular]# ifconfig eth2 | grep --color "add"
eth2      Link encap:Ethernet  HWaddr 54:52:01:01:99:02
          inet addr:192.168.2.99  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::5652:1ff:fe01:9902/64 Scope:Link
Nor are you limited to single words: the text string may also contain spaces and numbers.
[root@service99 regular]# echo "This is line number 1" | grep --color "ber 1"
This is line number 1
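The points above (substring matching and case sensitivity) can be checked with a small sketch. The variable names are chosen for this example only, and grep -c is used to count matching lines instead of highlighting them.

```shell
# Hypothetical sample text
text='This is line number 1'

# A plain-text pattern matches anywhere in the line, spaces and digits included
substring_hits=$(printf '%s\n' "$text" | grep -c 'ber 1')

# Matching is case-sensitive: lowercase "this" does not occur in the text.
# grep exits non-zero when nothing matches, hence the "|| true".
lowercase_hits=$(printf '%s\n' "$text" | grep -c 'this' || true)

# -i switches to case-insensitive matching
ignorecase_hits=$(printf '%s\n' "$text" | grep -ci 'this')

echo "$substring_hits $lowercase_hits $ignorecase_hits"
```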
Special character
There is one thing to be aware of when using text strings in regular expression patterns.
Several characters are exceptions: the regular expression engine gives them a special meaning, so if you use them as ordinary text you may not get the results you expect.
Special characters recognized by regular expressions:
.*[]^${}+?|()
To use one of these special characters as an ordinary text character, you need to escape it: put a designated character in front of it to tell the regular expression engine to interpret the next character as ordinary text.
The character that does this is the backslash (\).
// double quotes do not mask special symbols, so the shell expands the variable $4, which is not set in the current shell and therefore comes out empty
[root@service99 regular]# echo "This cat is $4.99"
This cat is .99
// escape the $
[root@service99 regular]# echo "This cat is \$4.99"
This cat is $4.99
// single quotes mask the metacharacter $, so the backslash is printed as well
[root@service99 regular]# echo 'This cat is \$4.99'
This cat is \$4.99
[root@service99 regular]# echo 'This cat is $4.99'
This cat is $4.99
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
This is "\".
[root@service99 regular]# grep --color '\\' price.txt
This is "\".
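A minimal sketch of escaping inside a grep pattern follows. The temporary file and its three lines are invented for this example; single quotes keep the shell from touching $ and \ so that grep sees the pattern unchanged.

```shell
# Hypothetical sample file created in a temporary location
prices=$(mktemp)
printf '%s\n' 'This price is $4.99' 'This price is 4x99' 'A backslash: \' > "$prices"

# Escaped $ and . are treated as literal characters: only the first line matches
literal_hits=$(grep -c '\$4\.99' "$prices")

# An unescaped dot matches any character, so "4x99" matches as well
unescaped_dot_hits=$(grep -c '4.99' "$prices")

# A literal backslash is written as two backslashes in the pattern
backslash_hits=$(grep -c '\\' "$prices")

echo "$literal_hits $unescaped_dot_hits $backslash_hits"
rm -f "$prices"
```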
Anchor characters
Start of line
The caret (^) anchors the pattern to the beginning of a text line in the data stream.
// lines beginning with the letter h
[root@service99 regular]# grep --color '^h' price.txt
hello,world!
// no output: the $ is not escaped, so it keeps its special meaning and ^$ matches only blank lines
[root@service99 regular]# grep --color '^$' price.txt
// lines beginning with the $ sign
[root@service99 regular]# grep --color '^\$' price.txt
$5.00
[root@service99 regular]# echo "This is ^ test. " >> price.txt
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
This is "\".
This is ^ test. 
// used on its own, ^ matches the start of every line, so all content is displayed
[root@service99 regular]# grep --color '^' price.txt
This price is $4.99
hello,world!
$5.00
This is "\".
This is ^ test. 
// to match a literal ^ at the front of the pattern, escape it
[root@service99 regular]# grep --color '\^' price.txt
This is ^ test. 
// when ^ is not at the front of the pattern, it can be used directly without escaping
[root@service99 regular]# grep --color 'is ^' price.txt
This is ^ test. 
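The caret behaviour described above can be reproduced with a short sketch. The file contents here are a hypothetical three-line sample, not the article's price.txt.

```shell
# Hypothetical sample file
sample=$(mktemp)
printf '%s\n' 'hello,world!' '$5.00' 'This is ^ test.' > "$sample"

# ^h: lines that start with the letter h
starts_with_h=$(grep '^h' "$sample")

# ^\$: lines that start with a literal dollar sign
starts_with_dollar=$(grep '^\$' "$sample")

# \^: a literal caret anywhere in the line
literal_caret=$(grep '\^' "$sample")

# A bare ^ matches the start of every line, so every line is counted
every_line=$(grep -c '^' "$sample")

echo "$starts_with_h | $starts_with_dollar | $literal_caret | $every_line"
rm -f "$sample"
```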
End of line
The dollar sign ($) defines the end anchor: placing it after a text pattern means the data line must end with that pattern.
// "." also has a special meaning in regular expressions, so escape it; see the dot character section below for details
[root@service99 regular]# grep --color '\.$' price.txt
This is "\".
// an extra space was typed after the period when this line was added, so the pattern has to include it; in regular expressions a space counts as a character like any other
[root@service99 regular]# grep --color '\. $' price.txt
This is ^ test. 
[root@service99 regular]# grep --color '0$' price.txt
$5.00
[root@service99 regular]# grep --color '9$' price.txt
This price is $4.99
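As a sketch of the end anchor, including the trailing-space pitfall mentioned above, the following uses an invented four-line file in which the last line ends with a space after the period.

```shell
# Hypothetical sample file; the last line ends with a space after the period
sample=$(mktemp)
printf '%s\n' 'This price is $4.99' '$5.00' 'ends with a period.' 'trailing space. ' > "$sample"

# 9$: lines whose last character is 9
ends_with_9=$(grep '9$' "$sample")

# \.$: the period must be the very last character, so the trailing-space line is skipped
period_hits=$(grep -c '\.$' "$sample")

# Including the space in the pattern matches the other line instead
period_space=$(grep '\. $' "$sample")

echo "$ends_with_9 | $period_hits | $period_space"
rm -f "$sample"
```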
Combining anchors
A common combination is "^$", which represents a blank line.
It is often paired with "^#", because # starts a comment in Linux configuration files.
Together they output only the effective configuration lines of a file.
[root@service99 regular]# cat -n /etc/vsftpd/vsftpd.conf | wc -l
121
// -v inverts the selection; -E enables extended regular expressions; "|" is an extended regular expression symbol meaning "or"
[root@service99 regular]# grep -vE '^#|^$' /etc/vsftpd/vsftpd.conf
anonymous_enable=YES
local_enable=YES
write_enable=YES
local_umask=022
anon_upload_enable=YES
anon_mkdir_write_enable=YES
anon_other_write_enable=YES
anon_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
listen=YES
pam_service_name=vsftpd
userlist_enable=YES
tcp_wrappers=YES
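The same filter can be tried without a vsftpd installation. The sketch below builds a small stand-in configuration file (its contents are made up for this example) and strips comment lines and blank lines from it.

```shell
# Hypothetical stand-in for a configuration file
conf=$(mktemp)
printf '%s\n' '# sample comment' '' 'listen=YES' '#another=NO' '' 'local_enable=YES' > "$conf"

# -v inverts the match, -E enables the | alternation:
# drop lines that start with # and lines that are empty
active=$(grep -vE '^#|^$' "$conf")
active_count=$(grep -cvE '^#|^$' "$conf")

printf '%s\n' "$active"
echo "effective lines: $active_count"
rm -f "$conf"
```

Note that this pattern only removes comments that start in the first column; a comment indented with spaces would need a pattern such as '^[[:space:]]*#'.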
Character occurrence range
\{n,m\} // the preceding character appears n to m times
\{n,\} // the preceding character appears at least n times
\{n\} // the preceding character appears exactly n times
In basic regular expressions, as used by grep here, the braces must be escaped with backslashes.
[root@service99 regular]# grep --color "12345\{0,1\}" price.txt
1234556
[root@service99 regular]# grep --color "12345\{0,2\}" price.txt
1234556
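To see which part of the line an interval actually consumes, the sketch below uses grep -o, which prints only the matched text. This assumes GNU grep, since -o is not a POSIX option; the input string mirrors the line from the example above.

```shell
line='1234556'

# 5 may appear zero or one time: the match stops after the first 5
zero_or_one=$(printf '%s\n' "$line" | grep -o '12345\{0,1\}')

# 5 must appear exactly twice
exactly_two=$(printf '%s\n' "$line" | grep -o '12345\{2\}')

# 5 must appear at least three times: no match, so the count is 0
three_or_more=$(printf '%s\n' "$line" | grep -c '12345\{3,\}' || true)

echo "$zero_or_one $exactly_two $three_or_more"
```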
Dot character
The dot special character is used to match any single character except the newline character, but the dot character must match one character; if there are no characters in the dot position, the pattern matching fails.
[root@service99 regular]# grep --color ".s" price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color ".or" price.txt
hello,world!
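The requirement that the dot must consume a character can be shown with two made-up input lines. The -o option used for the second check assumes GNU grep.

```shell
# "or" has no character in front of it, so .or cannot match that line;
# "world" contains "wor", so it matches
dot_lines=$(printf '%s\n' 'or' 'world' | grep '.or')

# -o shows exactly which three characters were matched
dot_fragment=$(printf '%s\n' 'world' | grep -o '.or')

echo "$dot_lines $dot_fragment"
```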
Character class
A character class defines a set of characters, any one of which may match at a given position in the text pattern. If one of the characters in the class appears at that position in the data stream, the pattern matches.
To define a character class, use square brackets. Enclose all the characters to include in the class in the brackets, then use the whole class in the pattern like any other wildcard character.
[root@service99 regular]# grep --color "[abcdsxyz]" price.txt
This price is $4.99
hello,world!
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[sxyz]" price.txt
This price is $4.99
This is "\".
This is ^ test. 
[root@service99 regular]# grep --color "[abcd]" price.txt
This price is $4.99
hello,world!
// the first character after Th must be one of the characters in [ais]
[root@service99 regular]# grep --color "Th[ais]" price.txt
This price is $4.99
This is "\".
This is ^ test. 
// -i means case-insensitive
[root@service99 regular]# grep -i --color "th[ais]" price.txt
This price is $4.99
This is "\".
This is ^ test. 
If you are not sure about the case of a character, you can use this pattern:
// the order of the characters inside [] does not matter
[root@service99 regular]# echo "Yes" | grep --color "[yY]es"
Yes
[root@service99 regular]# echo "yes" | grep --color "[Yy]es"
yes
You can use multiple character classes within a single expression:
[root@service99 regular]# echo "Yes/no" | grep "[Yy][Ee]"
Yes/no
// for the use of .* in regular expressions, see the asterisk section below
[root@service99 regular]# echo "Yes/no" | grep "[Yy].*[Nn]"
Yes/no
Character classes also support numbers:
[root@service99 regular]# echo "My phone number is 123456987" | grep --color "is [1234]"
My phone number is 123456987
[root@service99 regular]# echo "This is Phone1" | grep --color "e[1234]"
This is Phone1
[root@service99 regular]# echo "This is Phone1" | grep --color "[1]"
This is Phone1
Another very common use of character classes is to match words that may be misspelled:
[root@service99 regular]# echo "regular" | grep --color "r[ea]g[ua]l[ao]"
regular
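A small sketch of both uses of character classes (uncertain case and tolerated misspellings) follows. The word lists are invented for this example.

```shell
# [Yy]es accepts either case for the first letter only
yes_hits=$(printf '%s\n' 'Yes' 'yes' 'YES' | grep -c '[Yy]es')

# Each class allows one of two letters at that position, so some misspellings
# of "regular" are accepted and others are not
spelling_hits=$(printf '%s\n' 'regular' 'reguler' 'ragalar' 'rugular' | grep -c 'r[ea]g[ua]l[ao]')

echo "$yes_hits $spelling_hits"
```

Here "YES" is rejected because only the first letter may vary, and "reguler" and "rugular" are rejected because a letter falls outside the class at its position.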
Negative character class
To match characters that are not in a character class, add a caret (^) at the beginning of the character class.
Even with negation, the character class must still match one character.
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
This is "\".
this is ^ test.
cat
car
[root@service99 regular]# sed -n '/[^t]his/p' price.txt
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "[^t]his" price.txt
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "ca[tr]" price.txt
cat
car
[root@service99 regular]# grep --color "ca[^r]" price.txt
cat
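The rule that a negated class still has to match one character can be checked with a short sketch; the input lines are hypothetical.

```shell
# ca[^r]: "cat" matches, "car" is excluded by the negation, and "ca" fails
# because there is no third character for the class to match
negated=$(printf '%s\n' 'cat' 'car' 'ca' | grep 'ca[^r]')

# [^t]his: "This" matches (T is not t), "this" does not, and a bare "his"
# does not either because nothing precedes it
not_t=$(printf '%s\n' 'This' 'this' 'his' | grep '[^t]his')

echo "$negated $not_t"
```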
Using ranges
When you need to match many characters that follow a regular sequence, you can use a range:
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$
This is "\".
this is ^ test.
cat
car
1234556
91111806
[root@service99 regular]# egrep --color '[a-z]' price.txt
This price is $4.99
hello,world!
This is "\".
this is ^ test.
cat
car
[root@service99 regular]# egrep --color '[A-Z]' price.txt
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "[0-9]" price.txt
This price is $4.99
$5.00
1234556
91111806
[root@service99 regular]# sed -n '/^[^A-Z]/p' price.txt
$5.00
#$
1234556
91111806
[root@service99 regular]# grep --color "^[^a-z]" price.txt
$5.00
#$
1234556
91111806
// when using ranges such as [a-z], pay attention to the value of the LANG environment variable, because the locale can affect which characters fall inside a range; if you change it, make sure the new value is a valid locale
[root@service99 regular]# echo $LANG
zh_CN.UTF-8
[root@service99 regular]# LANG=en_US.UTF-8
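One way to make ranges behave predictably, regardless of the system's LANG setting, is to force the C locale for the grep call, where ranges follow plain byte order. This is a sketch of that approach with invented input lines, not something the article itself does.

```shell
# Hypothetical input lines
lines=$(printf '%s\n' 'Abc' 'abc' '123')

# In the C locale, [a-z] covers only the lowercase ASCII letters
lower_start=$(printf '%s\n' "$lines" | LC_ALL=C grep '^[a-z]')

# Lines that do not start with a lowercase letter: "Abc" and "123"
non_lower_count=$(printf '%s\n' "$lines" | LC_ALL=C grep -c '^[^a-z]')

# Digit ranges work the same way
digit_start=$(printf '%s\n' "$lines" | LC_ALL=C grep '^[0-9]')

echo "$lower_start $non_lower_count $digit_start"
```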
Special character class
Used to match specific types of characters.
[[:blank:]] space and tab characters
[[:cntrl:]] control characters
[[:graph:]] visible characters (printable characters other than the space)
[[:space:]] all whitespace characters
[[:print:]] printable characters
[[:xdigit:]] hexadecimal digits
[[:punct:]] all punctuation marks
[[:lower:]] lowercase letters
[[:upper:]] uppercase letters
[[:alpha:]] uppercase and lowercase letters
[[:digit:]] digits
[[:alnum:]] digits and uppercase and lowercase letters
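A few of these classes can be tried on a small made-up data set. Note the double brackets: the class name, including its own brackets and colons, sits inside an ordinary bracket expression.

```shell
# Hypothetical sample lines
data=$(printf '%s\n' 'abc' 'ABC' '123' 'a b' '#!')

upper_line=$(printf '%s\n' "$data" | grep '^[[:upper:]]')
digit_line=$(printf '%s\n' "$data" | grep '^[[:digit:]]')
blank_line=$(printf '%s\n' "$data" | grep '[[:blank:]]')
punct_line=$(printf '%s\n' "$data" | grep '[[:punct:]]')
# Lines made up entirely of letters: "abc" and "ABC"
alpha_count=$(printf '%s\n' "$data" | grep -c '^[[:alpha:]]*$')

echo "$upper_line | $digit_line | $blank_line | $punct_line | $alpha_count"
```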
Asterisk
An asterisk after a character means that the character may appear zero or more times in the text that matches the pattern.
[root@service99 regular]# cat test.info
goole
go go go
come on
goooooooooo
[root@service99 regular]# grep --color "oo*" test.info
goole
go go go
come on
goooooooooo
[root@service99 regular]# grep --color "go*" test.info
goole
go go go
goooooooooo
// the asterisk is often used together with "."
[root@service99 regular]# grep --color "w.*d" price.txt
hello,world!
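Because the asterisk also allows zero occurrences, go* matches a lone g. The sketch below shows this with invented input; the -o check assumes GNU grep.

```shell
# Hypothetical sample lines
data=$(printf '%s\n' 'g' 'go' 'goooo' 'come on')

# go*: g followed by zero or more o, so the lone "g" matches too
zero_or_more=$(printf '%s\n' "$data" | grep -c 'go*')

# goo*: g followed by at least one o
one_or_more=$(printf '%s\n' "$data" | grep -c 'goo*')

# .* matches any run of characters: everything from w up to the last d
dot_star=$(printf '%s\n' 'hello,world!' | grep -o 'w.*d')

echo "$zero_or_more $one_or_more $dot_star"
```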
Extended regular expression
Question mark
The question mark means that the preceding character may appear zero times or once. It does not match repeated occurrences of the character.
[root@service99 regular]# egrep --color "91?" price.txt
This price is $4.99
91111806
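As a quick sketch of the question mark, the following uses grep -E (the modern spelling of egrep) on three made-up words.

```shell
# u? allows zero or one u between "colo" and "r"
optional_u=$(printf '%s\n' 'color' 'colour' 'colouur' | grep -E 'colou?r')
optional_count=$(printf '%s\n' "$optional_u" | grep -c '')

printf '%s\n' "$optional_u"
```

The word with two u characters is rejected, because ? never allows more than one occurrence.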
Plus sign
The plus sign indicates that the preceding character can appear one or more times, but at least once, and if the character does not exist, the pattern does not match.
[root@service99 regular]# egrep --color "9+" price.txt
This price is $4.99
91111806
[root@service99 regular]# egrep --color "1+" price.txt
1234556
91111806
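The difference between the plus sign and the asterisk shows up when the character is absent. A minimal sketch with hypothetical words:

```shell
# Hypothetical sample words
words=$(printf '%s\n' 'gd' 'god' 'good')

# o+ needs at least one o, so "gd" does not match
plus_count=$(printf '%s\n' "$words" | grep -E -c 'go+d')

# o* also accepts zero o, so all three words match
star_count=$(printf '%s\n' "$words" | grep -E -c 'go*d')

echo "$plus_count $star_count"
```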
Use curly braces
Use curly braces to specify how many times the preceding regular expression may repeat; this is often called an interval.
-{m}: the regular expression appears exactly m times
-{m,n}: the regular expression appears at least m times and at most n times
[root@service99 regular]# echo "This is test,test is file." | egrep --color "test{0,1}"
This is test,test is file.
[root@service99 regular]# echo "This is test,test is file." | egrep --color "is{1,2}"
This is test,test is file.
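In extended mode the interval braces are written without backslashes. The sketch below applies three intervals to a set of made-up words.

```shell
# Hypothetical sample words
words=$(printf '%s\n' 'gg' 'gog' 'goog' 'gooog')

# {1,2}: one or two o between the two g
one_to_two=$(printf '%s\n' "$words" | grep -E -c 'go{1,2}g')

# {3}: exactly three o
exactly_three=$(printf '%s\n' "$words" | grep -E 'go{3}g')

# {2,}: two or more o
two_or_more=$(printf '%s\n' "$words" | grep -E -c 'go{2,}g')

echo "$one_to_two $exactly_three $two_or_more"
```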
Regular expression examples
Here are some exercises for practicing basic regular expressions.
Regular expressions look simple in concept and theory, but they are not so easy to apply in practice. Once you are comfortable with them, the gain in efficiency is considerable.
1. Filter the lines of the downloaded file that contain the keyword "the"
grep --color "the" regular_express.txt
2. Filter the lines of the downloaded file that do not contain the keyword "the"
grep --color -vn "the" regular_express.txt
3. Filter the keyword "the" in the downloaded file regardless of case
grep --color -in "the" regular_express.txt
4. Filter the words test or taste
grep --color -En 'test|taste' regular_express.txt
grep --color -i "t[ae]ste\{0,1\}" 1.txt
5. Filter lines that contain "oo"
grep --color "oo" regular_express.txt
6. Filter "oo" that is not preceded by g
grep --color "[^g]oo" regular_express.txt
7. Filter "oo" that is not preceded by a lowercase letter
egrep --color "[^a-z]oo" regular_express.txt
8. Filter lines that contain a digit
egrep --color '[0-9]' regular_express.txt
9. Filter lines that begin with "the"
egrep --color '^the' regular_express.txt
10. Filter lines that begin with a lowercase letter
egrep --color '^[a-z]' regular_express.txt
11. Filter lines that do not begin with an English letter
egrep --color '^[^a-zA-Z]' regular_express.txt
12. Filter lines that end with a period (.)
egrep --color '\.$' regular_express.txt
13. Filter blank lines
egrep --color "^$" regular_express.txt
14. Filter strings of the form g??d (g, any two characters, then d)
egrep --color "g..d" regular_express.txt
15. Filter strings with two or more consecutive o
egrep --color "ooo*" regular_express.txt
egrep --color 'o{2,}' regular_express.txt
16. Filter strings that begin and end with g and have at least one o between the two g
egrep --color 'go{1,}g' regular_express.txt
17. Filter lines that contain any digit
egrep --color '[0-9]' regular_express.txt
18. Filter strings with two o
egrep --color "oo" regular_express.txt
19. Filter g followed by 2 to 5 o and then another g
egrep --color 'go{2,5}g' regular_express.txt
20. Filter g followed by 2 or more o
egrep --color 'go{2,}' regular_express.txt
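The contents of regular_express.txt are not shown in this article, so the sketch below builds a small stand-in file (entirely hypothetical) and runs a handful of the exercises against it, counting matching lines with -c instead of highlighting them.

```shell
# Hypothetical stand-in for regular_express.txt
sample=$(mktemp)
printf '%s\n' 'the quick test' 'The taste of food' '' 'google is good.' '# 42 goooogle' > "$sample"

with_the=$(grep -c 'the' "$sample")                 # exercise 1
without_the=$(grep -cv 'the' "$sample")             # exercise 2
any_case_the=$(grep -ci 'the' "$sample")            # exercise 3
test_or_taste=$(grep -cE 'test|taste' "$sample")    # exercise 4
non_letter_start=$(grep -c '^[^a-zA-Z]' "$sample")  # exercise 11
ends_with_period=$(grep -c '\.$' "$sample")         # exercise 12
blank_lines=$(grep -c '^$' "$sample")               # exercise 13
g_oo_g=$(grep -cE 'go{2,5}g' "$sample")             # exercise 19

echo "$with_the $without_the $any_case_the $test_or_taste"
echo "$non_letter_start $ends_with_period $blank_lines $g_oo_g"
rm -f "$sample"
```

The blank line counts as a line without "the" in exercise 2, but not as a line starting with a non-letter in exercise 11, since a bracket expression always needs one character to match.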