Regular expressions of shell scripts (1)

2025-03-29 Update From: SLTechnology News&Howtos

Contents: Overview of regular expressions; Basic regular expressions; Extended regular expressions

Overview of regular expressions

1. Definition of regular expression

In code, "regular expression" is often abbreviated as regex, regexp, or RE. A regular expression uses a single string to describe and match a whole series of strings that follow certain syntactic rules. Put simply, it is a method of matching strings: with a few special symbols you can quickly find, delete, or replace a specific string.

Regular expressions are text patterns composed of ordinary characters and metacharacters. Ordinary characters include uppercase and lowercase letters, digits, punctuation, and other symbols. Metacharacters are characters with a special meaning in a regular expression; they can be used to specify how the preceding character (the character in front of the metacharacter) may occur in the target text.

Regular expressions are commonly used in scripting and text editors.

2. Uses of regular expressions

Regular expressions are very important for system administrators. A running system produces a large amount of information; some of it is critical and some is merely informational. An administrator who reads all of it directly cannot quickly locate the important entries, such as "user account login failure" or "service startup failure". With regular expressions, the "problematic" entries can be extracted quickly, which makes operations and maintenance work simpler and more convenient.
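As a rough sketch of that kind of triage, the snippet below builds a small made-up log and uses grep to pull out only the failure lines. The file contents and message wording are assumptions for illustration, not real system output.

```shell
# Hypothetical log file; the messages are invented for this example.
log=$(mktemp)
cat > "$log" <<'EOF'
10:01 sshd: session opened for user alice
10:02 sshd: user account login failure for bob
10:03 httpd: service startup failure
10:04 cron: job finished
EOF

# Keep only the lines that mention a failure, with their line numbers.
failures=$(grep -n 'failure' "$log")
printf '%s\n' "$failures"

rm -f "$log"
```

On a real system the same command would be pointed at an actual log file instead of the temporary one.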

Basic regular expressions

Depending on strictness and features, regular expression syntax is divided into basic regular expressions and extended regular expressions. Basic regular expressions are the most fundamental part of the commonly used syntax. Among the common file-processing tools on Linux, grep and sed support basic regular expressions. To use them well, you first need to understand the meaning of the metacharacters they contain, which are introduced one by one below using the grep command.
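To illustrate that grep and sed accept the same basic syntax, here is a minimal sketch that runs one pattern through both tools. The three-line sample file is an assumption standing in for the configuration file used later; the pattern '^Ser' selects lines that start with "Ser".

```shell
# Small stand-in file; its contents are made up for this example.
sample=$(mktemp)
printf '%s\n' '# comment' 'ServerRoot "/etc/httpd"' 'Listen 80' > "$sample"

# The same basic regular expression is understood by both tools.
from_grep=$(grep '^Ser' "$sample")
from_sed=$(sed -n '/^Ser/p' "$sample")

printf 'grep: %s\nsed:  %s\n' "$from_grep" "$from_sed"
rm -f "$sample"
```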

1. Examples of basic regular expressions

For the following examples, I copy the httpd configuration file to use as a test file.

[root@localhost ~]# cp /etc/httpd/conf/httpd.conf /opt/httpd.txt
[root@localhost ~]# cd /opt
[root@localhost opt]# ls
httpd.txt  rh
[root@localhost opt]# cat httpd.txt
#
# This is the main Apache HTTP server configuration file. It contains the
# configuration directives that give the server its instructions.
# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
# In particular, see
# <URL:http://httpd.apache.org/docs/2.4/mod/directives.html>
# for a discussion of each configuration directive.
#
# Do NOT simply read the instructions in here without understanding
# what they do. They're here only as hints or reminders. If you are unsure
# consult the online docs. You have been warned.
#
# Configuration and logfile names: If the filenames you specify for many
# of the server's control files begin with "/" (or "drive:/" for Win32), the
# server will use that explicit path. If the filenames do *not* begin
# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
# with ServerRoot set to '/www' will be interpreted by the
# server as '/www/log/access_log', where as '/log/access_log' will be
# interpreted as '/log/access_log'.

#
# ServerRoot: The top of the directory tree under which the server's
# configuration, error, and log files are kept.
#
# Do not add a slash at the end of the directory path. If you point
...// part of the content omitted.

1) Find specific characters

Use the grep command to find specific characters. The "-n" option displays line numbers and the "-i" option makes the match case-insensitive.

[root@localhost opt]# grep -n "the" httpd.txt
2:# This is the main Apache HTTP server configuration file. It contains the
3:# configuration directives that give the server its instructions.
9:# Do NOT simply read the instructions in here without understanding
10:# what they do. They're here only as hints or reminders. If you are unsure
11:# consult the online docs. You have been warned.
13:# Configuration and logfile names: If the filenames you specify for many
14:# of the server's control files begin with "/" (or "drive:/" for Win32), the
15:# server will use that explicit path. If the filenames do *not* begin
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
25:# Do not add a slash at the end of the directory path. If you point
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
27:# Mutex directive, if file-based mutexes are used. If you wish to share the
35:# ports, instead of the default. See also the <VirtualHost>
47:# To be able to use the functionality of a module which was built as a DSO you
48:# have to place corresponding `LoadModule' lines at this location so the
49:# directives contained in it are actually available _before_ they are used.
62:# User/Group: The name (or #number) of the user/group to run httpd as.
71:# The directives in this section set up the values used by the 'main'
74:# any <VirtualHost> containers you may define later in the file.
76:# All of these directives may appear inside <VirtualHost> containers,
77:# in which case these default settings will be overridden for the
...// part of the output omitted.
[root@localhost opt]# grep -ni "the" httpd.txt
2:# This is the main Apache HTTP server configuration file. It contains the
3:# configuration directives that give the server its instructions.
9:# Do NOT simply read the instructions in here without understanding
10:# what they do. They're here only as hints or reminders. If you are unsure
11:# consult the online docs. You have been warned.
13:# Configuration and logfile names: If the filenames you specify for many
14:# of the server's control files begin with "/" (or "drive:/" for Win32), the
15:# server will use that explicit path. If the filenames do *not* begin
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
25:# Do not add a slash at the end of the directory path. If you point
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
27:# Mutex directive, if file-based mutexes are used. If you wish to share the
35:# ports, instead of the default. See also the <VirtualHost>
47:# To be able to use the functionality of a module which was built as a DSO you
48:# have to place corresponding `LoadModule' lines at this location so the
49:# directives contained in it are actually available _before_ they are used.
62:# User/Group: The name (or #number) of the user/group to run httpd as.
71:# The directives in this section set up the values used by the 'main'
73:# <VirtualHost> definition. These values also provide defaults for
74:# any <VirtualHost> containers you may define later in the file.
76:# All of these directives may appear inside <VirtualHost> containers,
77:# in which case these default settings will be overridden for the
82:# ServerAdmin: Your address, where problems with the server should be
89:# ServerName gives the name and port that the server uses to identify itself.
98:# Deny access to the entirety of your server's filesystem. You must
99:# explicitly permit access to web content directories in other
...// part of the output omitted.
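The difference between the two commands is easier to see on a tiny file. The sketch below assumes a three-line sample (invented for this example) in which only one line contains lowercase "the".

```shell
# Made-up sample text; only line 2 contains lowercase "the".
sample=$(mktemp)
printf '%s\n' 'The server starts here.' 'Read the docs.' 'Nothing to see.' > "$sample"

# -n prefixes each match with its line number.
case_sensitive=$(grep -n 'the' "$sample")

# -i additionally ignores case, so "The" on line 1 now matches too.
case_insensitive=$(grep -ni 'the' "$sample")

printf 'case-sensitive:\n%s\n' "$case_sensitive"
printf 'case-insensitive:\n%s\n' "$case_insensitive"
rm -f "$sample"
```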

Inverted selection, such as finding the lines that do not contain "the", is done with the "-v" option of grep (combined here with "-n" as "-nv").

[root@localhost opt]# grep -nv "the" httpd.txt
1:#
4:# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
5:# In particular, see
6:# <URL:http://httpd.apache.org/docs/2.4/mod/directives.html>
7:# for a discussion of each configuration directive.
8:#
12:#
18:# server as '/www/log/access_log', where as '/log/access_log' will be
19:# interpreted as '/log/access_log'.
20:
21:#
23:# configuration, error, and log files are kept.
24:#
28:# same ServerRoot for multiple httpd daemons, you will need to change at
29:# least PidFile.
30:#
31:ServerRoot "/etc/httpd"
32:
33:#
34:# Listen: Allows you to bind Apache to specific IP addresses and/or
36:# directive.
37:#
38:# Change this to Listen on specific IP addresses as shown below to
39:# prevent Apache from glomming onto all bound IP addresses.
40:#
41:#Listen 12.34.56.78:80
42:Listen 80
43:
44:#
45:# Dynamic Shared Object (DSO) Support
46:#
50:# Statically compiled modules (those listed by `httpd -l') do not need
51:# to be loaded here.
...// part of the output omitted.

2) Use brackets "[]" to match a set of characters

Add the strings shirt, short, wd, wod, wood, and woooood to the httpd.txt test file. The strings "shirt" and "short" both contain "sh" and "rt", so the following command finds both of them. No matter how many characters are listed inside "[]", the brackets stand for exactly one character; that is, "[io]" matches "i" or "o".

[root@localhost opt]# vim httpd.txt
...// part of the content omitted...
# Supplemental configuration
#
# Load config files in the "/etc/httpd/conf.d" directory, if any.
IncludeOptional conf.d/*.conf
shirt
short
wd
wod
wood
woooood
:wq
[root@localhost opt]# grep -n 'sh[io]rt' httpd.txt
354:shirt
355:short
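A minimal sketch of the same bracket expression on a standalone word list. The extra word "shart" is an assumption added only to show a non-match.

```shell
# Word list from the article plus one made-up non-matching word ("shart").
words=$(mktemp)
printf '%s\n' shirt short shart wd wod wood woooood > "$words"

# [io] stands for exactly one character, either "i" or "o".
matches=$(grep -n 'sh[io]rt' "$words")

printf '%s\n' "$matches"
rm -f "$words"
```

Because "a" is not in the set, "shart" is skipped even though it shares "sh" and "rt" with the other two words.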

To find lines containing two consecutive "o" characters ("oo"), simply execute the following command.

[root@localhost opt]# grep -n 'oo' httpd.txt
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot "/etc/httpd"
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot "/var/www/html"
130:    # Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 "The server made a boo boo."
358:wood
359:woooood

To find strings in which "oo" is not preceded by "w", use the negated character set "[^]".

[root@localhost opt]# grep -n '[^w]oo' httpd.txt
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot "/etc/httpd"
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot "/var/www/html"
130:    # Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 "The server made a boo boo."
359:woooood

In the output above, "woooood" still matches the rule: inside it, an "o" precedes "oo", and "o" is not "w". If you do not want any lowercase letter in front of "oo", use the command grep -n '[^a-z]oo' httpd.txt, where "a-z" stands for the lowercase letters; uppercase letters are written "A-Z".

[root@localhost opt]# grep -n '[^a-z]oo' httpd.txt
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot "/etc/httpd"
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot "/var/www/html"
230:    # access content that does not live under the DocumentRoot.
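The two negated sets can be compared side by side on a short word list. This is a sketch under assumptions: the four words are chosen for illustration, and LC_ALL=C is set because the meaning of a range such as a-z can depend on the locale.

```shell
# Use the C locale so that a-z means exactly the ASCII lowercase letters.
export LC_ALL=C

words=$(mktemp)
printf '%s\n' wood woooood foo ServerRoot > "$words"

# "oo" preceded by any character other than "w".
# "wood" drops out; "woooood" stays because an "o" precedes "oo".
not_after_w=$(grep -n '[^w]oo' "$words")

# "oo" preceded by anything that is not a lowercase letter.
# Only "ServerRoot" qualifies, through the uppercase "R".
not_after_lower=$(grep -n '[^a-z]oo' "$words")

printf 'not after w:\n%s\n' "$not_after_w"
printf 'not after a lowercase letter:\n%s\n' "$not_after_lower"
rm -f "$words"
```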

Lines containing digits can be found with the command grep -n '[0-9]' httpd.txt.

[root@localhost opt]# grep -n '[0-9]' httpd.txt
4:# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
6:# <URL:http://httpd.apache.org/docs/2.4/mod/directives.html>
14:# of the server's control files begin with "/" (or "drive:/" for Win32), the
41:#Listen 12.34.56.78:80
42:Listen 80
95:#ServerName www.example.com:80
141:    # http://httpd.apache.org/docs/2.4/mod/core.html#options
311:# interpretation of all content as UTF-8 by default. To use the
312:# default browser choice (ISO-8859-1), or to allow the META tags
316:AddDefaultCharset UTF-8
329:# 1) plain text 2) local redirects 3) external redirects
332:#ErrorDocument 500 "The server made a boo boo."
333:#ErrorDocument 404 /missing.html
334:#ErrorDocument 404 "/cgi-bin/missing_handler.pl"
335:#ErrorDocument 402 http://www.example.com/subscription_info.html

3) Find the beginning of a line "^" and the end of a line "$"

Basic regular expressions contain two positional metacharacters: "^" (the beginning of the line) and "$" (the end of the line). To find lines that begin with the string "Ser", use the "^" metacharacter.

[root@localhost opt]# grep -n '^Ser' httpd.txt
31:ServerRoot "/etc/httpd"
86:ServerAdmin root@localhost

Lines that begin with a lowercase letter can be filtered with the "^[a-z]" rule, lines that begin with an uppercase letter with the "^[A-Z]" rule, and lines that do not begin with a letter with the "^[^a-zA-Z]" rule. The "^" symbol plays a different role inside and outside the "[]" character set: inside "[]" it negates the set, and outside "[]" it anchors the match to the beginning of the line.

[root@localhost opt]# grep -n '^[a-z]' httpd.txt
354:shirt
355:short
356:wd
357:wod
358:wood
359:woooood
[root@localhost opt]# grep -n '^[A-Z]' httpd.txt
31:ServerRoot "/etc/httpd"
42:Listen 80
56:Include conf.modules.d/*.conf
66:User apache
67:Group apache
86:ServerAdmin root@localhost
119:DocumentRoot "/var/www/html"
182:ErrorLog "logs/error_log"
189:LogLevel warn
316:AddDefaultCharset UTF-8
348:EnableSendfile on
353:IncludeOptional conf.d/*.conf
[root@localhost opt]# grep -n '^[^a-zA-Z]' httpd.txt
1:#
2:# This is the main Apache HTTP server configuration file. It contains the
3:# configuration directives that give the server its instructions.
4:# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
5:# In particular, see
6:# <URL:http://httpd.apache.org/docs/2.4/mod/directives.html>
7:# for a discussion of each configuration directive.
8:#
9:# Do NOT simply read the instructions in here without understanding
10:# what they do. They're here only as hints or reminders. If you are unsure
11:# consult the online docs. You have been warned.
...// part of the output omitted.
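The two roles of "^" can be checked on a four-line sample. The file contents below are assumptions made up for this sketch; LC_ALL=C keeps the letter ranges to plain ASCII.

```shell
export LC_ALL=C

# Made-up sample: a comment, an uppercase start, a lowercase start, a digit start.
sample=$(mktemp)
printf '%s\n' '# comment' 'ServerRoot "/etc/httpd"' 'shirt' '42 answers' > "$sample"

# Outside brackets, ^ anchors the match to the start of the line.
starts_lower=$(grep -n '^[a-z]' "$sample")
starts_upper=$(grep -n '^[A-Z]' "$sample")

# The first ^ is the anchor; the ^ inside the brackets negates the set.
starts_nonletter=$(grep -n '^[^a-zA-Z]' "$sample")

printf 'lowercase start:  %s\n' "$starts_lower"
printf 'uppercase start:  %s\n' "$starts_upper"
printf 'non-letter start:\n%s\n' "$starts_nonletter"
rm -f "$sample"
```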

To find lines that end with a particular character, use the "$" anchor. For example, the following command finds the lines that end with a period (.). Because the period is itself a metacharacter in regular expressions, the escape character "\" is needed to turn it into an ordinary character.

[root@localhost opt]# grep -n '\.$' httpd.txt
3:# configuration directives that give the server its instructions.
4:# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
7:# for a discussion of each configuration directive.
19:# interpreted as '/log/access_log'.
23:# configuration, error, and log files are kept.
29:# least PidFile.
36:# directive.
...// part of the output omitted.
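To show why the backslash matters, this sketch runs the pattern with and without the escape on a small invented file (the four lines, one of them empty, are assumptions for illustration).

```shell
# Made-up sample; line 3 is empty.
sample=$(mktemp)
printf '%s\n' 'ends with a dot.' 'no dot here' '' 'another one.' > "$sample"

# Escaped: a literal period immediately before the end of the line.
escaped=$(grep -n '\.$' "$sample")

# Unescaped: "." is any character, so every non-empty line matches.
unescaped=$(grep -n '.$' "$sample")

printf 'escaped:\n%s\n' "$escaped"
printf 'unescaped:\n%s\n' "$unescaped"
rm -f "$sample"
```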

To find blank lines, execute the command grep -n '^$' httpd.txt.

[root@localhost opt]# grep -n '^$' httpd.txt
20:
32:
43:
57:
68:
80:
87:
96:
...// part of the output omitted.

4) Find any character "." and the repeated character "*"

The period (.) in a regular expression is also a metacharacter; it represents any single character. For example, the following command finds four-character strings that begin with "w" and end with "d".

[root@localhost opt]# grep -n 'w..d' httpd.txt
108:# Note that from this point forward you must specifically allow
148:    # It can be "All", "None", or any combination of the keywords:
358:wood

In the results above, the string "wood" matches the rule "w..d". To match o, oo, ooooo, and so on, use the asterisk (*) metacharacter. Note, however, that "*" means zero or more repetitions of the single preceding character. "o*" therefore means zero "o" characters (that is, an empty string) or one or more of them. Because the empty string is allowed, the command grep -n 'o*' httpd.txt prints every line of the file. With "oo*", the first "o" must be present and the second may occur zero or more times, so every line that contains o, oo, ooooo, and so on matches. By the same reasoning, to find strings with at least two "o" characters, execute the command grep -n 'ooo*' httpd.txt.

[root@localhost opt]# grep -n 'o*' httpd.txt
...// part of the output omitted...
353:IncludeOptional conf.d/*.conf
354:shirt
355:short
356:wd
357:wod
358:wood
359:woooood
[root@localhost opt]# grep -n 'oo*' httpd.txt
...// part of the output omitted...
353:IncludeOptional conf.d/*.conf
355:short
357:wod
358:wood
359:woooood
[root@localhost opt]# grep -n 'ooo*' httpd.txt
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot "/etc/httpd"
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot "/var/www/html"
130:    # Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 "The server made a boo boo."
358:wood
359:woooood
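The three star patterns can be compared on the article's test words alone. This is a small sketch; the five-line word list is an assumption that stands in for the full file.

```shell
words=$(mktemp)
printf '%s\n' wd wod wood woooood shirt > "$words"

# o* also matches the empty string, so every line counts as a match.
zero_or_more=$(grep -c 'o*' "$words")

# oo* needs at least one "o".
one_or_more=$(grep -n 'oo*' "$words")

# ooo* needs at least two consecutive "o" characters.
two_or_more=$(grep -n 'ooo*' "$words")

printf 'lines matching o*: %s\n' "$zero_or_more"
printf 'oo*:\n%s\n' "$one_or_more"
printf 'ooo*:\n%s\n' "$two_or_more"
rm -f "$words"
```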

To find strings that begin with "w", end with "d", and contain at least one "o" in between, execute the following command.

[root@localhost opt]# grep -n 'woo*d' httpd.txt
357:wod
358:wood
359:woooood

To find strings that begin with "w" and end with "d", with any characters (or none) in between, execute the following command.

[root@localhost opt]# grep -n 'w.*d' httpd.txt
...// part of the output omitted.
342:# be turned off when serving from networked-mounted
356:wd
357:wod
358:wood
359:woooood
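The patterns built from "." and "*" in this section can be tried together on one list. In this sketch, "forward" and "wide road" are made-up additions to the article's test words, chosen to show matches that span other text.

```shell
words=$(mktemp)
printf '%s\n' wd wod wood woooood forward 'wide road' > "$words"

# w, exactly two arbitrary characters, then d ("wood", and "ward" in "forward").
four_chars=$(grep -n 'w..d' "$words")

# w, one or more "o", then d.
at_least_one_o=$(grep -n 'woo*d' "$words")

# w, anything at all (including nothing), then d: every line here qualifies.
anything_between=$(grep -c 'w.*d' "$words")

printf 'w..d:\n%s\n' "$four_chars"
printf 'woo*d:\n%s\n' "$at_least_one_o"
printf 'lines matching w.*d: %s\n' "$anything_between"
rm -f "$words"
```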

To find lines that contain any number:

[root@localhost opt]# grep '[0-9][0-9]*' httpd.txt
# See <URL:http://httpd.apache.org/docs/2.4/> for detailed information.
# <URL:http://httpd.apache.org/docs/2.4/mod/directives.html>
# of the server's control files begin with "/" (or "drive:/" for Win32), the
#Listen 12.34.56.78:80
Listen 80
#ServerName www.example.com:80
    # http://httpd.apache.org/docs/2.4/mod/core.html#options
# interpretation of all content as UTF-8 by default. To use the
# default browser choice (ISO-8859-1), or to allow the META tags
AddDefaultCharset UTF-8
# 1) plain text 2) local redirects 3) external redirects
#ErrorDocument 500 "The server made a boo boo."
#ErrorDocument 404 /missing.html
#ErrorDocument 404 "/cgi-bin/missing_handler.pl"
#ErrorDocument 402 http://www.example.com/subscription_info.html

5) Find a range of consecutive characters "{}"

In the examples above, we used "." and "*" to match zero or more repeated characters. To limit the number of repetitions to a range, use the interval expression "{}" of basic regular expressions. In basic regular expressions an unescaped brace is an ordinary character, so the braces must be written with the escape character "\", as "\{" and "\}", and the pattern should be quoted so that the shell passes it to grep unchanged.
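The interval syntax described above can be sketched on a short word list before running it against the full file. The list is an assumption for illustration: the article's test words plus "wooooood" (six "o" characters), added to show where an upper bound cuts off.

```shell
words=$(mktemp)
printf '%s\n' wd wod wood woooood wooooood > "$words"

# o\{2\}: two consecutive "o" characters somewhere in the line.
two_in_a_row=$(grep -n 'o\{2\}' "$words")

# wo\{2,5\}d: w, then two to five "o", then d. Six "o" is too many.
two_to_five=$(grep -n 'wo\{2,5\}d' "$words")

# wo\{2,\}d: w, then two or more "o", then d.
two_or_more=$(grep -n 'wo\{2,\}d' "$words")

printf 'o\\{2\\}:\n%s\n' "$two_in_a_row"
printf 'wo\\{2,5\\}d:\n%s\n' "$two_to_five"
printf 'wo\\{2,\\}d:\n%s\n' "$two_or_more"
rm -f "$words"
```

Anchoring the interval between "w" and "d" is what enforces the upper bound; an unanchored o\{2,5\} would still match inside a longer run of "o" characters.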

(1) Find strings that contain two consecutive "o" characters.

[root@localhost opt]# grep -n 'o\{2\}' httpd.txt
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot "/etc/httpd"
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot "/var/www/html"
130:    # Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 "The server made a boo boo."
358:wood
359:woooood

(2) Find strings that contain two to five consecutive "o" characters. The command recorded below is not anchored, so it matches every line containing "oo"; to restrict the match to strings that begin with "w" and end with "d", the pattern would be 'wo\{2,5\}d'.

[root@localhost opt]# grep -n 'o\{2,5\}' httpd.txt
16:# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
17:# with ServerRoot set to '/www' will be interpreted by the
22:# ServerRoot: The top of the directory tree under which the server's
26:# ServerRoot at a non-local disk, be sure to specify a local disk on the
28:# same ServerRoot for multiple httpd daemons, you will need to change at
31:ServerRoot "/etc/httpd"
54:# LoadModule foo_module modules/mod_foo.so
60:# httpd as root initially and it will switch.
63:# It is usually good practice to create a dedicated user and group for
86:ServerAdmin root@localhost
115:# DocumentRoot: The directory out of which you will serve your
119:DocumentRoot "/var/www/html"
130:    # Further relax access to the default document root:
226:    # Redirect permanent /foo http://www.example.com/bar
230:    # access content that does not live under the DocumentRoot.
332:#ErrorDocument 500 "The server made a boo boo."
358:wood
359:woooood

(3) Find strings that begin with "w", end with "d", and contain two or more "o" characters in between. The command recorded below uses 'wo\{2\}', which matches "w" followed by two "o" characters and returns the same lines on this file; the pattern that states the full condition is 'wo\{2,\}d'.

[root@localhost opt]# grep -n 'wo\{2\}' httpd.txt
358:wood
359:woooood

2. Metacharacter summary

Metacharacter    Function

^    Matches the start of the input string, except inside a bracket expression, where it means the character set is negated. To match the "^" character itself, use "\^".

$    Matches the end of the input string. If the Multiline property of the RegExp object is set, "$" also matches '\n' or '\r'. To match the "$" character itself, use "\$".

.    Matches any single character except "\r\n".

\    Marks the next character as a special character, a literal character, a back-reference, or an octal escape. For example, 'n' matches the character "n" and '\n' matches the newline character. The sequence '\\' matches "\", and '\(' matches "(".

*    Matches the preceding subexpression zero or more times. To match the "*" character itself, use "\*".

[]    Character set. Matches any one of the characters it contains. For example, "[abc]" matches the "a" in "plain".

[^]    Negated character set. Matches any character that is not in the set. For example, "[^abc]" matches any of "p", "l", "i", and "n" in "plain".

[n1-n2]    Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z". Note: the hyphen (-) denotes a range only when it appears inside the character group between two characters; at the beginning of the group it stands for the hyphen itself.

{n}    n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but does match the two "o" characters in "food".

{n,}    n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but does match all of the "o" characters in "foooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*".

{n,m}    m and n are non-negative integers, where n
