How to use grep and regular expressions in linux 07/03 Update SLTechnology News&Howtos

2025-07-03 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article explains how to use grep and regular expressions in Linux: the common grep options, the metacharacters of basic and extended regular expressions, and a set of exercises for practice.

grep (short for "globally search a regular expression and print") is a powerful text search tool. It searches text for a given pattern, which may be a regular expression, and by default prints the lines that match. The Unix grep family includes grep, egrep, and fgrep. It is similar to the FINDSTR command on Windows.

grep, egrep, fgrep (fgrep does not support regular expressions; it matches fixed strings)

grep reads the files named on the command line, or standard input when no file is given, so it often appears on the right side of a pipe

Command options:

--color=auto: highlights the matched text in color

-v: displays the lines that do not match the pattern

-i: ignores character case

-n: displays the line number of each matching line

-c: counts the number of matching lines

-o: displays only the matched strings

-q: quiet mode, no output

-A #: after; also prints the # lines after each matching line

-B #: before; also prints the # lines before each matching line

-C #: context; also prints the # lines before and after each matching line

-e PATTERN: specifies a pattern; several -e options are combined with logical OR

grep -e 'cat' -e 'dog' file

-w: matches whole words only; letters, digits, and underscores are word characters, and everything else is a word separator

-E: equivalent to egrep

-F: equivalent to fgrep; regular expressions are not supported

-f FILE: reads the patterns from FILE, one per line; the patterns are combined with logical OR
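The options above can be tried on a few lines of text. The following is a minimal sketch that assumes GNU grep; the sample text and the variable names are made up for illustration.

```shell
# Hypothetical sample text; any file would do.
sample='The cat sat.
A dog barked.
CATS and dogs.
concatenate
Nothing here.'

# -c counts matching lines; -i ignores case.
count_ci=$(printf '%s\n' "$sample" | grep -ci 'cat')

# -e may be repeated; a line matching any of the patterns is selected.
# -n prefixes each selected line with its line number.
either=$(printf '%s\n' "$sample" | grep -n -e 'cat' -e 'dog')

# -w matches whole words only, so "concatenate" is skipped.
whole=$(printf '%s\n' "$sample" | grep -w 'cat')

# -v inverts the selection.
inverted=$(printf '%s\n' "$sample" | grep -vi -e 'cat' -e 'dog')

# -o prints only the matched text, one match per line.
only=$(printf '%s\n' "$sample" | grep -o 'dog')

echo "$count_ci"    # 3
echo "$either"
echo "$whole"       # The cat sat.
echo "$inverted"    # Nothing here.
echo "$only"
```

Capturing each result in a variable is only a convenience for comparing them; in everyday use the same commands are run directly against a file or at the end of a pipe.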

Exercise:

1. Display the UID and default shell of three users: root, centos, and arch (create the users yourself if they do not exist)

2. Find the lines in the /etc/rc.d/init.d/functions file that begin with a word (which may include underscores) followed by parentheses

3. Use egrep to extract the base name from the path /etc/rc.d/init.d/functions

4. Use egrep to extract the directory name from the above path

5. From the output of the last command, count the number of root logins from each host IP address

6. Use extended regular expressions to represent 0-9, 10-99, 100-199, 200-249, and 250-255 respectively.

0-9: [0-9]

10-99: [1-9][0-9]

100-199: 1[0-9][0-9]

200-249: 2[0-4][0-9]

250-255: 25[0-5]

7. Display all IPv4 addresses in the output of the ifconfig command

8. Count each character in "welcome to centos linux" and sort the result, with the number of repetitions in front of each character.
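One possible approach to exercises 6 and 7 is to join the five ranges into a single octet pattern and repeat it four times. This is a sketch rather than a reference solution: it assumes GNU grep (for \b in extended regular expressions), and the sample text merely stands in for ifconfig output.

```shell
# One octet, 0-255: the five ranges from exercise 6 joined with |.
octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])'

# An IPv4 address: four octets separated by literal dots,
# with \b on both sides so that longer digit runs are rejected.
ipv4="\\b${octet}(\\.${octet}){3}\\b"

# Hypothetical text in the style of ifconfig output.
sample='inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255
inet 127.0.0.1  netmask 255.0.0.0
inet6 fe80::1  prefixlen 64
version 999.1.1.1 is not an address'

# -E selects extended regular expressions; -o prints only the matches.
found=$(printf '%s\n' "$sample" | grep -Eo "$ipv4")
echo "$found"
```

On a real system the same pattern would be applied as `ifconfig | grep -Eo "$ipv4"`.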

Regular expression:

REGEXP: a pattern written with special characters and ordinary text characters. Some of the characters (metacharacters) do not stand for their literal meaning but act as control or wildcard functions.

Program support: grep, sed, awk, vim, less, nginx, varnish, etc.

There are two categories:

Basic regular expression: BRE

Extended regular expression: ERE

grep -E, egrep

Regular expression engine:

The software module that checks text against a regular expression; different engines use different algorithms

PCRE (Perl Compatible Regular Expressions)

Metacharacter classification: character matching, number of matches, position anchoring, grouping

man 7 regex

Character matching:

. matches any single character

[] matches any single character within the specified set or range; most metacharacters need no escaping inside it

[^] matches any single character outside the specified set or range

The following class names are written inside a bracket expression, for example [[:digit:]]:

[:alnum:] letters and digits

[:alpha:] any uppercase or lowercase letter, that is, A-Z and a-z

[:lower:] lowercase letters

[:upper:] uppercase letters

[:blank:] white space characters (spaces and tabs)

[:space:] horizontal and vertical white space characters (a wider set than [:blank:])

[:cntrl:] non-printable control characters (backspace, delete, bell...)

[:digit:] decimal digits

[:xdigit:] hexadecimal digits

[:graph:] printable non-white-space characters

[:print:] printable characters

[:punct:] punctuation marks
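A short sketch of the bracket expressions and character classes listed above. The sample text and variable names are illustrative only.

```shell
# Hypothetical sample text.
sample='abc123
ABC
  indented
hello, world!'

# A class name sits inside an outer pair of brackets: [[:digit:]].
digits=$(printf '%s\n' "$sample" | grep -o '[[:digit:]]')

# Lines that contain at least one uppercase letter.
upper=$(printf '%s\n' "$sample" | grep '[[:upper:]]')

# [^...] negates the set: count the lines that contain a character
# that is neither a letter nor a digit (space, comma, "!" ...).
other=$(printf '%s\n' "$sample" | grep -c '[^[:alnum:]]')

echo "$digits"
echo "$upper"    # ABC
echo "$other"    # 2
```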

Number of matches: written after a character to specify how many times that preceding character may appear

* matches the preceding character any number of times, including 0

Greedy mode: matches as many qualifying characters as possible

.* any characters of any length

\? matches the preceding character 0 or 1 times

\+ matches the preceding character at least once

\{n\} matches the preceding character exactly n times

\{m,n\} matches the preceding character at least m times and at most n times

\{,n\} matches the preceding character at most n times

\{n,\} matches the preceding character at least n times
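The quantifiers can be compared side by side on one small input. This sketch assumes GNU grep, where \? and \+ are available in basic regular expressions as extensions; the sample words are made up, and ^ and $ are used only to force each pattern to cover the whole line.

```shell
# Hypothetical sample: "c", a varying number of "a", then "t".
sample='ct
cat
caat
caaat'

# * : zero or more of the preceding character, so "ct" matches too.
star=$(printf '%s\n' "$sample" | grep -c '^ca*t$')

# \? : zero or one.
opt=$(printf '%s\n' "$sample" | grep -c '^ca\?t$')

# \+ : one or more.
plus=$(printf '%s\n' "$sample" | grep -c '^ca\+t$')

# \{m,n\} : at least m and at most n repetitions.
range=$(printf '%s\n' "$sample" | grep '^ca\{2,3\}t$')

echo "$star"     # 4
echo "$opt"      # 2
echo "$plus"     # 3
echo "$range"
```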

Position anchoring: pins down where the pattern must appear

^ line-start anchoring, used at the leftmost side of the pattern

$ line-end anchoring, used at the rightmost side of the pattern

^PATTERN$ matches a whole line against the pattern

^$ empty line

^[[:space:]]*$ blank line (empty or containing only white space)

\< or \b word-start anchoring, used on the left side of the word pattern

\> or \b word-end anchoring, used on the right side of the word pattern

\<PATTERN\> matches a whole word
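The anchors can be checked with a sample that contains an empty line and a whitespace-only line. A minimal sketch, assuming GNU grep (for \< and \>); the sample is built with printf so that the whitespace-only line is explicit.

```shell
# Hypothetical sample; line 4 is empty, line 5 holds three spaces.
sample=$(printf 'root:x:0:0\nchroot is a command\nthe rooter\n\n   \nend root')

# ^ anchors the pattern at the start of the line.
line_start=$(printf '%s\n' "$sample" | grep -c '^root')

# \< and \> anchor at the start and end of a word,
# so "chroot" and "rooter" do not count.
whole_word=$(printf '%s\n' "$sample" | grep -c '\<root\>')

# ^$ matches only truly empty lines ...
empty=$(printf '%s\n' "$sample" | grep -c '^$')

# ... while ^[[:space:]]*$ also matches whitespace-only lines.
blank=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]*$')

echo "$line_start"    # 1
echo "$whole_word"    # 2
echo "$empty"         # 1
echo "$blank"         # 2
```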

Grouping: \(\) binds one or more characters together and treats them as a whole, such as: \(root\)\+

The content matched by the pattern inside the group parentheses is recorded by the regular expression engine in internal variables.

These variables are named \1, \2, \3, ...

\1 is the text matched by the pattern between the first left parenthesis, counting from the left, and its matching right parenthesis

Example: \(string1\+\(string2\)*\)

\1: string1\+\(string2\)*

\2: string2

Back-reference: refers to the characters matched by the pattern in an earlier group, not to the pattern itself
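The difference between "the matched characters" and "the pattern itself" shows up as soon as the group can match more than one thing. A small sketch, assuming GNU grep; the inputs are made up.

```shell
# \([ab]\)\1 needs the SAME character twice: "aa" matches, "ab" does not,
# even though "b" would satisfy the pattern [ab] on its own.
same=$(printf 'aa\nab\n' | grep '\([ab]\)\1')

# Hypothetical sample for a doubled word.
sample='abcabc
abcabd
hello hello world
hello world'

# \(abc\)\1 is "abc" followed by whatever the group matched, i.e. "abc".
repeated=$(printf '%s\n' "$sample" | grep '\(abc\)\1')

# A word, a space, and the very same word again.
doubled=$(printf '%s\n' "$sample" | grep '\<\([[:alpha:]]\+\) \1\>')

echo "$same"        # aa
echo "$repeated"    # abcabc
echo "$doubled"     # hello hello world
```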

Or: \|

Example: a\|b: a or b; C\|cat: C or cat; \(C\|c\)at: Cat or cat
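The effect of grouping on alternation can be seen by running both forms over the same input. A minimal sketch, assuming GNU grep (\| is a GNU extension in basic regular expressions); the sample lines are illustrative.

```shell
# Hypothetical sample.
sample='Cat
cat
C
at'

# C\|cat : either the single letter "C" or the whole string "cat".
loose=$(printf '%s\n' "$sample" | grep -c 'C\|cat')

# \(C\|c\)at : the group limits the alternation to the first letter.
grouped=$(printf '%s\n' "$sample" | grep -c '\(C\|c\)at')

echo "$loose"      # 3 (Cat, cat, C)
echo "$grouped"    # 2 (Cat, cat)
```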

Extended regular expression:

egrep = grep -E

egrep [OPTIONS] PATTERN [FILE...]

Metacharacters of extended regular expressions:

Character matching:

. any single character

[] any character in the specified range

[^] any character not in the specified range

Number of matches:

*: matches the preceding character any number of times

?: 0 or 1 time

+: 1 or more times

{m}: exactly m times

{m,n}: at least m, at most n times

Position anchoring:

^: beginning of the line

$: end of the line

\<, \b: beginning of the word

\>, \b: end of the word

Grouping:

() back-references: \1, \2, ...

Or:

a|b: a or b; C|cat: C or cat; (C|c)at: Cat or cat
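The main practical difference between the two syntaxes is which metacharacters need a backslash. The sketch below writes the same pattern both ways and checks that the results agree; it assumes GNU grep, and the sample words are made up.

```shell
# Hypothetical sample.
sample='ct
cat
caat
dog
catdog'

# Basic syntax: braces and the alternation bar are escaped.
bre=$(printf '%s\n' "$sample" | grep 'ca\{1,2\}t\|dog')

# Extended syntax: the same pattern without the backslashes.
ere=$(printf '%s\n' "$sample" | grep -E 'ca{1,2}t|dog')

# Grouping and + in extended syntax: lines made only of "cat"/"dog" units.
units=$(printf '%s\n' "$sample" | grep -Ec '^(cat|dog)+$')

[ "$bre" = "$ere" ] && echo "BRE and ERE forms select the same lines"
echo "$units"    # 3 (cat, dog, catdog)
```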

Exercise:

1. Display the lines in the /proc/meminfo file that begin with an uppercase or lowercase s (requirement: use two methods)

2. Display the lines in the /etc/passwd file that do not end with /bin/bash

3. Display the default shell of the user rpc

4. Find the two- or three-digit numbers in /etc/passwd (you can add the -o option to display only the numbers)

5. Display the lines in CentOS 7's /etc/grub2.cfg file that begin with at least one white space character followed by a non-white-space character

6. Find the lines in the output of the "netstat -tan" command that end with LISTEN followed by any number of white space characters

7. Display the user names and UIDs of all system users on CentOS 7

8. Add the users bash, testbash, basher, sh, and nologin (whose shell is /sbin/nologin), and find the lines in /etc/passwd where the user name is the same as the shell name

9. Using df and grep, extract the usage percentage of each disk partition and sort from largest to smallest.
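As a hint for exercises 2, 4, and 8, here is one possible approach run against a few made-up lines in /etc/passwd format instead of the real file. It is a sketch under the assumption of GNU grep, not the only solution, and every entry in the sample is hypothetical.

```shell
# Hypothetical entries in /etc/passwd format.
passwd_sample='root:x:0:0:root:/root:/bin/bash
bash:x:1001:1001::/home/bash:/bin/bash
testbash:x:1002:1002::/home/testbash:/bin/bash
nologin:x:1005:1005::/home/nologin:/sbin/nologin
rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin'

# Exercise 2: -v keeps the lines that do NOT end with /bin/bash.
not_bash=$(printf '%s\n' "$passwd_sample" | grep -vc '/bin/bash$')

# Exercise 4: \< and \> stop a 2-3 digit match from being cut
# out of a longer number such as 1001.
numbers=$(printf '%s\n' "$passwd_sample" | grep -Eo '\<[0-9]{2,3}\>')

# Exercise 8: capture the user name (first field) and require the
# line to end with "/" followed by that same text.
same_name=$(printf '%s\n' "$passwd_sample" | grep -E '^([^:]+):.*/\1$' | cut -d: -f1)

echo "$not_bash"     # 2
echo "$numbers"
echo "$same_name"
```

Against the real system, the same patterns would be given /etc/passwd as the file argument.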

Grep and regular expression parameters

One: grep parameters

1, -n: displays the line number

2, -o: shows only the matching content

3, -q: quiet mode, without any output; use $? to determine whether the command succeeded, that is, whether the desired content was found.

4, -l: if the match succeeds, only the file name is printed; if it fails, nothing is printed. It is usually combined with -r as -rl, for example grep -rl 'root' /etc

5, -A n: if the match succeeds, the matching line and the n lines after it are printed together.

6, -B n: if the match succeeds, the matching line and the n lines before it are printed together.

7, -C n: if the match succeeds, the matching line and the n lines before and after it are printed together.

8, -c: if the match succeeds, the number of matching lines is printed.

9, -E: equivalent to egrep, extended regular expressions

10, -i: ignores case

11, -v: inverts the match, printing the lines that do not match

12, -w: matches whole words
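A brief sketch of -q with $?, the context options, and -rl, assuming GNU grep and mktemp. The temporary directory and the file names a.txt and b.txt are created only for this illustration.

```shell
# -q prints nothing; the exit status says whether anything matched.
printf 'alpha\nbeta\n' | grep -q 'beta'
found=$?                                                 # 0: a match was found
printf 'alpha\nbeta\n' | grep -q 'gamma' || missed=$?    # 1: no match

# -A n / -C n add the lines after / around each matching line.
after=$(printf '1\n2\n3\n4\n5\n' | grep -A 1 '3')
around=$(printf '1\n2\n3\n4\n5\n' | grep -C 1 '3')

# -rl searches a directory tree and prints only the names of matching files.
dir=$(mktemp -d)
printf 'root:x:0:0\n' > "$dir/a.txt"
printf 'nobody:x:99:99\n' > "$dir/b.txt"
files=$(grep -rl 'root' "$dir")
rm -r "$dir"

echo "$found $missed"    # 0 1
echo "$after"
echo "$around"
echo "${files##*/}"      # a.txt
```

In scripts, -q is typically used directly in a condition, as in `if grep -q PATTERN FILE; then ...; fi`, rather than by reading $? afterwards.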

Two: introduction to regular expressions

First, create a file a.txt to verify each item against.

1, ^ the beginning of the line

2, $ the end of the line

3, . any single character except a newline character

4, * zero or more of the preceding character

5, .* any characters of any length

6, [] any one character in the character group

7, [^] negates the character group (matches any one character that is not in the group)

8, ^[^] a line that does not begin with a character in the character group

9, [a-z] lowercase letters

10, [A-Z] uppercase letters

11, [a-zA-Z] lowercase and uppercase letters

12, [0-9] digits

13, \> the end of a word
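The items above can be verified against a throwaway a.txt along the following lines. This is a sketch: the file contents are invented, the file lives in a temporary directory, and LC_ALL=C is set so that ranges such as [a-z] mean plain ASCII letters regardless of the system locale.

```shell
# Build a hypothetical a.txt in a temporary directory.
tmp=$(mktemp -d)
printf 'apple\nBanana\n42 cherries\n\n_underscore\n' > "$tmp/a.txt"

# ^[a-z] : lines that begin with a lowercase letter.
lower_start=$(LC_ALL=C grep -c '^[a-z]' "$tmp/a.txt")

# ^[A-Z] : lines that begin with an uppercase letter.
upper_start=$(LC_ALL=C grep '^[A-Z]' "$tmp/a.txt")

# ^[^a-zA-Z] : lines whose first character is not a letter.
# The empty line has no first character, so it is not selected.
non_letter=$(LC_ALL=C grep -c '^[^a-zA-Z]' "$tmp/a.txt")

# s$ : lines that end with "s".
ends_s=$(grep 's$' "$tmp/a.txt")

# Five dots between ^ and $ : lines of exactly five characters.
five=$(grep -c '^.....$' "$tmp/a.txt")

# [0-9][0-9]* : a run of one or more digits (-o prints just the run).
digits=$(LC_ALL=C grep -o '[0-9][0-9]*' "$tmp/a.txt")

rm -r "$tmp"

echo "$lower_start"    # 1
echo "$upper_start"    # Banana
echo "$non_letter"     # 2
echo "$ends_s"         # 42 cherries
echo "$five"           # 1
echo "$digits"         # 42
```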

That is all on how to use grep and regular expressions in Linux. I hope the content above is of some help to you.
