What is the basic use of sed and awk

2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article walks through the basic use of sed and awk, with worked examples for each tool.

Basic usage of sed&awk

Brief introduction of sed tool

Now that we have covered the basics of regular expressions, there are two more tools to play with: sed and awk!

Both are very useful. For example, the logfile.sh script that Brother Bird wrote to analyze log files relies on these two for most of its keyword extraction and statistics. So, shall we give them a try? ^_^

Let's start with sed. Basically, sed reads data from standard input (STDIN), processes it, and writes the result to standard output (STDOUT).

What kind of processing? It can replace, delete, add, or pick out specific lines, and more. Let's look at the syntax of sed first and then go through some examples.

[root@linux ~]# sed [-nefri] [action]

Parameters:

-n: silent mode. Normally sed prints every line from STDIN to the screen. With -n, only the lines that an action explicitly handles (for example, with p) are printed.

-e: give the sed action directly on the command line.

-f: read the sed actions from a file; -f filename runs the actions stored in filename.

-r: use extended regular expression syntax in the actions (the default is basic regular expression syntax).

-i: modify the contents of the file directly instead of printing the result to the screen.
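The options above can be tried on a small generated input. This is only a minimal sketch, assuming GNU sed as found on most Linux systems; the sample text and the temporary script file are made up for illustration.

```shell
# Sample input: three lines, generated so the example does not depend on any system file.
sample=$(printf 'alpha 1\nbeta 22\ngamma 333\n')

# -e: give several actions on one command line (delete line 1, then replace "beta").
out_e=$(printf '%s\n' "$sample" | sed -e '1d' -e 's/beta/BETA/')
printf '%s\n' "$out_e"

# -f: read the same actions from a script file instead.
script=$(mktemp)
printf '1d\ns/beta/BETA/\n' > "$script"
out_f=$(printf '%s\n' "$sample" | sed -f "$script")
rm -f "$script"
printf '%s\n' "$out_f"

# -r: extended regular expressions, so + needs no backslash.
out_r=$(printf '%s\n' "$sample" | sed -r 's/[0-9]+/N/')
printf '%s\n' "$out_r"

# -n: print nothing by default; only the p action produces output.
out_n=$(printf '%s\n' "$sample" | sed -n '2p')
printf '%s\n' "$out_n"
```

The -e and -f runs should produce the same result, since the script file holds exactly the two actions given on the command line.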

Action description: [n1[,n2]]function

n1, n2: optional. They usually give the range of lines the action applies to. For example, to apply an action to lines 10 through 20, write "10,20[action]".

The function can be one of the following:

a: append. The string after a appears on a new line after the current line.

c: change. The string after c replaces the lines from n1 to n2.

d: delete. Nothing usually follows d.

i: insert. The string after i appears on a new line before the current line.

p: print. The selected lines are printed; p is usually used together with sed -n.

s: substitute. It performs a replacement directly and is usually combined with a regular expression, for example 1,20s/old/new/g.

Examples:

Example 1: list the contents of /etc/passwd with line numbers, and delete lines 2 through 5.

[root@linux ~]# nl /etc/passwd | sed '2,5d'
1 root:x:0:0:root:/root:/bin/bash
6 sync:x:5:0:sync:/sbin:/bin/sync
7 shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
.....(omitted below).....
# See? Lines 2-5 have been deleted, so they do not appear in the output.
# Strictly speaking the command should be sed -e, but it works without -e too.
# Note that the action after sed must be enclosed in two single quotation marks ('').
# To delete only line 2, use nl /etc/passwd | sed '2d'.
# To delete from line 3 through the last line, use nl /etc/passwd | sed '3,$d'.
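The three deletion forms from this example can be compared side by side. A small sketch, assuming seq is available; its ten numbered lines stand in for the output of nl /etc/passwd, and the variable names are made up for illustration.

```shell
# Ten numbered lines stand in for the output of nl /etc/passwd.
lines=$(seq 1 10)

# Delete lines 2 through 5.
del_range=$(printf '%s\n' "$lines" | sed '2,5d')
# Delete only line 2.
del_one=$(printf '%s\n' "$lines" | sed '2d')
# Delete from line 3 to the last line ($ means the last line).
del_tail=$(printf '%s\n' "$lines" | sed '3,$d')

# Unquoted expansion joins the remaining lines with spaces for compact display.
printf 'after 2,5d: %s\n' "$(echo $del_range)"
printf 'after 2d:   %s\n' "$(echo $del_one)"
printf 'after 3,$d: %s\n' "$(echo $del_tail)"
```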

Example 2: continuing from the previous example, add the words "drink tea" after the second line (that is, as the new third line).

[root@linux ~]# nl /etc/passwd | sed '2a drink tea'
1 root:x:0:0:root:/root:/bin/bash
2 bin:x:1:1:bin:/bin:/sbin/nologin
drink tea
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
# The string after a appears after the second line. What about before the second line?
# Use nl /etc/passwd | sed '2i drink tea' instead.

Example 3: add two lines after the second line, "Drink tea or." and "drink beer?".

[root@linux ~]# nl /etc/passwd | sed '2a Drink tea or.\
> drink beer?'
1 root:x:0:0:root:/root:/bin/bash
2 bin:x:1:1:bin:/bin:/sbin/nologin
Drink tea or.
drink beer?
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
# The point of this example is that more than one line can be added.
# Every added line except the last must end with a backslash (\) so that the text continues on the next line.
# That is why the first line of the command above ends with one. It is required.
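The append and insert forms can be checked on a three-line sample. This sketch assumes GNU sed, whose one-line a and i forms accept the text on the same line as the command; the sample lines are invented.

```shell
sample=$(printf 'line 1\nline 2\nline 3\n')

# a: append two lines after line 2; the trailing backslash continues the text onto the next line.
appended=$(printf '%s\n' "$sample" | sed '2a first added line\
second added line')

# i: insert one line before line 2.
inserted=$(printf '%s\n' "$sample" | sed '2i added before')

printf '%s\n' "$appended"
printf '%s\n' "$inserted"
```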

Example 4: replace lines 2-5 with "No 2-5 number".

[root@linux ~]# nl /etc/passwd | sed '2,5c No 2-5 number'
1 root:x:0:0:root:/root:/bin/bash
No 2-5 number
6 sync:x:5:0:sync:/sbin:/bin/sync
# Lines 2-5 are gone, and the replacement text appears in their place.
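A quick way to confirm that the whole range is replaced by a single line, again assuming GNU sed and using seq output in place of the numbered passwd file:

```shell
# Lines 2-5 of a seven-line input are replaced by one line of text.
changed=$(seq 1 7 | sed '2,5c No 2-5 number')
printf '%s\n' "$changed"
```

The replacement text is printed once for the range, not once per replaced line.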

Example 5: list only lines 5-7.

[root@linux ~]# nl /etc/passwd | sed -n '5,7p'
5 lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
6 sync:x:5:0:sync:/sbin:/bin/sync
7 shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
# Why add -n? Run sed '5,7p' without it and you will see: lines 5-7 are printed twice.
# With -n, the output is very different: only the lines selected by p are printed.
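The effect of -n is easy to see on a numbered input. A sketch using seq in place of the passwd file; the variable names are illustrative only.

```shell
# With -n only the lines selected by p are printed.
with_n=$(seq 1 10 | sed -n '5,7p')
# Without -n every line is printed once, and lines 5-7 a second time by p.
without_n=$(seq 1 10 | sed '5,7p')

printf 'with -n:    %s\n' "$(echo $with_n)"
printf 'without -n: %s\n' "$(echo $without_n)"
```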

Example 6: ifconfig lists the IP address. How do we get only the IP of eth0?

[root@linux ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:51:FD:52:9A:CA
inet addr:192.168.1.12 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::250:fcff:fe22:9acb/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
.....(omitted below).....
# All we want is the "inet addr:" line, so use grep and sed to pick it out.
[root@linux ~]# ifconfig eth0 | grep 'inet ' | sed 's/^.*addr://g' | \
> sed 's/Bcast.*$//g'
# Run each stage of the pipeline (|) separately to see what it does.
# The space in 'inet ' keeps the inet6 line out of the result.
# After removing the head and the tail, we are left with the IP we need: 192.168.1.12.
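Each stage of that pipeline can be inspected on a fixed line, which helps on systems where ifconfig prints a different format or is not installed. A sketch with a hard-coded sample line in the old ifconfig style; the final trimming step is an addition for illustration and is not part of the original command.

```shell
# Sample data only, not read from a real interface.
line='          inet addr:192.168.1.12  Bcast:192.168.1.255  Mask:255.255.255.0'

# Step 1: remove everything from the start of the line through "addr:".
step1=$(printf '%s\n' "$line" | sed 's/^.*addr://g')
# Step 2: remove everything from "Bcast" to the end of the line.
step2=$(printf '%s\n' "$step1" | sed 's/Bcast.*$//g')
# Extra step (an assumption, not in the original): trim the trailing spaces left behind.
ip=$(printf '%s\n' "$step2" | sed 's/ *$//')

printf 'step 1: [%s]\n' "$step1"
printf 'step 2: [%s]\n' "$step2"
printf 'ip:     [%s]\n' "$ip"
```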

Example 7: extract the MAN settings from /etc/man.config, without the comments.

[root@linux ~]# cat /etc/man.config | grep 'MAN' | sed 's/#.*$//g' | \
> sed '/^$/d'
# A # on a line marks the rest of that line as a comment. Note that the # is not always
# the first character; sometimes it follows a command, like this:
# "shutdown -h now # this is the shutdown command". Here the # comes after the command,
# which is why we use the regular expression #.*$.
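Since /etc/man.config does not exist on every system, the same pipeline can be tried on a small stand-in file. A sketch with invented sample contents written to a temporary file:

```shell
# Stand-in for /etc/man.config; the contents are made up for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# MANPATH settings
MANPATH /usr/man
MANPATH /usr/share/man # shared pages
# MANSECT is commented out
EOF

# Keep lines mentioning MAN, strip comments, then delete the lines left empty.
cleaned=$(grep 'MAN' "$sample" | sed 's/#.*$//g' | sed '/^$/d')
rm -f "$sample"
printf '%s\n' "$cleaned"
```

Lines that consisted only of a comment become empty after the first sed and are then removed by the second; a line with a trailing comment keeps its setting but may retain the space that preceded the #.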

Example 8: use sed to add "# This is a test" as the last line of ~/.bashrc.

[root@linux ~]# sed -i '$a # This is a test' ~/.bashrc
# The -i option makes sed modify the file directly instead of printing the result to the screen.
# $a means: append after the last line.
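To try in-place editing without touching a real ~/.bashrc, the same action can be run against a throwaway file. A sketch assuming GNU sed (other sed implementations may require an argument after -i); the file contents are invented.

```shell
# Work on a throwaway file so no real configuration file is modified.
tmpfile=$(mktemp)
printf 'export PATH\nalias ll="ls -l"\n' > "$tmpfile"

# -i edits the file in place; $ addresses the last line, a appends after it.
sed -i '$a # This is a test' "$tmpfile"

result=$(cat "$tmpfile")
rm -f "$tmpfile"
printf '%s\n' "$result"
```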

In short, sed is very handy, and many shell scripts make use of it. It can help system administrators with their daily work, so study it carefully!

Brief introduction of awk tool

Whereas sed usually acts on an entire line, awk tends to split a line into several "fields" and work on those. This makes awk well suited to processing small amounts of tabular data. awk is usually run like this:

[root@linux ~]# awk 'condition1 {action1} condition2 {action2} ...' filename

awk can process the file named after the actions, or it can read the standard output of a previous command. As mentioned, awk mainly works on the fields of each line, and the default field separator is a space or a [tab]. For example, we can use last to list the login records:

[root@linux ~]# last
dmtsai pts/0 192.168.1.12 Mon Aug 22 09:40 still logged in
root tty1 Mon Aug 15 11:38 - 11:39 (00:01)
reboot system boot 2.6.11 Sun Aug 14 18:18 (7-15-15-1-1)
dmtsai pts/0 192.168.1.12 Fri Aug 12 12:07 - 12:08 (00:01)

To print the account and the login IP, separated by a [tab]:

[root@linux ~]# last | awk '{print $1 "\t" $3}'
dmtsai  192.168.1.12
root    Mon
reboot  boot
dmtsai  192.168.1.12

Every line is to be processed, so no condition is needed. What I want is the first and third fields. The results for the second and third lines look odd, though, and that is because of the data format: those lines have no IP column, so the third whitespace-separated field is something else. So before using awk, check your data first. A value that belongs to one field must not contain spaces or [tab] characters; otherwise fields will be misread, as in this example.
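The misread field can be reproduced without depending on the local login history. A sketch that feeds two of the sample lines from the last output above through the same awk command via a temporary file:

```shell
# Two lines copied from the sample last output; the second has no IP column.
sample=$(mktemp)
cat > "$sample" <<'EOF'
dmtsai   pts/0        192.168.1.12     Mon Aug 22 09:40   still logged in
root     tty1                          Mon Aug 15 11:38 - 11:39  (00:01)
EOF

# Runs of spaces count as one separator, so on the second line $3 is "Mon".
fields=$(awk '{print $1 "\t" $3}' "$sample")
rm -f "$sample"
printf '%s\n' "$fields"
```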

As the example above shows, every field on a line has a variable name: $1, $2, and so on. In the example, dmtsai is $1 because it is the first field, and 192.168.1.12 is the third field, so it is $3. There is one more variable, $0, which stands for the whole line. In the example above, $0 for the first line is the entire "dmtsai pts/0 ....." line. For the four lines above, awk therefore works like this:

1. Read the first line and fill in the variables $0, $1, $2, and so on.
2. Decide, according to the condition, whether the following action needs to run.
3. Carry out all the actions and conditions.
4. If more lines remain, repeat steps 1-3 until all the data has been read.

These steps show that awk processes one line at a time, and the smallest unit it works on within a line is the field. So how does awk know which line it is on, and how many fields that line has? That is what awk's built-in variables are for:

Variable name    Meaning
NF               Total number of fields in each line ($0)
NR               Number of the line awk is currently processing
FS               Current field separator; the default is the space character

Continuing with the example above: to list the account on each line along with the number of the line being processed and the number of fields on that line, do the following. Note that awk's actions are enclosed in single quotes, so any literal text passed to print, as well as the printf formats mentioned in the previous section, must be enclosed in double quotes.

[root@linux ~]# last | awk '{print $1 "\t lines: " NR "\t columns: " NF}'
dmtsai   lines: 1   columns: 10
root     lines: 2   columns: 9
reboot   lines: 3   columns: 9
dmtsai   lines: 4   columns: 10

Can you see the difference between NR and NF now? Next, let's talk about the so-called "conditions".
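Because the output of last differs from machine to machine, NR and NF can also be checked against fixed input. A sketch using two of the sample lines from above; the output labels are made up for illustration.

```shell
# Fixed input so the field counts are reproducible.
sample=$(mktemp)
cat > "$sample" <<'EOF'
dmtsai pts/0 192.168.1.12 Mon Aug 22 09:40 still logged in
root tty1 Mon Aug 15 11:38 - 11:39 (00:01)
EOF

# NR is the number of the line being processed, NF the number of fields on it.
counts=$(awk '{print $1 " line=" NR " fields=" NF}' "$sample")
rm -f "$sample"
printf '%s\n' "$counts"
```

The lone "-" between the two times counts as a field of its own, which is why the second line has nine fields.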

Logical operators in awk

Conditions naturally call for some logical operators, such as the following:

Operator    Meaning
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to
==          Equal to
!=          Not equal to

Pay particular attention to ==. In logical operations, that is, comparisons such as greater than, less than, or equal to, equality is written ==. A single = assigns a value, as when setting a variable. Now let's put a logical test to use. In /etc/passwd the fields are separated by colons (:). Suppose I want the entries whose third field is less than 10, and I want only the account and the third field listed. That can be done like this:

[root@linux ~]# cat /etc/passwd | \
> awk '{FS=":"} $3 < 10 {print $1 "\t" $3}'
root:x:0:0:root:/root:/bin/bash
bin     1
daemon  2
.....(omitted below).....

Interesting! But why is the first line not displayed correctly? Because when the first line was read, the variables $1, $2, and so on were still split on whitespace, the default. Although we set FS=":", it only takes effect from the second line on. What to do? Set awk's variables in advance with the keyword BEGIN, like this:

[root@linux ~]# cat /etc/passwd | \
> awk 'BEGIN {FS=":"} $3 < 10 {print $1 "\t" $3}'
root    0
bin     1
daemon  2
.....(omitted below).....
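The two placements of FS can be compared on a few fixed passwd-style lines. A sketch with invented sample entries; with gawk, the first command is expected to print the whole first line unsplit, though the exact behavior of a late FS assignment may vary between awk implementations.

```shell
# Three passwd-style lines; the third has a UID of 12 and should be filtered out.
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
EOF

# FS set inside the main action: the first line has already been split on whitespace.
late=$(awk '{FS=":"} $3 < 10 {print $1 "\t" $3}' "$sample")

# FS set in BEGIN: it is in effect before the first line is read.
early=$(awk 'BEGIN {FS=":"} $3 < 10 {print $1 "\t" $3}' "$sample")
rm -f "$sample"

printf 'FS set late:\n%s\n' "$late"
printf 'FS set in BEGIN:\n%s\n' "$early"
```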

Neat, isn't it? Besides BEGIN there is also END. And what if you want awk to do some calculation? Take the following example. Suppose I have a salary table like this:

Name     1st     2nd     3th
VBird    23000   24000   25000
DMTsai   21000   20000   23000
Bird2    43000   42000   41000

How do I calculate each person's total and format the output as well? Save the data above in a file named pay.txt, then:

[root@linux ~]# cat pay.txt | \
> awk 'NR==1 {printf "%10s %10s %10s %10s %10s\n", $1, $2, $3, $4, "Total"}
NR>=2 {total = $2 + $3 + $4
printf "%10s %10d %10d %10d %10.2f\n", $1, $2, $3, $4, total}'
      Name        1st        2nd        3th      Total
     VBird      23000      24000      25000   72000.00
    DMTsai      21000      20000      23000   64000.00
     Bird2      43000      42000      41000  126000.00

A few important points about the example above:

Within an action, that is, inside {}, multiple commands are separated by a semicolon (;) or by pressing [Enter]. In the action after NR>=2 above, for instance, the total = ... command computes the sum and is followed by a printf command that formats the output.

In logical operations, "equal to" must be written with two equal signs: ==.

When formatting output with printf, be sure to include \n in the format string to end the line.

Unlike bash shell variables, awk variables are used directly, without the $ sign.
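These points can be seen together in a compact variant of the salary calculation. A sketch that writes the sample table to a temporary file instead of pay.txt and puts both statements on one line, separated by a semicolon; the simplified output format is chosen here for brevity.

```shell
# The salary table from the example, written to a temporary file.
pay=$(mktemp)
cat > "$pay" <<'EOF'
Name 1st 2nd 3th
VBird 23000 24000 25000
DMTsai 21000 20000 23000
Bird2 43000 42000 41000
EOF

# Two statements in one action, separated by ";"; total is used without a $ sign,
# == would be the comparison operator, and \n ends each output line.
totals=$(awk 'NR >= 2 {total = $2 + $3 + $4; printf "%s %.2f\n", $1, total}' "$pay")
rm -f "$pay"
printf '%s\n' "$totals"
```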

awk can help with a lot of everyday work and is really handy. Its output is often formatted with printf, so it pays to be familiar with printf. In addition, awk's action {} also supports if (condition). For example, the command above can be rewritten like this:

[root@linux ~]# cat pay.txt | \
> awk '{if (NR==1) printf "%10s %10s %10s %10s %10s\n", $1, $2, $3, $4, "Total"}
NR>=2 {total = $2 + $3 + $4
printf "%10s %10d %10d %10d %10.2f\n", $1, $2, $3, $4, total}'

Compare the two commands carefully to see how the two syntaxes differ. Personally I prefer the first form, because it is more consistent! ^_^

That covers the basic use of sed and awk. If you have run into similar questions, the analysis above may serve as a reference.
