How to use the sed and gawk editors

Updated 2025-03-26. From: SLTechnology News&Howtos (Shulou.com)

This article explains how to use the sed and gawk editors.

The sed editor processes data as it reads it. The s command substitutes text, the i command inserts text, the a command appends text, the c command changes lines, the d command deletes lines, the y command transforms characters, the p command prints lines, the = command prints line numbers, and the l command lists lines with non-printable characters shown in a visible form. The w command writes lines to a specified file, and the r command reads content from a specified file.

The sed editor also supports line addressing, and most of the commands above can be combined with an address to restrict which lines they act on.

The gawk editor is the GNU implementation of the Unix awk tool. It is powerful, but some Linux distributions do not install it by default; on yum-based systems it can be installed with the yum install gawk command. The gawk editor provides a programming language, not just editor commands.

19.1 text processing

With sed and gawk you can automatically format, insert, modify, or delete text in a file without opening an interactive editor.

19.1.1 sed Editor

The sed editor is a stream editor. It processes the data in a data stream according to commands that are either entered directly on the command line or stored in a file.

The sed editor reads one line of data, applies all of the commands to it, outputs the result, then reads the next line and repeats; the command terminates when all the data has been read. The original file is not modified: the processed data is written to STDOUT.

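The basic format of the sed command is sed options script file.

The options used in this article are:

* -e script, which adds the commands in script to the commands run on the input
* -f file, which adds the commands stored in file to the commands run on the input
* -n, which suppresses the default output of every line, so that only explicit print commands produce output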

19.1.1.1 define editor commands on the command line

By default, the sed editor applies the specified commands to STDIN, so data can be piped directly into the sed editor for processing.

For example, the output of an echo statement can be passed to sed through the | pipe. The s command then replaces text matching the first pattern with the second string; both are specified between the slashes.
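As a minimal sketch of this, the sample sentence below is made up for illustration. The output of echo is piped into sed, and the s command swaps one string for another.

```shell
# Pipe text into sed; the s command replaces the first match on each line.
result=$(echo "This is a test" | sed 's/test/big test/')
echo "$result"
```

The original text is untouched; only the stream that sed writes to STDOUT carries the change.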

19.1.1.2 use multiple editor commands on the command line

To run multiple commands from the sed command line, use the -e option (sed -e).

With the -e option, multiple commands only need to be separated by semicolons. Note that no space should appear between the end of a command and the semicolon.

If you do not want to use semicolons, you can use the secondary prompt of the bash shell and put each command on its own line.

In this mode, there is no need to add a semicolon at the end of each command.
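Both styles can be sketched as follows. The sample sentence and the replacement words are assumptions chosen for the demo.

```shell
# Two substitutions in one invocation, separated by a semicolon.
one_line=$(echo "The quick brown fox jumps over the lazy dog" | sed -e 's/brown/green/; s/dog/cat/')
echo "$one_line"

# The same script spread over several lines; no semicolons are needed.
multi_line=$(echo "The quick brown fox jumps over the lazy dog" | sed -e '
s/brown/green/
s/dog/cat/')
echo "$multi_line"
```

Typed interactively, the second form is what the bash secondary prompt (>) produces after the opening single quote.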

19.1.1.3 read editor commands from a file

With the -f option (sed -f), sed reads its commands from a file.

In a script file, each command goes on its own line and no semicolons are needed. The .sed suffix is not mandatory; it simply keeps sed script files from being confused with other files.
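A small sketch of the -f option follows. The file name script.sed and the temporary directory are hypothetical and exist only to keep the demo self-contained.

```shell
# Create a throwaway directory holding a hypothetical script file.
workdir=$(mktemp -d)
cat > "$workdir/script.sed" <<'EOF'
s/brown/green/
s/fox/elephant/
s/dog/cat/
EOF

# -f tells sed to read its commands from the file.
from_file=$(echo "The quick brown fox jumps over the lazy dog" | sed -f "$workdir/script.sed")
echo "$from_file"
rm -r "$workdir"
```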

19.1.2 gawk Editor

The gawk editor provides a programming-language-like environment that makes it easier to modify and reorganize data in files.

Some Linux distributions do not install gawk by default; if it is missing on a yum-based system, install it with the yum install gawk command.

Use the yum info gawk command to view the package details; a source field showing installed indicates that gawk is installed on the current system.

You can also use the whereis gawk command to check whether gawk exists on the current system.

The gawk editor is the GNU version of the Unix awk editor; it provides a programming language, not just editor commands.

The power of the gawk editor is that you can write scripts that read lines of text, process the data before displaying it, and create any type of output report.

19.1.2.1 gawk command format

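The basic format of the gawk command is gawk options '{program}' file. The program must be enclosed in curly braces, and on the command line the whole script is wrapped in single quotes.

The options used in this article are -F fs, which sets the field delimiter, and -f file, which reads the program from a file.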

19.1.2.2 read a script from the command line

By default, the gawk editor receives data from STDIN.

For example, when gawk receives the output of an echo command through a pipe and its program is a print statement for Hello World, the Hello World statement is printed on the console.

If you run the gawk command directly, without piped input or a file, it waits for user input.

For example, after starting gawk, typing 1 and pressing Enter makes the console print Hello World; typing 2 and pressing Enter prints Hello World again, and so on. The gawk editor keeps listening for input until you exit manually.

You can force gawk to exit with Ctrl + C, which leaves a key mark on the terminal. The cleaner way is Ctrl + D, which signals the end of input: gawk then exits normally and no key mark is shown.
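The behavior can be sketched non-interactively. The fallback to plain awk is an assumption added so the example also runs where gawk is not installed; the program only uses features common to both.

```shell
# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# The program ignores the input text and prints a fixed string for each line read.
hello=$(echo "any input" | "$AWK" '{print "Hello World"}')
echo "$hello"

# Two input lines produce two output lines; end of input plays the role of Ctrl + D.
twice=$(printf '1\n2\n' | "$AWK" '{print "Hello World"}')
echo "$twice"
```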

19.1.2.3 using data field variables

When processing data, the gawk editor automatically assigns a variable to each data field separated by the field delimiter:

$0 represents the entire line of text

$1 represents the first data field

$2 represents the second data field

and so on

The default field delimiter for gawk is any whitespace character, such as a space or tab.

For example, if each line of data contains one space, the line is split into two fields, and print $1 outputs the first field of each line.

Use the -F option (gawk -F) to change the field delimiter.

For example, gawk -F: sets the field delimiter to a colon, so print $1 outputs the first field of each line of the passwd file. Because that produces a lot of output, it can be piped into tail -n 5 to show only the last five lines.
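A short sketch of both delimiters follows. The sample lines, including the passwd-style records, are invented for the demo rather than read from a real /etc/passwd, and the awk fallback is an assumption for portability.

```shell
# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# Default delimiter is whitespace, so $1 is the first word of each line.
first_fields=$(printf 'One line\nTwo lines\nThree lines\n' | "$AWK" '{print $1}')
echo "$first_fields"

# -F: switches the delimiter to a colon, so $1 is the user name field.
users=$(printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n' \
    | "$AWK" -F: '{print $1}')
echo "$users"
```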

19.1.2.4 using multiple commands in a script

The gawk editor allows multiple commands to be combined into a complete program. As in sed, multiple commands are separated by semicolons.

For example, a program can first assign not is to the third field and then use the print $0 command to output the entire modified line.

Writing multiple commands at the secondary prompt, one per line, is also supported.
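As an illustrative sketch, with a sample sentence and replacement word that are assumptions, both styles look like this:

```shell
# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# Assign a new value to the third field, then print the rebuilt line.
changed=$(echo "This is a test" | "$AWK" '{$3="another"; print $0}')
echo "$changed"

# The same program with one command per line instead of a semicolon.
changed_multi=$(echo "This is a test" | "$AWK" '{
$3="another"
print $0}')
echo "$changed_multi"
```

Assigning to a field makes gawk rebuild $0 with single spaces between fields, which is why print $0 shows the modified line.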

19.1.2.5 read a script from a file

The gawk editor allows the program to be stored in a file, which is then passed with the -f option.

Writing multiple commands in a script file is convenient: only the curly braces are needed around the commands, and the single quotation marks are dropped.
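A sketch of a program file follows. The file name script.gawk, the variable text, and the sample record are hypothetical; the awk fallback is an assumption for portability.

```shell
# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# Hypothetical program file: curly braces only, no single quotes.
workdir=$(mktemp -d)
cat > "$workdir/script.gawk" <<'EOF'
{
    text = "has the home directory"
    print $1, text, $6
}
EOF

report=$(echo 'root:x:0:0:root:/root:/bin/bash' | "$AWK" -F: -f "$workdir/script.gawk")
echo "$report"
rm -r "$workdir"
```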

19.1.2.6 run the script before processing the data

The gawk editor can control when script commands are run. By default, the script commands run once for each line of text read. However, the BEGIN keyword forces gawk to run a specified script before reading any data.

Without BEGIN, a gawk command that prints Hello World produces output only after it receives a line of input. With a BEGIN block alone, gawk prints Hello World immediately and does not wait for input.

This is useful for printing header information before the main output.

19.1.2.7 run the script after processing the data

Use the END keyword to make gawk run a specified script after all the data has been read.
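The BEGIN and END blocks can be sketched together. The header, footer, and data lines are invented for the demo, and the awk fallback is an assumption for portability.

```shell
# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

# A program with only a BEGIN block prints at once and reads no input.
banner=$("$AWK" 'BEGIN {print "Hello World"}' < /dev/null)
echo "$banner"

# BEGIN runs before the first line, the middle block once per line, END after the last line.
report=$(printf 'Line 1\nLine 2\n' \
    | "$AWK" 'BEGIN {print "The data file contents:"} {print $0} END {print "End of file"}')
echo "$report"
```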

19.1.2.8 specify the field delimiter through the FS variable

When the script is stored in a file, you can set the special variable FS to specify the field delimiter, typically in a BEGIN block.

If such a script prints a header in its BEGIN block and the output is piped through tail -n 5, the header is not shown. Do you know why?

Because tail -n 5 outputs only the last five lines, the header printed at the start is cut off. This is not a bug in the script.
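A reduced sketch of this effect follows, using three invented records and tail -n 2 instead of a real passwd file and tail -n 5. The file name script.gawk and the header text are hypothetical.

```shell
# Assumption: fall back to plain awk when gawk is not installed.
AWK=$(command -v gawk || command -v awk)

workdir=$(mktemp -d)
cat > "$workdir/script.gawk" <<'EOF'
BEGIN {
    print "User names:"
    FS = ":"
}
{
    print $1
}
EOF

records='root:x:0:0\ndaemon:x:1:1\nbin:x:2:2\n'
# Full output: the header comes first, then one user name per record.
full=$(printf "$records" | "$AWK" -f "$workdir/script.gawk")
# Piped through tail, only the last lines survive, so the header disappears.
tailed=$(printf "$records" | "$AWK" -f "$workdir/script.gawk" | tail -n 2)
echo "$full"
echo "$tailed"
rm -r "$workdir"
```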

19.2 sed editor basics

This section introduces some commonly used sed commands.

19.2.1 more substitution options

Several options make the s command more flexible when substituting text.

19.2.1.1 substitution flags

By default, the substitute command replaces only the first match on each line; any further matches on the same line are left unchanged.

For example, if each line of the target text contains test twice and the s command replaces test with trail, only the first test on each line is replaced and the second remains.

Substitution flags, placed after the closing delimiter of the s command, solve this. There are four of them:

1. a number, which indicates which match on the line to replace

2. g, which replaces every match

3. p, which prints the line on which a substitution was made

4. w file, which writes the substituted lines to a file

The second flag, g, is also called global substitution. With it, every test in the specified file is replaced with trail.

The first flag, a number, selects which match on each line is replaced.

The third flag, p, prints the line whose content was replaced.

For example, suppose the substitution matches only the second line of the data. Without the p flag, sed outputs every line once. With the p flag, sed still outputs every line, and the substituted second line is output one extra time; that is the effect of the p flag.

For this reason the p flag is generally combined with the -n option (sed -n). The -n option suppresses the default output of sed, so together with the p flag only the lines on which a substitution was made are displayed.

The fourth flag, w, writes the lines whose content was replaced to the specified file.

After the command completes, the file (result.txt in this example) contains only the substituted second line.
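The four flags can be sketched side by side. The sample sentence is an assumption, and result.txt is written to a temporary directory so the demo leaves nothing behind.

```shell
text='This is a test of the test script.'

# No flag: only the first match on the line is replaced.
first=$(echo "$text" | sed 's/test/trail/')
# Numeric flag: replace only the second match.
second=$(echo "$text" | sed 's/test/trail/2')
# g flag: replace every match.
all=$(echo "$text" | sed 's/test/trail/g')
# p flag with -n: show only the lines where a substitution happened.
printed=$(printf 'no match here\n%s\n' "$text" | sed -n 's/test/trail/p')

# w flag: write the substituted lines to a file.
workdir=$(mktemp -d)
printf 'no match here\n%s\n' "$text" | sed "s/test/trail/w $workdir/result.txt" > /dev/null
saved=$(cat "$workdir/result.txt")
rm -r "$workdir"

printf '%s\n' "$first" "$second" "$all" "$printed" "$saved"
```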

19.2.1.2 substitution delimiter

When the text to substitute contains the forward slash (/), which is itself the delimiter of the s command, every slash in the pattern and the replacement must be escaped with a backslash, which is tedious and hard to read.

In that case you can use another character as the delimiter of the substitution, such as the exclamation point (!).
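The difference is easiest to see with a path. In this sketch the shell paths are sample values only.

```shell
# With / as the delimiter, every slash in the paths needs a backslash.
escaped=$(echo '/bin/bash' | sed 's/\/bin\/bash/\/bin\/csh/')
# With ! as the delimiter, the paths can be written as they are.
readable=$(echo '/bin/bash' | sed 's!/bin/bash!/bin/csh!')
echo "$escaped"
echo "$readable"
```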

19.2.2 using addresses

By default, a sed command acts on all lines of the data. To make a command act on a specific line or group of lines, use line addressing.

There are two ways of line addressing in the sed editor:

* specify line intervals in numeric form

* filter lines with a text pattern

19.2.2.1 numeric line addressing

The sed editor numbers the first line of the target text as 1, the second line as 2, and so on. Numeric line addressing has three forms:

1. 2s, which affects only the second line

2. 2,3s, which affects the second through third lines

3. 2,$s, which affects the second line through the last line; the dollar sign ($) stands for the last line

With the first form, only the second line of data changes.

With the second form, the second and third lines change.

With the third form, every line from the second to the last changes.
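The three forms can be sketched on four invented sample lines:

```shell
data=$(printf 'The dog 1\nThe dog 2\nThe dog 3\nThe dog 4\n')

# Single line: only line 2 is changed.
single=$(echo "$data" | sed '2s/dog/cat/')
# Range: lines 2 through 3 are changed.
range=$(echo "$data" | sed '2,3s/dog/cat/')
# Open-ended range: line 2 through the last line ($).
to_end=$(echo "$data" | sed '2,$s/dog/cat/')

printf '%s\n---\n%s\n---\n%s\n' "$single" "$range" "$to_end"
```

The single quotes matter here: they stop the shell from treating $s as a variable.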

19.2.2.2 using a text pattern filter

The sed editor can also restrict a substitution to lines that contain specified text, written as /pattern/ before the command.

For example, sed first finds the lines of the target text that contain asing1elife and replaces My with He on those lines; lines without asing1elife are not affected.

This form becomes more powerful when combined with regular expressions.
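A sketch of pattern addressing follows. The two data lines are invented; only the name asing1elife and the My to He substitution come from the text above.

```shell
data=$(printf 'asing1elife: My editor is sed\nguest: My editor is vi\n')

# Plain text pattern: only lines containing asing1elife are changed.
filtered=$(echo "$data" | sed '/asing1elife/s/My/He/')
echo "$filtered"

# Regular expression: ^guest matches only lines that start with guest.
anchored=$(echo "$data" | sed '/^guest/s/My/He/')
echo "$anchored"
```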

19.2.2.3 Command combination

To run multiple commands on the same addressed line or lines, wrap the commands in curly braces, one command per line.

For example, sed can address the second line and then perform two substitutions on it.
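As a sketch with invented sample lines, two substitutions grouped under the address 2 look like this:

```shell
# Both substitutions apply to line 2 only; lines 1 and 3 pass through unchanged.
grouped=$(printf 'The quick brown fox\nThe quick brown fox\nThe quick brown fox\n' | sed '2{
s/quick/slow/
s/brown/green/
}')
echo "$grouped"
```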

19.2.3 Delete lines

Use the d command to delete the lines matched by the address; the d command follows the same addressing rules as the s command.

Keep two points in mind:

1. Only the stream output is affected, not the original file

2. An address must be given, otherwise every line of the stream is deleted

The d command accepts a single line number, a range of line numbers, a range from a start line to the last line ($), or a text pattern.

A range can also be given as two text patterns.

However, use this form with caution: the first pattern match turns line deletion on, and the second pattern match turns it off. If the second pattern is never matched, all the remaining lines are deleted because deletion is never turned off.

Likewise, if text matching the first pattern appears again later in the data, sed turns line deletion on again at that point.
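The addressing forms of d, including the pattern-range pitfall, can be sketched on five invented lines:

```shell
data=$(printf 'line 1\nline 2\nline 3\nline 4\nline 5\n')

one=$(echo "$data" | sed '3d')             # single line
span=$(echo "$data" | sed '2,3d')          # numeric range
tail_end=$(echo "$data" | sed '3,$d')      # from line 3 to the last line
by_text=$(echo "$data" | sed '/line 2/d')  # lines matching a pattern
between=$(echo "$data" | sed '/2/,/4/d')   # from first pattern to second pattern
# The end pattern never matches, so deletion is never turned off.
runaway=$(echo "$data" | sed '/2/,/9/d')

printf '%s\n---\n' "$one" "$span" "$tail_end" "$by_text" "$between" "$runaway"
```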

19.2.4 insert and append text

The i command of the sed editor adds a new line before the specified line, and the a command adds a new line after the specified line.

Note that the insert and append commands use a backslash (\), while the substitute command uses forward slashes (/).

To insert or append multiple lines of text at once, end every line of the new text except the last with a backslash (\).
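A sketch of i and a follows, written in the multiline form (backslash, then the new text on the next line), which is assumed here because it is the most portable spelling. The sample lines are invented.

```shell
# i: put the new text before line 2.
inserted=$(printf 'line 1\nline 2\n' | sed '2i\
new line')

# a: put the new text after line 2.
appended=$(printf 'line 1\nline 2\n' | sed '2a\
new line')

# Several new lines: every new line except the last ends with a backslash.
multi=$(printf 'line 1\nline 2\n' | sed '1i\
first new line\
second new line')

printf '%s\n---\n%s\n---\n%s\n' "$inserted" "$appended" "$multi"
```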

19.2.5 modify lines

The c command of the sed editor replaces the entire content of the specified line.

Note that the change command also uses a backslash (\).
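A minimal sketch of the c command on invented sample lines:

```shell
# c: replace the whole of line 2 with new text.
changed=$(printf 'line 1\nline 2\nline 3\n' | sed '2c\
changed line')
echo "$changed"
```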

19.2.6 transform command

The y command of the sed editor operates on single characters. Its format is sed '[address]y/inchars/outchars/': the first character in inchars is replaced by the first character in outchars, the second by the second, and so on.

The transform command is always global: every occurrence of a matching character on the line is converted, without any g flag as the substitute command would need. This is not optional; the conversion cannot be limited to a particular occurrence.

Note that inchars and outchars must have the same length, otherwise sed reports an error.
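Both points can be sketched as follows. The sample sentence and character sets are assumptions, and the error case is detected through the exit status of sed rather than its message.

```shell
# Each 1 becomes 7, each 2 becomes 8, each 3 becomes 9, everywhere on the line.
mapped=$(echo 'This 1 is a test of 1 try.' | sed 'y/123/789/')
echo "$mapped"

# inchars and outchars of different lengths make sed fail.
if echo 'abc' | sed 'y/abc/xy/' > /dev/null 2>&1; then
    mismatch=accepted
else
    mismatch=rejected
fi
echo "$mismatch"
```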

19.2.7 printing revisited

In addition to the p flag of the substitute command, which prints the substituted line, three commands print information from the data stream:

The p command is used to print lines of text

The = command is used to print line numbers

The l command (lowercase L) is used to list lines

19.2.7.1 print lines

The p command prints the content of the specified lines; it is best used together with the -n option (sed -n).

Without -n, the default output of sed prints the complete stream and the lines matched by the p command are printed a second time. With -n masking the default output, only the output of the p command appears.

The p command also supports line addressing.

19.2.7.2 print line number

The sed editor keeps track of the number of each line of the target text, and the = command prints it.
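The p and = commands can be sketched together on three invented lines:

```shell
data=$(printf 'line 1\nline 2\nline 3\n')

# Without -n the matching line appears twice: default output plus p.
doubled=$(echo "$data" | sed '/line 2/p')
# With -n only the addressed lines are printed.
printed=$(echo "$data" | sed -n '2,3p')
# = prints the line number, p prints the line itself.
numbered=$(echo "$data" | sed -n '/line 3/{
=
p
}')

printf '%s\n---\n%s\n---\n%s\n' "$doubled" "$printed" "$numbered"
```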

19.2.7.3 list lines

The l command prints the data stream with otherwise unprintable ASCII characters shown in a visible form; for example, a tab appears as \t and the end of each line is marked with $.
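A one-line sketch, using an invented input that contains a tab:

```shell
# l shows the tab as \t and marks the end of the line with $.
listed=$(printf 'a\tb\n' | sed -n 'l')
echo "$listed"
```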

19.2.8 use sed to process files

19.2.8.1 write to file

The w command writes the addressed lines of the target data to the specified file.

19.2.8.2 read data from a file

The r command inserts the contents of the specified file into the data stream after the addressed line.
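Both commands can be sketched with throwaway files. The names data.txt, out.txt, and extra.txt are hypothetical and live in a temporary directory.

```shell
workdir=$(mktemp -d)
printf 'line 1\nline 2\nline 3\n' > "$workdir/data.txt"
printf 'inserted text\n' > "$workdir/extra.txt"

# w: write lines 1 and 2 to out.txt; -n keeps the terminal quiet.
sed -n "1,2w $workdir/out.txt" "$workdir/data.txt"
written=$(cat "$workdir/out.txt")

# r: insert the contents of extra.txt after line 2 of the stream.
merged=$(sed "2r $workdir/extra.txt" "$workdir/data.txt")

printf '%s\n---\n%s\n' "$written" "$merged"
rm -r "$workdir"
```

As with the other commands, data.txt itself is not modified; only the output stream and out.txt change.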
