2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Awk command
awk is a language for processing text files and a powerful general-purpose tool for text analysis.
awk handles text and data by reading the input line by line, finding the lines that match a given pattern, and then acting on them.
Output specific fields of matching lines in a file
It is very powerful and has many uses. Here I focus mainly on the following scenario:
Read the text line by line, select lines according to a rule, split each selected line into fields using whitespace as the default delimiter, and output a specific field. (The fields can be processed in many ways; here the goal is simply to print one of them.)
$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
$ awk '/local/ {print $1}' /etc/hosts
127.0.0.1
::1
$
This approach is well suited to custom monitoring keys in Zabbix. For example, extract the memory usage from the output of the free command:
$ free
              total        used        free      shared  buff/cache   available
Mem:        1855432      320688     1238808       10612      295936     1495432
Swap:       2093052           0     2093052
$ free | awk '/^Mem:/ {print $3}'
320688
Grep command
For the same effect, you can filter out the desired lines with the grep command and then cut out the columns with the cut command.
With awk, however, it takes just one step.
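As a rough side-by-side sketch of the two approaches, the snippet below runs both on the same data. The sample text is a hypothetical stand-in for /etc/passwd, and the variable names are made up for illustration.

```shell
# Hypothetical sample data standing in for /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# Two steps: grep selects the line, cut extracts the 7th colon-separated column.
two_step=$(printf '%s\n' "$sample" | grep '^root' | cut -d: -f7)

# One step: awk selects the line and prints its last field.
one_step=$(printf '%s\n' "$sample" | awk -F: '/^root/ {print $NF}')

echo "$two_step"
echo "$one_step"
```

Both pipelines print the login shell of the root entry; the awk version does the selection and the column extraction in a single process.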
Built-in variable
The built-in variables are listed first, and some of them will be used later.
Awk built-in variables:
ARGC: number of command-line arguments
ARGV: array of command-line arguments
ENVIRON: array of system environment variables
FILENAME: name of the file awk is currently reading
FNR: record number within the current file
FS: input field separator, equivalent to the -F command-line option
IGNORECASE: if set to 1, regular expressions ignore case (a gawk extension)
NF: number of fields in the current record
NR: number of records read so far
OFS: output field separator
ORS: output record separator
RS: input record separator
$0: the whole record
$1: the first field of the current line; $2 is the second field, and so on
$NF: since NF is the number of fields, $NF is the value of the last field
$(NF-1): the second-to-last field, and so on; the parentheses are needed because $NF-1 means the value of the last field minus 1
Some of these variables are used directly. For example, $1 and $NF appear in the examples below and are easy to understand.
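A minimal sketch of the directly used variables, assuming a hypothetical two-line input with space-separated fields:

```shell
# Hypothetical two-line input; fields are separated by spaces.
input='alpha beta gamma
one two three four'

# NR: record number, NF: field count,
# $(NF-1): second-to-last field, $NF: last field.
summary=$(printf '%s\n' "$input" | awk '{print NR, NF, $(NF-1), $NF}')

echo "$summary"
```

Each output line shows the line number, the number of fields on that line, and its last two fields.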
Others change how awk behaves and must be assigned a value, which can be done in several ways.
FS, for example, specifies the input delimiter. The default is whitespace, but you can set FS to another value yourself. The -F option does the same thing, so the delimiter can also be set on the command line.
Variables that have no dedicated option can only be set by assigning to them.
Delimiters and the ways of assigning variables are covered later; the assignment methods are described in the section on custom variables.
Separator
By default awk splits fields on whitespace. Use the -F option to set a custom delimiter:
$ grep -e "^root" /etc/passwd
root:x:0:0:root:/root:/bin/bash
$ awk -F: '/^root/ {print $1,$NF}' /etc/passwd
root /bin/bash
$
The delimiter is specified here as a colon.
Multiple delimiters
The default already handles more than one delimiter, since spaces and tabs are both recognized. To specify several delimiters yourself, enclose all of them in square brackets (quoted, so the shell does not treat the brackets as a glob pattern):
$ echo "a-b_c=d-E_F=G" | awk -F '[-_=]' '{print $1,$2,$3,$4,$5,$6,$7}'
a b c d E F G
$
Filter consecutive delimiters
The -F option accepts a regular expression, and the square brackets are a regular expression character set. A problem appears when delimiters occur consecutively. The following example uses a comma and a space as delimiters, and they always appear together, so an empty field shows up between them:
$ echo "a, b, c" | awk -F '[, ]' '{print $1"-"$2"-"$3}'
a--b
$
A plus sign in a regular expression matches one or more occurrences; with it, the split works as intended:
$ echo "a, b, c" | awk -F '[, ]+' '{print $1"-"$2"-"$3}'
a-b-c
$
Special character separator
The following characters are special in regular expressions: $ ^ * ( ) [ ] ? . |
A single special character works as a separator on its own:
$ echo '1a$1b$1c' | awk -F'$' '{print $1"-"$2"-"$3}'
1a-1b-1c
$
If several characters together form the separator, the special characters among them must be escaped. For example, here the two-character string $1 is the delimiter. Without escaping, no split happens; with escaping, it works:
$ echo '1a$1b$1c' | awk -F'$1' '{print $1"-"$2"-"$3}'
1a$1b$1c--
$ echo '1a$1b$1c' | awk -F'\$1' '{print $1"-"$2"-"$3}'
1a-b-c
$
Here is a combination of several special characters:
$ echo 'a$|b$|c' | awk -F'\$\\|' '{print $1"-"$2"-"$3}'
a-b-c
$
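Backslash escaping inside -F can behave differently between awk implementations. One alternative, sketched here as an assumption rather than as the method used above, is to put each special character in a bracket expression, where it loses its special meaning:

```shell
# [$] matches a literal dollar sign, so "[$]1" is the two-character string "$1".
dollar_one=$(echo '1a$1b$1c' | awk -F'[$]1' '{print $1 "-" $2 "-" $3}')

# "[$][|]" matches the two-character string "$|".
dollar_pipe=$(echo 'a$|b$|c' | awk -F'[$][|]' '{print $1 "-" $2 "-" $3}')

echo "$dollar_one"
echo "$dollar_pipe"
```

Because no backslashes are involved, there is no question of how many levels of escaping the shell and awk each consume.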
Default delimiter
By default FS is a single space, which awk treats specially: fields are separated by runs of whitespace, and leading and trailing whitespace is ignored. Apart from that trimming, it behaves roughly like the following regular expression:
FS="[[:space:]]+"
As noted in the list of built-in variables, setting FS is equivalent to using the -F option.
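One difference between the default separator and an explicit whitespace regular expression shows up with leading blanks. A minimal sketch, assuming a hypothetical input line that starts with two spaces:

```shell
# Hypothetical input with leading spaces and repeated spaces between words.
line='  a  b'

# Default FS: leading whitespace is ignored, so there are 2 fields.
default_nf=$(printf '%s\n' "$line" | awk '{print NF}')

# Explicit regex FS: the leading spaces count as a separator,
# so an empty first field appears and there are 3 fields.
regex_nf=$(printf '%s\n' "$line" | awk -F'[ ]+' '{print NF}')

echo "$default_nf $regex_nf"
```

This is why switching from the default to a hand-written whitespace pattern can shift every field number by one on indented input.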
Formatted output with printf
Besides print, you can also use printf for formatted output. This section gives an example; for the format specifiers, refer to the printf function in C.
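As a small sketch of the most common specifiers, the snippet below formats a hypothetical two-line, colon-separated input: %-8s left-aligns a string in 8 columns, %3d right-aligns an integer in 3 columns, and %.2f prints a number with two decimals.

```shell
# Hypothetical input: name and a numeric id, separated by a colon.
formatted=$(printf 'root:0\ndaemon:2\n' |
  awk -F: '{printf("%-8s|%3d|%.2f\n", $1, $2, $2 / 3)}')

echo "$formatted"
```

Unlike print, printf does not append a newline, so the format string ends with \n.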
print is the usual way to produce output:
$ awk -F: '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:" $0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:bin:x:1:1:bin:/bin:/sbin/nologin
filename:/etc/passwd,linenumber:3,columns:7,linecontent:daemon:x:2:2:daemon:/sbin:/sbin/nologin
Compare that with the output formatted by printf:
$ awk -F: '{printf("filename:%s, linenumber:%3s,column:%3s,content:%3f\n", FILENAME,NR,NF,$0)}' /etc/passwd
filename:/etc/passwd, linenumber:  1,column:  7,content:0.000000
filename:/etc/passwd, linenumber:  2,column:  7,content:0.000000
filename:/etc/passwd, linenumber:  3,column:  7,content:0.000000
Note that %3f formats $0 as a number, so the non-numeric line content prints as 0.000000; %s would print the text itself.
BEGIN and END blocks
Typically, awk executes a script block of code once for each input line.
Sometimes, initialization code needs to be executed before awk starts processing the text in the input file. This requires defining a BEGIN block.
In addition, there is an END block that performs the final calculation or prints the summary information that should appear at the end of the output stream.
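The three kinds of block can be sketched together as follows, assuming a hypothetical input of three numbers, one per line:

```shell
report=$(printf '10\n20\n30\n' | awk '
BEGIN { print "start" }                        # runs once, before any input is read
      { total += $1 }                          # runs for every input line
END   { print "lines:", NR, "total:", total }  # runs once, after the last line
')

echo "$report"
```

In the END block NR still holds the number of the last record read, which makes it a convenient line count.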
Define built-in variables in a BEGIN block
Here two built-in variables are set in the BEGIN block:
$ echo "a, b, c" | awk 'BEGIN {FS="[, ]+"; OFS="-"} {print $1,$2,$3}'
a-b-c
$
FS is the input field delimiter and OFS is the output field delimiter. The earlier example achieved the same result without a BEGIN block.
The variables modified here are built-in, but the same method works for any variable, including custom ones. See the next section, "Awk custom variable".
The BEGIN block is the main example here. An END block can compute statistics and print a summary; it is not needed for now, so it is skipped.
Awk custom variable
In addition to built-in variables, you can define and use your own. This is very useful for flexible configuration, but there are some pitfalls when writing it yourself.
Defined at the end
Here the variable assignment happens after the BEGIN block has executed.
Write the assignments directly after the program:
$ echo | awk '{print key1,key2}' key1=v1 key2=V2
v1 V2
$
Variables set this way are not visible in the BEGIN block, because the BEGIN block runs before they are assigned. There are other ways to handle that case.
Also, standard input comes from a pipe here. If the input comes from a file, the file path goes at the very end, after the assignments.
Define in BEGIN block
Here the variables are assigned while the BEGIN block executes.
You can assign values to built-in variables in the BEGIN block, and to custom variables as well:
$ echo | awk 'BEGIN {key1="v1"; key2="value2"; OFS="_"} {print key1,key2}'
v1_value2
$
Use the -v option
With this method the variables are assigned before the BEGIN block executes:
$ echo | awk -v key1=V1 -v key2=value2 '{print key1,key2}'
V1 value2
$
For multiple variables, use -v multiple times.
Combining -v with assignments at the end:
$ echo | awk -v key1=v1 -v key2=v2 'BEGIN {print "BEGIN:" key1,key2} {print "ACTION:" key1,key2}' key1=VALUE1 key2=VALUE2
BEGIN:v1 v2
ACTION:VALUE1 VALUE2
$
The -v assignments happen first, and then the BEGIN block runs. Next come the trailing assignments, which replace the value of any variable with the same name, and only then are the lines processed. So the main action prints the values that were set later.
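The ordering can be checked with a single variable that is set both ways. This is a minimal sketch; the variable name who and its two values are hypothetical.

```shell
# who is set with -v (visible in BEGIN) and again as a trailing
# assignment (applied only after BEGIN, before the input is read).
order=$(echo | awk -v who=from_v '
BEGIN { print "BEGIN:" who }
      { print "MAIN:" who }
' who=from_operand)

echo "$order"
```

The BEGIN block sees the -v value, while the main action sees the value from the trailing assignment.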
Print environment variables
The best way is shown in a later section. The method here also works, but it is hard to read.
This part is written to understand how the shell parses command arguments and how some special cases are handled.
To print an environment variable directly, it looks like this:
$ echo | awk '{print "'"$PATH"'"}'
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
$
The quoting is hard to see: on each side of $PATH there is a double quote, a single quote, and another double quote.
Explanation
Take a simpler environment variable as an example:
$ echo $USER
root
$
Its value contains no spaces or special characters, so the innermost pair of double quotes can be removed:
$ echo | awk '{print "'$USER'"}'
root
$
There are two pairs of single quotes, so the program is split into two parts: '{print "' and '"}'. awk receives the text inside the two single-quoted parts.
What remains is $USER, which the shell expands first.
After the shell expands a variable, a value containing spaces would be split into separate words. Enclosing the environment variable in double quotes keeps its value together as a single field.
My understanding
The outermost quotes mark the boundaries of a string, but as long as the quoted parts are adjacent, the shell treats them as one string (one field). You can build a string from several quoted parts, provided there is no delimiter between them. The command then receives the whole thing as a single string (a single field).
The following demonstrates this with the echo command:
$ echo 'abc''def'
abcdef
$ echo 'abc'$USER'def'
abcrootdef
$ echo 'abc'"$USER"'def'
abcrootdef
$
Adding a for loop, demonstrate again:
$ HELLO='Hello World!'
$ for i in $HELLO; do echo $i; done
Hello
World!
$ for i in 'BEFOR'$HELLO'AFTER'; do echo $i; done
BEFORHello
World!AFTER
$ for i in 'BEFOR'"$HELLO"'AFTER'; do echo $i; done
BEFORHello World!AFTER
$
The outermost quotes only mark boundaries. Even with several pairs of quotes, everything is concatenated with no delimiter in between, so the command receives a single field.
The advantage is that you can use single quotes while leaving the parts the shell must expand outside them, so the shell still expands those parts normally.
To make sure the environment variable is still one field after expansion, enclose it in double quotes.
Then there are the double quotes after print inside the awk program. Quotes in awk do not mark word boundaries; they distinguish strings from variables. Without double quotes, the content is treated as a variable name. With double quotes, it is a string and is printed as-is.
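The effect of the inner double quotes can also be measured by counting how many words the shell produces. A minimal sketch, assuming a hypothetical variable greeting whose value contains a space:

```shell
greeting='Hello World!'

# Unquoted expansion: the value is split at the space, giving two words.
set -- 'BEFORE'$greeting'AFTER'
unquoted_count=$#

# Double-quoted expansion: adjacent quoted parts join into one word.
set -- 'BEFORE'"$greeting"'AFTER'
quoted_count=$#
quoted_value=$1

echo "$unquoted_count $quoted_count $quoted_value"
```

Here set -- replaces the positional parameters, so $# reports the number of words the shell passed on, which is exactly what a command such as awk would receive as arguments.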
Other writing methods
The following forms also work and help to understand the parsing. The first encloses the program in double quotes and escapes the inner ones. The second escapes each special character with a backslash; echo is used here to show the program text that the shell passes on:
$ echo | awk "{print \"$PATH\"}"
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
$ echo awk \{print\ \""$PATH"\"\}
awk {print "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin"}
$
Enclose the awk program in single quotes whenever possible, to keep the shell from interpreting its contents. The single-quoted form shown earlier is the best of these.
Reference a variable defined on the command line
Of the three ways of defining variables described earlier, two complete the definition outside the quoted program, so the shell can expand the value without interfering with the awk code:
$ echo | awk '{print path}' path="$PATH"
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
$ echo | awk -v path="$PATH" '{print path}'
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
$
First the environment variable is assigned to a custom variable on the command line, outside the quotes. Then the custom variable is used directly inside the quoted program.
To do this in a BEGIN block instead, refer to the approach in the previous section.
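The ENVIRON array from the list of built-in variables offers another route that needs no shell expansion inside or next to the program. A minimal sketch, assuming a hypothetical exported variable GREETING:

```shell
# GREETING is a made-up variable; it must be exported to appear in ENVIRON.
export GREETING='hello world'

# Passed in through -v, as in the examples above.
via_v=$(awk -v g="$GREETING" 'BEGIN {print g}')

# Read directly from the environment inside awk.
via_env=$(awk 'BEGIN {print ENVIRON["GREETING"]}')

echo "$via_v"
echo "$via_env"
```

Both work in a BEGIN block. One difference to keep in mind is that values passed with -v have backslash escape sequences interpreted, while values read from ENVIRON are taken literally.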
Awk advanced
Again, get the current memory usage from the free command:
$ free | awk '/^Mem:/ {print $3}'
335840
$
This uses a regular expression match, but awk has other syntax that can match more precisely.
Conditional restriction
Require the first field to equal a value:
$ free | awk '$1 == "Mem:" {print $3}'
335744
$
Restrict by line number:
$ free | awk 'NR == 2 {print $3}'
335796
$
Conditional statement
awk also provides control statements such as if, else, and while, but they are not covered in depth here. Here is an example of if.
The line is again selected by number, this time with an if statement:
$ free | awk '{if (NR == 2) print $3}'
335740
$
Regular match
~ is the operator that matches a regular expression, and !~ is the operator for not matching.
Match the first field:
$ free | awk '$1 ~ /Mem/ {print $3}'
335844
$
Another built-in variable related to regular expressions is IGNORECASE, a gawk extension. If it is set to 1, case is ignored:
$ free | awk '$1 ~ "mem" {print $3}' IGNORECASE=1
335708
$
As mentioned earlier, there are several ways to assign a value to a variable.
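The selection styles in this section can be compared on fixed input. This is a minimal sketch; the sample text is a hypothetical, shortened stand-in for the output of free.

```shell
# Hypothetical stand-in for the output of free.
sample='              total        used        free
Mem:        1855432      320688     1238808
Swap:       2093052           0     2093052'

by_field=$(printf '%s\n' "$sample" | awk '$1 == "Mem:" {print $3}')   # exact field comparison
by_line=$(printf '%s\n' "$sample" | awk 'NR == 2 {print $3}')         # line number
by_if=$(printf '%s\n' "$sample" | awk '{if (NR == 2) print $3}')      # if statement
by_regex=$(printf '%s\n' "$sample" | awk '$1 ~ /^Mem/ {print $3}')    # regex match on a field
not_mem=$(printf '%s\n' "$sample" | awk 'NR > 1 && $1 !~ /^Mem/ {print $1}')  # negated match

echo "$by_field $by_line $by_if $by_regex $not_mem"
```

The first four select the same line in different ways, and the last one uses !~ together with a line-number condition to pick the remaining data line.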
Print the 9x9 multiplication table
This impressive-looking one-liner is included as a finale:
$ seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
Regular expression suffix anchoring
While debugging mysql I ran into a problem: the regular expression was not precise enough and matched more than one line:
$ mysql -e 'SHOW GLOBAL STATUS' | awk '/^Com_select/ {print $0}'
Com_select 67679
$ mysql -e 'SHOW GLOBAL STATUS' | awk '/^Com_update/ {print $0}'
Com_update 1098
Com_update_multi 0
$ mysql -e 'SHOW GLOBAL STATUS' | awk '/^Com_delete/ {print $0}'
Com_delete 678
Com_delete_multi 0
$ mysql -e 'SHOW GLOBAL STATUS' | awk '/^Com_insert/ {print $0}'
Com_insert 38494
Com_insert_select 0
$
The fix is to add a suffix anchor:
$ mysql -e 'SHOW GLOBAL STATUS' | awk '/^Com_delete\>/ {print $0}'
Com_delete 708
$
The suffix (end-of-word) anchor is \> and, by the way, the prefix (start-of-word) anchor is \<. Both are GNU extensions supported by gawk.
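Where the GNU word anchors are not available, comparing the whole first field is one portable alternative. This is a sketch of that substitute technique, not the method used above, and the sample text is a hypothetical excerpt of the status output:

```shell
# Hypothetical excerpt of SHOW GLOBAL STATUS output.
status='Com_delete 678
Com_delete_multi 0'

# An exact string comparison on $1 cannot match Com_delete_multi.
exact=$(printf '%s\n' "$status" | awk '$1 == "Com_delete" {print $0}')

echo "$exact"
```

Since the comparison is on the complete field rather than a pattern, no anchoring is needed at either end.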