
How to use the grep, sed, and awk commands on Linux

2025-01-17 Update From: SLTechnology News&Howtos


This article explains how to use the Linux grep, sed, and awk commands. The material is detailed but easy to follow, the operations are quick to try, and it should serve as a useful reference. Let's take a look.

On Linux, grep, sed, and awk are known as the "three musketeers". Mastering these three tools can greatly improve your development efficiency. All three are based on regular expressions.

1. Standard regular expressions

Regular expression: REGular EXPression, abbreviated REGEXP.

Metacharacters:
  .     matches any single character
  []    matches any single character within the specified range
  [^]   matches any single character outside the specified range

Character classes: [:digit:], [:lower:], [:upper:], [:punct:], [:space:], [:alpha:], [:alnum:]
Note: a character class must itself be placed inside [], for example [[:digit:]].

Number of matches (greedy mode):
  *         matches the preceding character any number of times, including zero. Sample strings to try: a, b, ab, aab, acb, adb, amnb, aqb
  .*        any characters of any length
  \?        matches the preceding character 0 or 1 times
  \+        matches the preceding character at least once
  \{m,n\}   matches the preceding character at least m times and at most n times, for example \{1,\} or \{0,3\}
Note: a lower bound of 0 must be written out explicitly.

Position anchors:
  ^         anchors the beginning of the line; whatever follows it must appear at the start of the line
  $         anchors the end of the line; whatever precedes it must appear at the end of the line
  ^$        a blank line
  \< or \b  anchors the beginning of a word; whatever follows it must appear at the start of a word
  \> or \b  anchors the end of a word; whatever precedes it must appear at the end of a word

Grouping: \(\), for example \(ab\)*

Back-references:
  \1            refers to everything matched between the first left parenthesis and its corresponding right parenthesis
  \2, \3, ...   refer to the second, third, and later groups in the same way

As you can see, many symbols have to be escaped in standard regular expressions, which is inconvenient in day-to-day work. Extended regular expressions were introduced to address this.
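The rules above can be tried out with grep. The following is only a sketch: it assumes GNU grep, feeds in some of the sample strings from the list above, and uses variable names that are purely illustrative.

```shell
# Sample strings, one per line.
sample=$(printf '%s\n' a b ab aab acb adb amnb)

# 'a*b': zero or more "a" followed by "b"; ^ and $ force the whole line to match.
star=$(printf '%s\n' "$sample" | grep '^a*b$' | paste -sd ' ' -)

# 'a.*b': "a", then any characters of any length, then "b".
dotstar=$(printf '%s\n' "$sample" | grep '^a.*b$' | paste -sd ' ' -)

# '\(ab\)\1': a group followed by a back-reference to the text it matched.
backref=$(printf '%s\n' abab abba | grep '\(ab\)\1')

echo "a*b            -> $star"
echo "a.*b           -> $dotstar"
echo "back-reference -> $backref"
```

Anchoring with ^ and $ keeps the comparison honest: without them, grep would report any line that merely contains a match.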

2. Extended regular expressions

1) Character matching:
  .        any single character
  [abc]    any one of the characters a, b, c
  [^abc]   any character other than a, b, c
2) Number of matches (no escaping needed):
  *
  ?
  +        matches the preceding character at least once
  {m,n}
3) Position anchors: ^ $ \< \>
4) Grouping (no escaping needed):
  ()       grouping
  \1, \2, \3, ...
5) Or:
  |        or. C|cat matches C or cat, because | applies to the whole expression on each side.

As you can see, extended regular expressions drop most of the escape symbols, which greatly improves readability, especially when writing sed statements. It is recommended to prefer extended regular expressions.
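A short sketch of the extended syntax, assuming GNU grep with the -E option; the word list and variable names are made up for illustration.

```shell
words=$(printf '%s\n' C cat Cat dog)

# Alternation covers the whole expression on each side: "C" or "cat".
alt=$(printf '%s\n' "$words" | grep -E '^(C|cat)$' | paste -sd ' ' -)

# Grouping narrows the alternation to one character: "Cat" or "cat".
grp=$(printf '%s\n' "$words" | grep -E '^(C|c)at$' | paste -sd ' ' -)

# The same repetition written in both dialects; -c counts matching lines.
bre=$(printf '%s\n' a aa aaaa | grep -c '^a\{2,3\}$')
ere=$(printf '%s\n' a aa aaaa | grep -Ec '^a{2,3}$')

echo "C|cat   -> $alt"
echo "(C|c)at -> $grp"
echo "standard count: $bre, extended count: $ere"
```

Both counts agree because the two patterns describe the same set of strings; only the amount of escaping differs.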

3. The grep command family

3.1. Related commands

The grep command family consists of three commands, grep, egrep, and fgrep, which suit different scenarios:

  Command   Description
  grep      The native grep command; uses "standard regular expressions" as the matching criteria.
  egrep     The extended grep command, equivalent to grep -E; uses "extended regular expressions" as the matching criteria.
  fgrep     A simplified grep command that does not support regular expressions, but searches quickly and uses few system resources.

3.2. Usage

Syntax: grep [options] PATTERN [FILE...]

Options:
  -i        ignore case
  --color   highlight the matched string
  -v        show the lines that are not matched by the pattern
  -o        show only the string matched by the pattern
  -E        use extended regular expressions

PATTERN: the matching template, given as a string. It can be a plain string or a regular expression (standard or extended).

FILE: the file or files whose contents are to be searched.
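The options can be compared side by side on a few lines of made-up log text. This sketch assumes GNU grep; the -c option (count matching lines) is not in the list above and is used here only to keep the output short.

```shell
text=$(printf '%s\n' 'Error: disk full' 'warning: low memory' 'error: timeout after 30s')

# -i ignores case, so "Error" and "error" both match; -c prints the count.
ignore_case=$(printf '%s\n' "$text" | grep -ic 'error')

# -v keeps only the lines that do NOT match.
inverted=$(printf '%s\n' "$text" | grep -iv 'error')

# -o prints just the matched part; -E enables the unescaped + quantifier.
only_match=$(printf '%s\n' "$text" | grep -oE '[0-9]+')

echo "lines mentioning error: $ignore_case"
echo "other lines: $inverted"
echo "number found: $only_match"
```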

4. The sed command

4.1. Overview

sed stands for Stream EDitor. It is a stream editor that processes text line by line.

4.2. Basic syntax

sed [option] 'script' [input file]...

Options:
  -n   do not output the contents of the pattern space to stdout
  -e   specify multiple scripts in one sed command (multi-point editing)
  -f   read the sed script from a file that contains the editing commands
  -r   support extended regular expressions
  -i   edit the source file directly

The script part: an address followed by an editing command (similar to vim commands)

1) Empty address: edit the whole text
2) Single address:
  #                      a specific line number
  /pattern/              every line matched by the pattern
3) Address range:
  #,#
  #,+#
  #,/pattern/
  /pattern1/,/pattern2/
4) Step address:
  1~2                    start at line 1 and step forward 2 lines at a time (all odd lines)
  2~2                    all even lines
5) Editing commands:
  d      delete the whole line; d goes at the end of the script
  p      print the contents of the pattern space; p goes at the end of the script
  a      append text after the matching lines; \n supports multi-line appends. a goes after the address
  i      insert text before the matching lines. For example: sed '3i hello' xxx
  c      replace the matching lines with the given text. For example: sed '3c text' xxx replaces the third line with text. sed -i '/xyz/c helloworld' num.txt
  w      save the matching lines in the pattern space to the specified file. For example: sed -n '/^[^#]/w /tmp/demo' /etc/fstab saves the lines in /etc/fstab that do not begin with # to /tmp/demo
  r      read the contents of the specified file and append them after the matching lines of the current file; used for merging files
  !      negate the condition. Usage: address!command
  s///   search and replace. Flags: g (global replace), p (print the lines where the replacement succeeded)
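The address forms and a few of the editing commands can be checked on a five-line input. This is a sketch that assumes GNU sed (the 1~2 step address and the one-line form of i are GNU extensions); the input words and variable names are illustrative.

```shell
lines=$(printf '%s\n' one two three four five)

# Single address + d: delete line 2.
no_second=$(printf '%s\n' "$lines" | sed '2d' | paste -sd ' ' -)

# Address range with -n and p: print only lines 2 to 4.
range=$(printf '%s\n' "$lines" | sed -n '2,4p' | paste -sd ' ' -)

# Step address: start at line 1, then every 2nd line (the odd lines).
odd=$(printf '%s\n' "$lines" | sed -n '1~2p' | paste -sd ' ' -)

# Pattern address negated with !: delete every line NOT starting with "t".
starts_with_t=$(printf '%s\n' "$lines" | sed '/^t/!d' | paste -sd ' ' -)

# i: insert a line of text before line 3.
inserted=$(printf '%s\n' "$lines" | sed '3i hello' | paste -sd ' ' -)

# s with the g flag: replace every "-" on the line, not just the first.
replaced=$(echo 'a-b-c' | sed 's/-/+/g')

printf '%s\n' "$no_second" "$range" "$odd" "$starts_with_t" "$inserted" "$replaced"
```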

Replacement example: find the directory part of an input path

    echo "/var/log/messages" | sed -r 's@[^/]+/?$@@'

The output is /var/log/

4.3. Advanced sed usage: pattern space and hold space

Matching is carried out in the pattern space. By default, when a line does not match, its text is written to stdout unchanged; when a line matches, the editing command is executed on it and the result is written to stdout. The hold space can be understood as a temporary storage area that is used only for auxiliary operations.

Commands:
  h   overwrite the hold space with the contents of the pattern space
  H   append the contents of the pattern space to the hold space
  g   overwrite the pattern space with the contents of the hold space
  G   append the contents of the hold space to the pattern space
  x   exchange the contents of the pattern space and the hold space
  n   overwrite the pattern space with the line following the matched line
  N   append the line following the matched line to the pattern space, separated by a newline
  d   delete the lines in the pattern space
  D   in a multi-line pattern space, delete the text up to the first newline and restart the cycle on what remains
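Two small traces may help make the hold space concrete. This sketch assumes GNU sed; the four-line numeric input and the variable names are illustrative.

```shell
lines=$(printf '%s\n' 1 2 3 4)

# Swap each pair of lines:
#   h  copy the current line to the hold space
#   n  load the next line into the pattern space (nothing is printed because of -n)
#   G  append the held line after it
#   p  print the two-line pattern space
swapped=$(printf '%s\n' "$lines" | sed -n 'h;n;G;p' | paste -sd ' ' -)

# Join all lines with commas:
#   H  append every line to the hold space (each preceded by a newline)
#   $  on the last line: x fetches the collected text, the first s turns
#      newlines into commas, the second s drops the leading comma, p prints.
joined=$(printf '%s\n' "$lines" | sed -n 'H;${x;s/\n/,/g;s/^,//;p;}')

echo "swapped pairs: $swapped"
echo "joined: $joined"
```

In both cases the hold space only stores text between cycles; all matching and substitution still happens in the pattern space.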

Examples:
  sed -n 'n;p' FILE       show the even lines
  sed '1!G;h;$!d' FILE    show the contents of the file in reverse order
  sed '$!d' FILE          take out the last line
  sed '$!N;$!D' FILE      take out the last two lines
  sed '/^$/d;G' FILE      delete all the original blank lines, then add a blank line after every non-blank line
  sed 'n;d' FILE          show the odd lines
  sed 'G' FILE            add a blank line after each original line

Example: extract a string

    #!/bin/bash
    info="hellozimskyshenzhen"
    echo $info | sed 's/hello\(\w\+\)shenzhen/\1/'

Remarks:

\d is not supported in sed. To match digits, use [0-9]. \w, however, is supported.

In sed's standard regular expressions, the parentheses ( ), the plus sign +, and the angle brackets < > must be escaped.
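To make the escaping rule concrete, here is the same extraction written twice, once with standard and once with extended regular expressions, plus a word anchor. It is a sketch that assumes GNU sed (for \w, \+, \< and the -r option); the variable names are illustrative.

```shell
info="hellozimskyshenzhen"

# Standard regular expressions: the group and the + must be escaped.
basic=$(echo "$info" | sed 's/hello\(\w\+\)shenzhen/\1/')

# Extended regular expressions (-r): the same pattern without the backslashes.
extended=$(echo "$info" | sed -r 's/hello(\w+)shenzhen/\1/')

# \< anchors the start of a word, so the "cat" inside "concat" is left alone.
word=$(echo 'cat concat' | sed 's/\<cat/dog/g')

echo "$basic"
echo "$extended"
echo "$word"
```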

Example: determine whether a string has the specified format

    #!/bin/bash
    # determine whether the input is an integer
    if [ -n "$(echo $1 | sed -n '/^[0-9]\+$/p')" ]; then
        echo 'yes'
    else
        echo 'no'
    fi

5. The awk command

5.1. Overview

awk is named after the initials of the three authors who invented the tool. awk is a report generator, that is, a text formatting tool mainly used to produce formatted output.

5.2. Basic usage

1. Syntax

gawk [option] 'program' FILE

where program has the form: PATTERN {ACTION STATEMENTS}

The action statements inside {} can be understood as commands; the most commonly used are print and printf.

2. How awk reads a file

awk reads the file line by line and splits each line into fields according to the input delimiter. The fields are represented by the built-in variables $1, $2, and so on for processing. $0 represents the entire line.

3. Options

  -F             specify the delimiter of the input fields
  -v var=value   define a custom variable
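A minimal sketch of field splitting and the two options, using a made-up record in /etc/passwd format so that it does not depend on the local system; the variable name prefix is hypothetical.

```shell
record='root:x:0:0:root:/root:/bin/bash'

# -F: splits the line on colons; $1 and $3 are the first and third fields.
fields=$(echo "$record" | awk -F: '{print $1, $3}')

# $0 is the whole line, whatever the delimiter.
whole=$(echo "$record" | awk -F: '{print $0}')

# Without -F, fields are separated by runs of white space.
second=$(echo 'a  b   c' | awk '{print $2}')

# -v passes a value from the shell into an awk variable.
greeting=$(echo "$record" | awk -F: -v prefix='user=' '{print prefix $1}')

printf '%s\n' "$fields" "$whole" "$second" "$greeting"
```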

4. PATTERN (selects which lines to process)

  empty                   process every line of the file
  /pattern/               process the lines matched by the regular expression
  !/pattern/              the inverse of the above
  relational expression   the result is true or false, and lines for which it is false are not processed. A non-zero number or a non-empty string is true; everything else is false
  line range              giving line numbers directly is not supported; use a condition on the built-in variable NR instead. See the examples
  BEGIN/END               BEGIN{} is executed only once, before any text of the file is processed, for example to print a header. END{} is executed once after all text has been processed, for example to summarize data
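Each kind of pattern can be tried on a small stand-in for /etc/passwd. The three records below are invented for the sketch, and the line-range bounds (2 and 3) are chosen only for this input.

```shell
users=$(printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
    'alice:x:1000:1000:Alice:/home/alice:/bin/bash')

# /pattern/: only lines matching the regular expression.
regex_hit=$(printf '%s\n' "$users" | awk -F: '/nologin/{print $1}')

# !/pattern/: only lines that do not match.
negated=$(printf '%s\n' "$users" | awk -F: '!/nologin/{print $1}' | paste -sd ' ' -)

# Relational expression: third field (the UID) at least 1000.
relational=$(printf '%s\n' "$users" | awk -F: '$3>=1000{print $1}')

# Line range expressed through NR.
by_line=$(printf '%s\n' "$users" | awk -F: 'NR>=2&&NR<=3{print $1}' | paste -sd ' ' -)

# BEGIN runs once before the input, END once after it.
report=$(printf '%s\n' "$users" | awk -F: 'BEGIN{print "users:"} {n++} END{print n " total"}' | paste -sd ' ' -)

printf '%s\n' "$regex_hit" "$negated" "$relational" "$by_line" "$report"
```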

Examples:

    awk -F: '$NF=="/bin/bash"{print $1}' /etc/passwd
    awk -F: '$NF!~/bash$/{print $1}' /etc/passwd
    awk -F: '$3 ...
    awk -F: '(NR>=2&&NR ...
    awk -F: '{printf "%-15s\n", $1 $2}' /etc/passwd

5. Variables

Built-in variables (do not add $ when referencing them):
  FS         input field delimiter; defaults to white space. Set it with -v FS=...
  OFS        output field delimiter. Set it with -v OFS=...
  RS         line (record) separator on input
  ORS        line (record) separator on output
  NF         number of fields in the current line. $NF is the last column
  NR         number of records read so far; printing it gives the line number
  FNR        line number counted separately for each file when several files are processed
  FILENAME   name of the current file
  ARGC       number of arguments on the command line
  ARGV       array holding each argument on the command line

For example: awk 'BEGIN{print ARGV[0]}' /etc/fstab /etc/issue
Here ARGV[0] is awk, which is always the 0th argument. ARGV[1] is /etc/fstab and ARGV[2] is /etc/issue.

For example: awk -v FS=':' -v OFS=':' '{print $1}' /etc/passwd
This specifies a colon as the input delimiter, the same as awk -F: ...
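A sketch showing several of the built-in variables at once. The inputs are invented; the two file names passed to the last command are only command-line arguments and are never opened, because a program consisting solely of a BEGIN block reads no input.

```shell
# FS splits the input on ":"; OFS joins the printed items with "-".
# NR is the line number, NF the field count, $NF the last field.
summary=$(printf '%s\n' 'a:b:c' 'd:e' |
    awk -v FS=':' -v OFS='-' '{print NR, NF, $1, $NF}' | paste -sd ' ' -)

# RS changes what counts as a "line" on input: here records end at ";".
records=$(printf 'x;y;z' | awk -v RS=';' '{print NR ":" $0}' | paste -sd ' ' -)

# ARGC counts the arguments including awk itself; ARGV holds them.
args=$(awk 'BEGIN{print ARGC, ARGV[1], ARGV[2]}' /etc/fstab /etc/issue)

printf '%s\n' "$summary" "$records" "$args"
```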

Custom variables:
  Method 1: -v var=value (variable names are case sensitive)
  Method 2: define the variable directly in the program

For example:

    awk -v test='hello' 'BEGIN{print test}'
    awk 'BEGIN{test="hello"; print test}'

6. Commonly used ACTION commands

print

Format: print item1, item2, ...

Notes: use commas as separators between items; an output item can be a string, a built-in variable, or an awk expression; if the items are omitted, the entire line $0 is printed.

printf

Formatted output: printf FORMAT, item1, item2, ...

Notes: FORMAT must be given; printf does not add a newline automatically, so write \n explicitly when you need one; FORMAT needs one format character for each following item, matched by position.
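The difference between print and printf is easiest to see on the same input. A small sketch with an invented two-field record:

```shell
row='alice 30'

# print with commas: items are joined with OFS (set to "|" here).
with_commas=$(echo "$row" | awk -v OFS='|' '{print $1, $2}')

# print without commas: the items are simply concatenated.
concatenated=$(echo "$row" | awk '{print $1 $2}')

# printf: one format character per item, and an explicit \n.
formatted=$(echo "$row" | awk '{printf "%s is %d\n", $1, $2}')

# printf adds no newline of its own, so two input lines end up on one output line.
no_newline=$(printf '%s\n' a b | awk '{printf "%s", $1}')

printf '%s\n' "$with_commas" "$concatenated" "$formatted" "$no_newline"
```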

Expressions

Control statements:
  if (condition) {statements}
  if (condition) {statements} else {statements}
  while (condition) {statements}
  do {statements} while (condition)
  for (expr1; expr2; expr3) {statements}
  break
  continue
  delete array[index]
  delete array            delete the entire array
  exit
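A few of these statements in action. The sketch runs mostly inside BEGIN blocks so that no input file is needed; all variable names and numbers are illustrative.

```shell
# for, if, continue, break: add up the even numbers from 2 to 8.
even_sum=$(awk 'BEGIN {
    total = 0
    for (i = 1; i <= 10; i++) {
        if (i % 2 == 1) continue    # skip odd numbers
        if (i > 8) break            # stop before reaching 10
        total += i
    }
    print total
}')

# while: count how many halvings bring 40 down to 5.
halvings=$(awk 'BEGIN {
    n = 40
    steps = 0
    while (n > 5) {
        n = n / 2
        steps++
    }
    print steps
}')

# delete array[index]: remove one element, then count what is left.
remaining=$(awk 'BEGIN {
    seen["a"] = 1
    seen["b"] = 1
    delete seen["a"]
    n = 0
    for (k in seen) n++
    print n
}')

# exit: stop reading input as soon as the value 2 is seen.
before_exit=$(printf '%s\n' 1 2 3 | awk '{ if ($1 == 2) exit; print }')

printf '%s\n' "$even_sum" "$halvings" "$remaining" "$before_exit"
```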

Compound statements: combining statements

Input statements: input statement

Output statements

Format characters:
  %c   display a character (a numeric argument is taken as its ASCII code)
  %d   display a decimal integer
  %e   display in scientific notation
  %f   display as floating point
  %g   display floating point in the shorter of the %e and %f forms
  %s   display a string
  %u   display an unsigned integer
  %%   display % itself

Modifiers:
  #[.#]   the first number controls the display width; the second number is the precision after the decimal point (for floating-point numbers)
  -       left-align (output is right-aligned by default), for example %-15s
  +       show the positive or negative sign of a number

Operators:
  arithmetic operators: +, -, *, /; +x converts a string to a numeric value; -x changes it to a negative number
  string operator: string concatenation (no operator symbol)
  assignment operators: =, +=, -=, /=, ++, --
  comparison operators: >, >=, <, <=, ==, !=
  pattern matching:
    ~    whether the string on the left is matched by the pattern on the right
    !~   whether the string on the left is not matched by the pattern on the right
  logical operators: && (and), || (or), ! (not)
  function calls: function_name(arg1, arg2, ...)
  conditional expression: selector ? true_exp : false_exp (the same as the ternary operator)
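The modifiers and a few of the operators can be checked with a short sketch; the values are invented and the variable names are illustrative.

```shell
# Width, alignment, precision and sign modifiers in one format string.
padded=$(awk 'BEGIN { printf "[%-6s][%6s][%6.2f][%+d]\n", "ab", "ab", 3.14159, 7 }')

# %c with a number prints the character that has that ASCII code.
letter=$(awk 'BEGIN { printf "%c\n", 65 }')

# ~ matches a field against a pattern; && combines comparisons;
# ?: picks one of two values.
verdict=$(echo 'daemon 1' | awk '{
    a = ($1 ~ /^dae/) ? "match" : "no match"
    b = ($2 >= 1 && $2 < 10) ? "small" : "other"
    print a, b
}')

printf '%s\n' "$padded" "$letter" "$verdict"
```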

Operation example

    # generally, content that does not depend on the input lines is printed in the BEGIN and END blocks
    awk -v begin="hello" -v end="ok" -F: 'BEGIN{print begin};{print $1,$NF};END{print end}' /etc/passwd

5.3. Advanced awk usage and examples

Commonly used built-in variables

  $1    the first column
  $NF   the last column
  NR    the line number

Commonly used conditions

1) /specified content/

This matches the lines that contain the "specified content". No $# field is named in the condition. Plain text is recommended here rather than an elaborate regular expression, although there are exceptions.

    awk -F: '/nologin/{print $0}' /etc/passwd    # matches the lines containing the nologin keyword
    seq 100 | awk '/1/{print $1}'

2) $#=="specified content"

In this form, column # must equal the specified content exactly.

    awk -F: '$1=="bin"{print $0}' /etc/passwd

3) $#~/specified content/

This form matches the specified column against the content as a regular expression (a fuzzy match) and selects the rows that match.

    awk -F: '$1~/dae/{print $1}' /etc/passwd     # forward selection
    awk -F: '$1!~/dae/{print $1}' /etc/passwd    # reverse selection

4) Value comparison

Use >, >=, <, <=, ==, or != to compare values.

    awk -F: '$3>=10{print $1}' /etc/passwd

5) Logical judgment

Use && and || to combine conditions.
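A small sketch of both operators on invented name:value pairs; the bounds 5 and 10 are chosen only for this illustration.

```shell
pairs=$(printf '%s\n' 'a:3' 'b:7' 'c:12')

# &&: both conditions must hold (value from 5 up to, but not including, 10).
in_range=$(printf '%s\n' "$pairs" | awk -F: '$2>=5 && $2<10 {print $1}')

# ||: either condition is enough (value below 5 or above 10).
out_of_range=$(printf '%s\n' "$pairs" | awk -F: '$2<5 || $2>10 {print $1}' | paste -sd ' ' -)

echo "in range: $in_range"
echo "out of range: $out_of_range"
```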

    awk -F: '$3>=5 && $3 ...

6) if condition judgment

    awk -F: '{if($NF~/nologin$/){i++}else{j++}};END{print i, j}' /etc/passwd    # note: the if-else statement goes inside {}

7) Using dictionaries

awk supports arrays that work like dictionaries, which are useful for statistics.
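As a self-contained sketch of the counting idea, the three log lines below are invented, and the output is piped through sort because awk does not guarantee the order in which for (key in array) visits the keys.

```shell
log=$(printf '%s\n' '10.0.0.1 GET /' '10.0.0.2 GET /a' '10.0.0.1 GET /b')

# hits[$1]++ uses the first column as the key and counts its occurrences;
# the END block walks over every key once the input is exhausted.
counts=$(printf '%s\n' "$log" |
    awk '{ hits[$1]++ } END { for (ip in hits) print ip, hits[ip] }' | sort)

echo "$counts"
```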

    awk '{ip[$1]++};END{for(i in ip){print i, ip[i]}}' access.log
    # Explanation: the first column (the ip) is used as the dictionary key, and its value is incremented by 1 each time the same ip occurs, which counts the occurrences of every ip.
    # The for loop fetches each key of the dictionary and prints it with print. Note how the curly-brace blocks are kept separate.

    # QQ level duration
    # Statistical level (30 ...
    # 1234 1223
    # 1234 10 122
    # 1233 92 4212
    # 1233 42 4252
    # 1239 87 2313
    # 1233 56 1121
    # 1231 19 45
    # 1235 45679
    cat data | awk '$2>=30 ...

Thank you for reading this article on how to use the Linux grep, sed, and awk commands. I hope it has given you a working understanding of all three.
