What are the commands for text processing in Linux

2025-04-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces the commands commonly used for text processing in Linux: awk, sed, and grep. Each section gives a short explanation followed by examples that you can adapt to your own files.

Awk

Basic concept

awk treats a file (or another input stream, such as redirected input) as a set of records: each line is a record, and each string separated by whitespace (a space or \t, or a user-specified delimiter) is a field. This makes a document look like a database table, although awk still processes the input one line at a time. The examples below use the following file, named s.txt:

The code is as follows:

Zhangsan 1977 male computer 83

Lisi 1989 male math 99

Wanglijiang 1990 female chinese 78

Xuliang 1977 male economic 89

Xuxin 1986 female english 99

Wangxuebing 1978 male math 89

Lichang 1989 male math 99

Wanglijiang 1990 female chinese 78

Zhangsansan 1977 male computer 83

Langxuebing 1978 male math 89

Lisibao 1989 male math 99

Xiaobao 1990 female chinese 78

The five fields in a line represent name, year of birth, gender, subject and score, which is a very traditional and typical report file.
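As a minimal sketch of how each line is split into fields, the following writes the first three rows of s.txt into a scratch directory and prints the name, subject, and score of every record. The scratch directory created with mktemp and the variable name field_report are assumptions made for illustration only.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# First three rows of the article's s.txt.
cat > s.txt <<'EOF'
Zhangsan 1977 male computer 83
Lisi 1989 male math 99
Wanglijiang 1990 female chinese 78
EOF

# $1 is the name, $4 the subject, and $5 the score of each record.
field_report=$(awk '{print $1, $4, $5}' s.txt)
printf '%s\n' "$field_report"
```

Because no pattern is given, the action runs for every record.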

Basic awk syntax: awk 'pattern1 {command1; command2; ... ; command3} pattern2 {command ...}' file

The pattern filters records. It can be a regular expression, a relational expression, or empty (in which case all records are selected).

Each record selected by the pattern is processed by the commands enclosed in curly braces; the commands are separated by semicolons. The braces may be omitted entirely, in which case the default action is print, which outputs the whole record. A command can be output, an arithmetic operation, a logical operation, loop control, and so on.

First look at a few examples to establish an intuitive understanding of awk commands.

The code is as follows:

awk '/1990/' s.txt # print the records of students born in 1990

awk '/chinese/ {print "Chinese"; print "Chinese"}' s.txt # print two lines of "Chinese" for each record of the chinese course

awk '20 > 1 {print "Yes"}' s.txt # print Yes once per line, because 20 > 1 is always true

awk 'BEGIN {print "Result of the quiz:\n"} {print} END {print "--"}' s.txt

In the last example there are three pairs of curly braces, corresponding to three patterns. BEGIN and END are special patterns: the BEGIN action runs before the first record is read, and the END action runs after the last record has been processed.
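The order in which the three blocks run can be sketched as follows. This is an illustrative example rather than one of the article's own commands: the running total sum, the scratch directory, and the variable name summary are assumptions.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# First four rows of the article's s.txt.
cat > s.txt <<'EOF'
Zhangsan 1977 male computer 83
Lisi 1989 male math 99
Wanglijiang 1990 female chinese 78
Xuliang 1977 male economic 89
EOF

# BEGIN runs once before any record, the middle block once per record,
# and END once after the last record (NR then holds the record count).
summary=$(awk 'BEGIN {print "Scores:"}
  {sum += $5; print $1, $5}
  END {printf "average: %.2f\n", sum / NR}' s.txt)
printf '%s\n' "$summary"
```

The header is printed first and the average last, regardless of how many records the file contains.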

Variables: the commands mentioned above can include arithmetic and other operations, and where there are operations there are constants and variables. awk lets you define your own variables (they do not need to be declared in advance, but it is best to initialize them in BEGIN). awk also maintains a set of built-in variables:

Variable          Description
$0                Current record
$1, $2, ... $n    Fields of the current record
FILENAME          Current file name
FS                Input field separator; can be changed with -F
NF                Number of fields in the current record
NR                Current record number
OFS               Output field separator
ORS               Output record separator
RS                Input record separator; defaults to the newline character

For example, to change the input field separator, replace the spaces with | through sed and then pipe the result to awk:

sed 's/ /|/g' s.txt | awk -F '|' '/chinese/ {print FILENAME, $1, $5}'
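A small sketch of several built-in variables working together, assuming the same s.txt layout as above. The choice of a comma for OFS, the scratch directory, and the variable name var_report are illustrative assumptions.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > s.txt <<'EOF'
Zhangsan 1977 male computer 83
Lisi 1989 male math 99
Wanglijiang 1990 female chinese 78
EOF

# -F '|' sets FS for the input; OFS controls what print puts between
# its comma-separated arguments. NR is the record number, NF the field count.
var_report=$(sed 's/ /|/g' s.txt |
  awk -F '|' 'BEGIN {OFS = ","} /chinese/ {print NR, NF, $1, $5}')
printf '%s\n' "$var_report"
```

Only the third record contains chinese, so a single comma-separated line is printed.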

The following examples illustrate the use of these variables:

The code is as follows:

awk '$4 == "chinese" {print NR, $1, $4, $5}' s.txt # for records whose fourth field (subject) is chinese, print the record number, student name, subject, and score

awk '$2 ~ /1990/ {print $1}' s.txt # print the names of students born in 1990; ~ means the field matches the regular expression

awk '$2 !~ /1990/ {print $1}' s.txt # print the names of students not born in 1990; !~ means the field does not match the regular expression

awk '$2 > "1985" {print $1, $2}' s.txt # print the names and birth years of students born after 1985

awk 'END {print "total:" NR "\n-"}' s.txt

awk 'BEGIN {goodChinese=0; goodMath=0} ($4 == "chinese" || $5 > 90) {goodChinese++} END {print ""}' s.txt
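The last command initializes two counters in BEGIN. One plausible way to use them is sketched below: it counts the students who scored above 90 in each subject and prints both totals in END. Combining the conditions with &&, the separate rule for math, and the output labels are assumptions about the intent, not something the article states.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Four rows taken from the article's s.txt.
cat > s.txt <<'EOF'
Lisi 1989 male math 99
Wanglijiang 1990 female chinese 78
Wangxuebing 1978 male math 89
Lichang 1989 male math 99
EOF

# Each rule has its own pattern, so a record can increment at most
# the counter that belongs to its subject.
counts=$(awk 'BEGIN {goodChinese = 0; goodMath = 0}
  $4 == "chinese" && $5 > 90 {goodChinese++}
  $4 == "math" && $5 > 90 {goodMath++}
  END {print "goodChinese:", goodChinese; print "goodMath:", goodMath}' s.txt)
printf '%s\n' "$counts"
```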

Sed

sed '2,5d' file prints file without lines 2-5; no error is reported if the range extends past the last line of the file.

sed '/10[1-4]/d' file prints file without the lines that contain 101 through 104.

sed '2,$d' file prints only the first line of the file. sed '2,$!d' file prints every line except the first.

sed '/^ *$/d' file removes blank lines from the file.

sed -n '/10[1-4]/p' file

Prints only the lines of file that contain 101 through 104. (-n and p must be used together; with p alone, sed prints the whole file and prints each matching line a second time.)

sed -n '5p' file prints only the fifth line of the file.
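The deleting and printing commands above can be tried on a small file. In this sketch the file name data.txt, its five lines, and the variable names are hypothetical.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Five lines; the third one is blank.
printf '%s\n' '100 alpha' '101 beta' '' '103 gamma' '105 delta' > data.txt

without_2_to_4=$(sed '2,4d' data.txt)        # drop lines 2-4
only_first=$(sed '2,$d' data.txt)            # keep only line 1
matching=$(sed -n '/10[1-4]/p' data.txt)     # print lines with 101..104
fifth=$(sed -n '5p' data.txt)                # print line 5 only
non_blank=$(sed '/^ *$/d' data.txt | wc -l)  # count lines left after removing blanks

printf '%s\n' "$without_2_to_4" "$only_first" "$matching" "$fifth"
```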

sed 's/moding/moden/g' file replaces moding with moden.

sed -n 's/^west/north/p' file replaces west at the beginning of a line with north and prints the changed lines.

sed 's/[0-9][0-9][0-9]$/&.5/' file replaces the three digits at the end of a line with the same digits followed by ".5"; & stands for the matched string.

sed 's/\(mod\)ing/\1en/g' file captures mod as group 1 in escaped parentheses and reuses it in the replacement as \1.

sed 's/...$//' file deletes the last three characters of each line.

sed 's/^...//' file deletes the first three characters of each line.

sed 's#moding#moden#g' file replaces moding with moden; the # after s is the delimiter between the search string and the replacement string.
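A short sketch of these substitutions on a two-line input. The file name data.txt, its contents, and the variable names are hypothetical.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'moding 123\nwest side\n' > data.txt

grouped=$(sed 's/\(mod\)ing/\1en/g' data.txt)          # \1 reuses the captured "mod"
suffixed=$(sed 's/[0-9][0-9][0-9]$/&.5/' data.txt)     # & is the matched text
hashed=$(sed 's#moding#moden#g' data.txt)              # same substitution, # as delimiter
trimmed=$(sed 's/^...//' data.txt)                     # drop the first three characters
renamed=$(sed -n 's/^west/north/p' data.txt)           # print only the changed lines

printf '%s\n' "$grouped" "$suffixed" "$hashed" "$trimmed" "$renamed"
```

Choosing a delimiter such as # is mostly useful when the pattern itself contains slashes, for example in file paths.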

sed -n '/101/,/105/p' file prints from the line matching 101 through the next line matching 105. If only a line matching 101 is found, it prints from that line to the end of the file.

sed -n '2,/999/p' file prints from the second line through the next line matching 999.

sed '/101/,/105/s/$/ 20050119/' file appends " 20050119" to the end of every line from the line matching 101 through the line matching 105.

sed -e '1,3d' -e 's/moding/moden/g' file deletes lines 1-3 of the file and then performs the replacement.

sed -e '/^#/!d' file prints only the lines of the file that begin with #.
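Address ranges and multiple -e expressions can be checked with a five-line file. The file name data.txt, its contents, the replacement of a leading 10 with #, and the variable names are hypothetical.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 100 101 103 105 107 > data.txt

in_range=$(sed -n '/101/,/105/p' data.txt)            # from the 101 line to the 105 line
from_line_2=$(sed -n '2,/103/p' data.txt)             # from line 2 to the 103 line
stamped=$(sed '/101/,/105/s/$/ 20050119/' data.txt)   # append text inside the range
two_steps=$(sed -e '1,3d' -e 's/^10/#/' data.txt)     # delete first, then substitute

printf '%s\n' "$in_range" "$from_line_2" "$stamped" "$two_steps"
```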

sed '/101/r newfile' file appends the contents of the file newfile after each matching line.

sed '/101/w newfile' file writes the matching lines to newfile.

sed '/101/a new text' file adds a new line after each matching line.

sed '/101/i new text' file adds a new line before each matching line.

sed '/101/c new text' file replaces each matching line with the new line.

The one-line forms of a, i, and c shown here are a GNU sed extension; POSIX sed expects a\, i\, or c\ followed by the text on the next line.

sed 'y/abcd/ABCD/' file replaces a, b, c, and d with A, B, C, and D respectively.

sed '5q' file prints up to line 5 and then exits.
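The a, i, c, y, and q commands can be sketched as follows. The example uses the form with a backslash and a newline for a, i, and c; the file name data.txt, its contents, and the variable names are hypothetical.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 100 101 102 > data.txt

# a\ adds the text after the matching line.
appended=$(sed '/101/a\
new text' data.txt)

# i\ adds the text before the matching line.
inserted=$(sed '/101/i\
new text' data.txt)

# c\ replaces the matching line with the text.
changed=$(sed '/101/c\
new text' data.txt)

# y maps characters one to one; q quits after printing the addressed line.
translated=$(printf 'abcde\n' | sed 'y/abcd/ABCD/')
first_two=$(sed '2q' data.txt)

printf '%s\n' "$appended" "$inserted" "$changed" "$translated" "$first_two"
```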

sed '/101/{n; s/moding/moden/g;}' file finds the line after each matching line (n) and performs the replacement on it.

sed '/101/{s/moding/moden/g; q;}' file finds the first matching line in the file, performs the replacement on it, and then exits.

sed -e '/101/{h; d;}' -e '/104/{G;}' file stores each line matching 101 in the hold space and deletes it, then appends the stored line after the line matching 104.

sed -e '/101/{h; d;}' -e '/104/{g;}' file stores each line matching 101 in the hold space and deletes it, then replaces the line matching 104 with the stored line.

sed -e '/101/h' -e '$G' file appends a copy of the last matching line to the end of the file.

sed -e '/101/h' -e '$g' file replaces the last line of the file with the last matching line.

sed -e '/101/h' -e '/104/x' file stores the line matching 101 in the hold space and then exchanges it with the line matching 104.

echo -ltr 1.txt | sed 's/^.* //' deletes everything up to the last space, leaving only the file name.
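The hold-space commands (h, G, g, x) are easier to follow with concrete output. This is a sketch on a hypothetical five-line file data.txt; the variable names are also illustrative.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 100 101 102 104 105 > data.txt

# h copies the 101 line to the hold space, d deletes it from the output,
# and G appends the held line after the 104 line.
moved=$(sed -e '/101/{h;d;}' -e '/104/{G;}' data.txt)

# g overwrites the 104 line with the held line instead of appending.
replaced=$(sed -e '/101/{h;d;}' -e '/104/{g;}' data.txt)

# $G appends the held line after the last line of the input.
copied_to_end=$(sed -e '/101/h' -e '$G' data.txt)

# x swaps the pattern space and the hold space at the 104 line.
swapped=$(sed -e '/101/h' -e '/104/x' data.txt)

printf '%s\n' "$moved" "$replaced" "$copied_to_end" "$swapped"
```

Note that x prints the held 101 line in place of the 104 line; the 104 line itself stays in the hold space and is not printed unless a later command retrieves it.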

Grep

Common grep options

-c outputs only the count of matching lines. This is sometimes very useful, because you do not need to pipe the output to wc -l.

-i ignores case.

-h does not display file names when querying multiple files.

-l outputs only the names of the files that contain matching text when querying multiple files.

-n displays matching lines and line numbers.

-s does not display error messages about nonexistent or unreadable files.

-v displays all lines that do not contain matching text.

Examples

The code is as follows:

grep -v "Sort" tab2

Show all lines that do not contain matching text

The code is as follows:

grep -n "Sort" tab2

Display matching lines and line numbers

The code is as follows:

grep -c "Sort" tab2

Output only the count of matching lines

Exact match (\< and \> anchor the pattern at the start and end of a word):

The code is as follows:

grep "\<01\>" tab2

The code is as follows:

grep -in "code" tab2

Ignore case and display line numbers

Multiple filtering

The code is as follows:

grep -in "code" tab2 | grep "02"
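The article does not show the contents of tab2, so the following sketch uses a hypothetical three-line tab2 and a hypothetical second file tab1 to make the effect of each option visible. The variable names are illustrative as well.

```shell
# Work in a scratch directory so no real file is touched (assumption).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical file contents; the article's real tab2 is not shown.
printf '%s\n' '00 nothing here' > tab1
printf '%s\n' '01 Sort code' '02 merge CODE' '03 Sort data' > tab2

count=$(grep -c "Sort" tab2)                   # number of matching lines
numbered=$(grep -n "Sort" tab2)                # matching lines with line numbers
inverted=$(grep -v "Sort" tab2)                # lines that do not match
files=$(grep -l "Sort" tab1 tab2)              # names of files with a match
chained=$(grep -in "code" tab2 | grep "02")    # filter the output a second time

printf '%s\n' "$count" "$numbered" "$inverted" "$files" "$chained"
```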

In addition, the grep family also includes fgrep and egrep. fgrep is fixed grep (equivalent to grep -F): it searches for fixed strings instead of patterns, which makes it fast and suitable for retrieving large amounts of data. egrep is extended grep (equivalent to grep -E): it supports extended regular expressions, in which () and | are available without backslashes, but it does not support the backslash-escaped range notation \{ \} of basic regular expressions or some of the corresponding stricter patterns.

The code is as follows:

echo aAA123bbb | egrep '[0-9]*'

The code is as follows:

echo AAA123bbb | egrep -i '^a'
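The difference between fixed-string and extended matching can be sketched with grep -F and grep -E, the option forms of fgrep and egrep. The sample inputs and variable names below are hypothetical.

```shell
# grep -F treats the pattern as a literal string, so the dot is not a wildcard.
fixed=$(printf 'a.c\nabc\n' | grep -F 'a.c')

# Plain grep treats the dot as "any character", so both lines match.
regex=$(printf 'a.c\nabc\n' | grep 'a.c')

# grep -E supports alternation with | and intervals with {n} unescaped.
either=$(printf 'cat\ndog\nbird\n' | grep -E 'cat|dog')
interval=$(echo AAA123bbb | grep -E '[0-9]{3}')

# -i combines with -E as it does with plain grep.
nocase=$(echo AAA123bbb | grep -Ei '^a')

printf '%s\n' "$fixed" "$regex" "$either" "$interval" "$nocase"
```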

This is the end of "What are the commands for text processing in Linux". Thank you for reading.
