Introduction and use of text processing tools in shell scripts

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article gives a detailed introduction to the text processing tools commonly used in shell scripts. You probably use most of them often already, so this summary collects their main options and examples in one place to give you a deeper understanding of how they work.

1. grep tool

grep is a line filtering tool: it prints the lines of a file that match a keyword or pattern.

Syntax and options

Syntax:

# grep [options] 'keyword' filename

Common options:

-i: case-insensitive
-v: invert the match; print the lines that do not contain the specified content
-w: match whole words only
-o: print only the matched keyword
-c: count the number of matching lines
-n: display line numbers
-r: search directories recursively, level by level
-A N: show the matching line and the N lines after it
-B N: show the matching line and the N lines before it
-C N: show the matching line and the N lines before and after it
-l: list only the names of files that contain a match
-L: list only the names of files that contain no match
-e: specify the search pattern (a regular expression)
-E: use extended regular expressions
^key: lines that start with the keyword
key$: lines that end with the keyword
^$: blank lines
--color=auto: highlight the matched keyword in color
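A few of these options can be tried safely on a throwaway file. The following is only a minimal sketch: the file name sample.txt and its contents are invented for illustration, and GNU grep is assumed.

```shell
#!/usr/bin/env bash
# Work in a temporary directory; sample.txt and its contents are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'ROOT backup account' \
    'ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin' \
    '' \
    'ftpuser:x:1001:1001::/home/ftpuser:/bin/bash' > sample.txt

grep -i 'root' sample.txt     # case-insensitive: matches both "root" and "ROOT"
grep -w 'ftp' sample.txt      # whole word only: the "ftpuser" line is not printed
grep -c 'bash$' sample.txt    # count the lines that end with "bash"
grep -n '^$' sample.txt       # show blank lines together with their line numbers
grep -v '^root' sample.txt    # every line that does not start with "root"
```

Quoting the pattern keeps the shell from interpreting characters such as $ and ^ before grep sees them.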

Color display (alias settings):

Temporary setting (effective only for the current terminal and the current user):
# alias grep='grep --color=auto'

Permanent settings:
1) Global (effective for all users)
# vim /etc/bashrc
alias grep='grep --color=auto'
# source /etc/bashrc
2) Local (for a specific user)
# vim ~/.bashrc
alias grep='grep --color=auto'
# source ~/.bashrc

Examples are as follows:

Note: do not experiment on the /etc/passwd file directly; copy it to /tmp first.

# grep -i root passwd      ignore case; match lines containing root
# grep -w ftp passwd      match ftp as a whole word
# grep -w hello passwd      match hello as a whole word (add some lines containing hello to the file first)
# grep -wo ftp passwd      print only the matched keyword ftp
# grep -n root passwd      print lines matching root, with line numbers
# grep -ni root passwd      ignore case; print lines containing root, with line numbers
# grep -nic root passwd      ignore case; count the lines containing root
# grep -i '^root' passwd      ignore case; match lines beginning with root
# grep 'bash$' passwd      match lines ending with bash
# grep -n '^$' passwd      match blank lines and print their line numbers
# grep '^#' /etc/vsftpd/vsftpd.conf      match lines that begin with the # sign
# grep -v '^#' /etc/vsftpd/vsftpd.conf      match lines that do not begin with the # sign
# grep -A 5 mail passwd      match lines containing mail, plus the 5 lines after each
# grep -B 5 mail passwd      match lines containing mail, plus the 5 lines before each
# grep -C 5 mail passwd      match lines containing mail, plus the 5 lines before and after each

2. cut tool

cut is a column extraction tool: it cuts selected columns (fields or characters) out of each line.

Syntax and options

Syntax:

# cut [options] filename

Common options:

-c: select by character position
-d: custom delimiter; the default is tab (\t)
-f: used with -d to specify which fields to extract
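As a rough illustration of how -d, -f and -c interact, the sketch below runs cut on a small passwd-style file. The file users.txt and its two lines are made up for this example.

```shell
#!/usr/bin/env bash
# users.txt is a hypothetical passwd-style file created only for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'daemon:x:2:2:daemon:/sbin:/sbin/nologin' > users.txt

cut -d: -f1 users.txt        # field 1 (the user name) of each line
cut -d: -f1,6,7 users.txt    # fields 1, 6 and 7, still joined by ":"
cut -c1-4 users.txt          # characters 1 to 4 of each line
cut -c5- users.txt           # from the 5th character to the end of the line
```

Note that -f counts delimiter-separated fields while -c counts raw character positions, so the two are rarely mixed in one call.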

Examples are as follows:

# cut -d: -f1 1.txt      split on the colon and extract column 1
# cut -d: -f1,6,7 1.txt      split on the colon and extract columns 1, 6 and 7
# cut -c4 1.txt      extract the 4th character of each line in the file
# cut -c1-4 1.txt      extract characters 1 to 4 of each line in the file
# cut -c4-10 1.txt      extract characters 4 to 10 of each line in the file
# cut -c5- 1.txt      extract everything from the 5th character onward

3. sort tool

The sort tool sorts the lines of a file. It treats each line as a unit, compares lines character by character starting from the first character according to their ASCII code values, and outputs the result in ascending order.

Syntax and options

-u: remove duplicate lines
-r: sort in descending order; the default is ascending
-o: write the sorted result to a file, similar to the redirect symbol >
-n: sort numerically; the default is to sort by character
-t: field delimiter
-k: sort on column N
-b: ignore leading spaces
-R: random sort; the result of each run is different
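The difference between character order and numeric order is easiest to see on a small input. This is a minimal sketch; nums.txt, users.txt and sorted.txt are hypothetical files created just for the demonstration.

```shell
#!/usr/bin/env bash
# nums.txt, users.txt and sorted.txt are hypothetical names used only here.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 10 9 100 9 2 > nums.txt
printf '%s\n' 'bob:x:1002' 'root:x:0' 'alice:x:1001' > users.txt

sort nums.txt                    # character order: 10 100 2 9 9
sort -n nums.txt                 # numeric order:   2 9 9 10 100
sort -nu nums.txt                # numeric order, duplicates removed
sort -nr nums.txt                # numeric order, descending
sort -n nums.txt -o sorted.txt   # write the result to sorted.txt instead of the screen
sort -n -t: -k3 users.txt        # split on ":" and sort by the 3rd column as a number
```

Unlike a plain > redirect, -o can safely name the input file itself as the output, because sort reads all input before writing.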

Examples are as follows:

# sort -n -t: -k3 1.txt      sort by the user's uid in ascending order
# sort -nr -t: -k3 1.txt      sort by the user's uid in descending order
# sort -n 2.txt      sort numerically
# sort -nu 2.txt      sort numerically and remove duplicates
# sort -nr 2.txt
# sort -nru 2.txt
# sort -n 2.txt -o 3.txt      sort numerically and write the result to a file
# sort -R 2.txt
# sort -u 2.txt

4. uniq tool

uniq removes consecutive duplicate lines; repeated lines that are not adjacent are left alone.
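Because only adjacent repeats are collapsed, uniq is usually fed sorted input. The sketch below shows the difference on a made-up file, fruits.txt; the name and contents are purely illustrative.

```shell
#!/usr/bin/env bash
# fruits.txt is a hypothetical file with both adjacent and non-adjacent repeats.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' apple apple banana apple banana banana > fruits.txt

uniq fruits.txt              # only adjacent repeats collapse: apple banana apple banana
sort fruits.txt | uniq       # sorting first makes every repeat adjacent: apple banana
sort fruits.txt | uniq -c    # prefix each line with the number of times it occurred
sort fruits.txt | uniq -d    # print only the lines that occur more than once
```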

Common options:

-i: ignore case
-c: count the number of times each line is repeated
-d: show only duplicated lines

Examples:

# uniq 2.txt
# uniq -d 2.txt
# uniq -dc 2.txt

5. tee tool

The tee tool reads from standard input and writes to both standard output and files. In other words, it redirects in two directions at once (output on the screen | content into the file), overwriting the file by default.
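The overwrite behaviour, and its append counterpart -a, can be checked with a short experiment. This is a sketch only; out.txt, copy1.txt and copy2.txt are hypothetical file names.

```shell
#!/usr/bin/env bash
# out.txt, copy1.txt and copy2.txt are hypothetical names used for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo 'first run' | tee out.txt      # shown on screen and written to out.txt
echo 'hello world' | tee out.txt    # same file again: the old content is overwritten
echo '999' | tee -a out.txt         # -a appends instead of overwriting
cat out.txt                         # holds "hello world" and "999"

# tee accepts several files; discard the screen copy when only the files matter
echo 'saved twice' | tee copy1.txt copy2.txt > /dev/null
```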

Option:

-a: append in both directions, that is, append to the file instead of overwriting it

# echo hello world
# echo hello world | tee file1
# cat file1
# echo 999 | tee -a file1
# cat file1

6. diff tool

The diff tool compares files line by line and reports their differences.

Note: diff describes the difference between two files by telling us how to change the first file so that it matches the second.

Syntax and options

Syntax:

# diff [options] file1 file2

Common options:

Option      Meaning                                      Remarks
-b          ignore changes in the amount of spaces
-B          ignore blank lines
-i          ignore case
-w          ignore all spaces
--normal    normal format display                        default
-c          context format display
-u          unified (merge) format display

Examples are as follows:

Comparing two ordinary files. File preparation:

[root@MissHou ~]# cat file1
aaaa
111
hello world
222
333
bbb
[root@MissHou ~]#
[root@MissHou ~]# cat file2
aaa
hello
111
222
bbb
333
world

1) normal display

Purpose of diff: how to change file1 so that it matches file2

[root@MissHou ~]# diff file1 file2
1c1,2              line 1 of the first file must be changed (c = change) to match lines 1 to 2 of the second file
< aaaa             the less-than sign "<" marks content from the left file (file1)
---                separator
> aaa              the greater-than sign ">" marks content from the right file (file2)
> hello
3d3                line 3 of the first file must be deleted (d = delete) to match up with line 3 of the second file
< hello world
5d4                line 5 of the first file must be deleted to match up with line 4 of the second file
< 333
6a6,7              content must be added (a = add) after line 6 of the first file to match lines 6 to 7 of the second file
> 333              the content to be added, taken from the second file, is 333 and world
> world
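To reproduce the listing above without touching real files, the two sample files can be recreated in a temporary directory. The sketch below assumes GNU diff; the file names file1 and file2 follow the article, while the temporary directory and the status variable are additions for illustration.

```shell
#!/usr/bin/env bash
# Recreate the article's file1 and file2 inside a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' aaaa 111 'hello world' 222 333 bbb > file1
printf '%s\n' aaa hello 111 222 bbb 333 world > file2

diff file1 file2                   # normal format, as in the listing above
status=$?
echo "diff exit status: $status"   # 0 = identical, 1 = files differ, 2 = trouble

# Keep only the change commands (lines such as 1c1,2) and drop the content lines
diff file1 file2 | grep -E '^[0-9]'
```

The exit status is what makes diff useful inside scripts: a test such as `if diff -q a b > /dev/null` needs no parsing of the output at all.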

2) context format display

[root@MissHou ~]# diff -c file1 file2
The first two lines list the names and timestamps of the files being compared; *** in front of a file name marks file1, --- marks file2.
*** file1 2019-04-16 16:26:05.748650262 +0800
--- file2 2019-04-16 16:26:30.470646030 +0800
***************      this line is the delimiter
*** 1,6 ****         a line beginning with *** refers to file1; 1,6 means lines 1 to 6
! aaaa               ! means the line must be modified to match the second file
  111
- hello world        - means the line must be deleted to match the second file
  222
- 333                - means the line must be deleted to match the second file
  bbb
--- 1,7 ----         a line beginning with --- refers to file2; 1,7 means lines 1 to 7
! aaa                the first file must be modified to match this line of the second file
! hello              the first file must be modified to match this line of the second file
  111
  222
  bbb
+ 333                the first file needs this line added to match the second file
+ world              the first file needs this line added to match the second file

3) unified (merge) format display

[root@MissHou ~]# diff -u file1 file2
The first two lines list the names and timestamps of the files being compared; --- in front of a file name marks file1, +++ marks file2.
--- file1 2019-04-16 16:26:05.748650262 +0800
+++ file2 2019-04-16 16:26:30.470646030 +0800
@@ -1,6 +1,7 @@
-aaaa
+aaa
+hello
 111
-hello world
 222
-333
 bbb
+333
+world

Comparing two directories. By default, the contents of files with the same name in the two directories are compared as well:

[root@MissHou tmp]# diff dir1 dir2
diff dir1/file1 dir2/file1
0a1
> hello
Only in dir1: test1

To compare only which files differ between the two directories, without comparing the file contents any further, add the -q option:

[root@MissHou tmp]# diff -q dir1 dir2
Files dir1/file1 and dir2/file1 differ
Only in dir1: file3
Only in dir2: test1
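The directory comparison can be tried on a disposable tree. In this sketch the directory names dir1 and dir2 follow the article, but the files inside them (common.txt, changed.txt, only1.txt) are invented for illustration; GNU diff with English messages is assumed.

```shell
#!/usr/bin/env bash
# Build two small directories; the file names inside them are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dir1 dir2

echo 'same' > dir1/common.txt;  echo 'same' > dir2/common.txt    # identical on both sides
echo 'v1'   > dir1/changed.txt; echo 'v2'   > dir2/changed.txt   # same name, different content
echo 'x'    > dir1/only1.txt                                     # exists only in dir1

diff dir1 dir2       # shows content differences and files present on one side only
diff -q dir1 dir2    # reports only which files differ, without the content
```

Identical files such as common.txt produce no output in either mode.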

Other tips:

Sometimes we need to bring other files in line with one reference file. When there are many changes, this can be done by patching.

1) Find the differences between the files and write them to a patch file
[root@MissHou ~]# diff -uN file1 file2 > file.patch
-u: unified format
-N: treat a file that does not exist as an empty file
2) Apply the patch to the file
[root@MissHou ~]# patch file1 file.patch
patching file file1
3) Verify the result
[root@MissHou ~]# diff file1 file2
[root@MissHou ~]#

7. paste tool

The paste tool merges the corresponding lines of files side by side.

Common options:

-d: custom separator; the default is tab
-s: serial processing instead of parallel; each file is joined into a single line
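A small experiment makes the parallel and serial modes concrete. This is a minimal sketch; names.txt and scores.txt are hypothetical files created for the purpose.

```shell
#!/usr/bin/env bash
# names.txt and scores.txt are hypothetical files created only for this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' alice bob carol > names.txt
printf '%s\n' 90 85 77 > scores.txt

paste names.txt scores.txt        # line N of each file side by side, separated by a tab
paste -d: names.txt scores.txt    # same, but joined with ":" (alice:90, bob:85, ...)
paste -s names.txt                # serial: all lines of one file joined into a single line
paste -s -d, names.txt            # serial with a custom separator: alice,bob,carol
```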

8. tr tool

tr is used for character conversion, replacement and deletion. It is mainly used to delete control characters from files or to convert characters.

Syntax and options

Syntax:

Usage 1: the output of a command is piped to tr for processing; string1 is the set of characters to look for and string2 is the set they are converted to
# commands | tr 'string1' 'string2'
Usage 2: the content processed by tr comes from a file; remember to use input redirection, because tr reads only from standard input
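Both usages can be sketched in a few lines. The file names dos.txt and unix.txt are made up for this example, and the -d (delete) option used here is a standard tr option rather than something taken from the text above.

```shell
#!/usr/bin/env bash
# dos.txt and unix.txt are hypothetical file names used only in this sketch.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Usage 1: tr processes the output of another command
echo 'hello world' | tr 'a-z' 'A-Z'    # convert lower case to upper case
echo 'hello world' | tr -d 'lo'        # delete every "l" and "o"

# Usage 2: the content comes from a file, supplied through input redirection
printf 'dos line\r\n' > dos.txt        # a line with a Windows-style CR LF ending
tr -d '\r' < dos.txt > unix.txt        # strip the carriage-return control character
```

tr maps characters one to one by position, so 'a-z' and 'A-Z' must list the characters in matching order.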
