How to use Shell text processing tool under Linux

2025-07-15 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/01 report

This article covers how to use shell text-processing tools under Linux, working through the commands and options that come up most often in practice.

Unlike Windows, everyday work under Linux is built from many small commands that are combined. This article introduces the most commonly used shell tools for processing text under Linux: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, awk.

The examples and parameters provided are the most commonly used and practical.

My principle for shell scripts is to keep each command on one line and to avoid going beyond two lines where possible; for more complex tasks, consider Python.

find: file lookup

Find txt and pdf files

find . \( -name "*.txt" -o -name "*.pdf" \) -print

Find .txt and .pdf files with a regular expression

find . -regex ".*\(\.txt\|\.pdf\)$"

-iregex: like -regex, but ignores case
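
As a minimal sketch of the name and regex searches above, the following builds a throwaway directory (the file names are invented for the demonstration) and runs the three variants. It assumes GNU find, whose default regex flavor writes alternation as \| and matches the regex against the whole path.

```shell
# Build a small throwaway tree; all names here are hypothetical.
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/paper.pdf" "$demo/REPORT.TXT" "$demo/image.png"

# Match by name: either *.txt or *.pdf (the parentheses group the -o expression).
by_name=$(find "$demo" \( -name "*.txt" -o -name "*.pdf" \) -print | sort)

# Same selection with a regex; -regex is matched against the full path.
by_regex=$(find "$demo" -regex ".*\(\.txt\|\.pdf\)$" | sort)

# -iregex ignores case, so REPORT.TXT is matched as well.
by_iregex=$(find "$demo" -iregex ".*\(\.txt\|\.pdf\)$" | sort)

echo "$by_name"
echo "$by_iregex"
```

The -name and -regex forms select the same two files here; only the -iregex form also picks up the upper-case name.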

Negation

Find all files that are not .txt files

find . ! -name "*.txt" -print

Specify search depth

Print the files in the current directory (depth 1)

find . -maxdepth 1 -type f

Custom search

Search by type:

find . -type d -print    # list only directories

-type f: regular file; -type l: symbolic link

Search by time:

-atime: access time (in days; use -amin for minutes, and similarly -mmin and -cmin for the options below)

-mtime: modification time (content modified)

-ctime: change time (metadata or permission change)

All files accessed within the last 7 days:

find . -atime -7 -type f -print
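
The sign in front of the number matters for the time tests: -N means "less than N days ago", +N means "more than N days ago", and a bare N means exactly N days. The sketch below illustrates this with -mtime (the same convention applies to -atime and -ctime); it assumes GNU touch for the -d option, and the file names are made up.

```shell
demo=$(mktemp -d)
touch "$demo/fresh.log"
# GNU touch: -d sets the timestamps, so this file looks 10 days old.
touch -d "10 days ago" "$demo/stale.log"

# -mtime -7: modified less than 7 days ago; -mtime +7: more than 7 days ago.
recent=$(find "$demo" -type f -mtime -7)
old=$(find "$demo" -type f -mtime +7)

echo "recent: $recent"
echo "old:    $old"
```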

Search by size:

Units: w (two-byte words), k, M, G

Find files larger than 2k

find . -type f -size +2k

Find by permission:

find . -type f -perm 644 -print    # find all files whose permissions are exactly 644

Find by user:

find . -type f -user weber -print    # find files owned by user weber

Actions to take on the files found

Delete:

Delete all swp files under the current directory:

find . -type f -name "*.swp" -delete

Run a command (the powerful -exec)

find . -type f -user root -exec chown weber {} \;    # change the owner of root-owned files under the current directory to weber

Note: {} is a special string. For each matching file, {} is replaced with the corresponding file name.

E.g., copy all the files found to another directory:

find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;

Combine multiple commands

Tip: if you need to run several commands per file, write them into a single script and have -exec run that script:

-exec ./commands.sh {} \;
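
A small sketch of both -exec styles follows, using a temporary directory with invented file names. The second call is an alternative to a separate script file that is not in the original text: it wraps the commands in sh -c and passes the path as a positional parameter, which keeps odd file names from being interpreted as shell code.

```shell
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/OLD"
touch "$demo/src/a.txt" "$demo/src/b.txt" "$demo/src/c.log"

# For every match, {} is replaced by the file path and cp runs once per file.
find "$demo/src" -type f -name "*.txt" -exec cp {} "$demo/OLD" \;

# Several commands per file: the path arrives as $1, the target directory as $2.
find "$demo/src" -type f -name "*.log" \
    -exec sh -c 'cp "$1" "$2" && echo "copied $1"' sh {} "$demo/OLD" \;

ls "$demo/OLD"
```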

Delimiter used by -print

By default, '\n' separates the file names

-print0 uses '\0' as the separator, so file names that contain spaces can be handled safely
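
To see why the NUL separator helps, the sketch below creates a file whose name contains a space (a made-up name) and pipes the find output to xargs -0. With plain -print and xargs, that name would be split into two arguments.

```shell
demo=$(mktemp -d)
printf 'one\ntwo\n' > "$demo/my notes.txt"
printf 'three\n' > "$demo/plain.txt"

# NUL-delimited names survive the space in "my notes.txt".
total=$(find "$demo" -type f -name "*.txt" -print0 | xargs -0 cat | wc -l)

echo "$total"
```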

grep: text search

grep match_pattern file    # prints the matching lines by default

Common parameters

-o: print only the matched part of each line; -v: print only the lines that do not match

-c: count the lines in the file that contain the text

grep -c "text" filename

-n: print the line number of each match

-i: ignore case when searching

-l: print only the names of files that contain a match
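
The options above can be compared side by side on a tiny sample file. This is only an illustrative sketch; the file content is invented so that line counts and match counts differ.

```shell
demo=$(mktemp -d)
printf 'class Foo\nint x\nClass Bar\nclass Baz class\n' > "$demo/code.txt"

count=$(grep -c "class" "$demo/code.txt")          # lines containing "class"
count_i=$(grep -ci "class" "$demo/code.txt")       # same, ignoring case
numbered=$(grep -n "int" "$demo/code.txt")         # match prefixed by its line number
only=$(grep -o "class" "$demo/code.txt" | wc -l)   # every match, one per output line
inverse=$(grep -vc "class" "$demo/code.txt")       # lines that do NOT match

echo "$count $count_i $only $inverse"
echo "$numbered"
```

Note that -c counts matching lines, not matches: the last sample line contains "class" twice but is counted once, while -o emits both occurrences.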

Recursive search for text in a multi-level directory (a programmer's favorite for searching code):

grep "class" . -R -n

Match multiple patterns

grep -e "class" -e "virtual" file

Have grep output file names terminated by \0 (-Z):

grep "test" file* -lZ | xargs -0 rm

xargs: command-line argument conversion

xargs converts input data into command-line arguments for another command, so it combines well with many commands, such as grep and find.

Convert multi-line output to single-line output

cat file.txt | xargs

\n is the delimiter between the lines of the input; xargs replaces it with a space

Convert a single line into multiple lines of output

cat single.txt | xargs -n 3

-n: the number of fields shown per line

xargs parameter description

-d: defines the input delimiter (by default, blanks and newlines both separate items)

-n: the maximum number of arguments per command invocation, which splits the output across multiple lines

-I {}: specifies a replacement string that xargs substitutes with each input item; use it when the command to run needs the argument in a specific position or takes several arguments

E.g.:

cat file.txt | xargs -I {} ./command.sh -p {} -1

-0: use \0 as the input delimiter

E.g., count the lines of source code:

find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
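
The following sketch shows the three most common xargs behaviors on literal input, so it can be run without any files. When no command is given, xargs runs echo.

```shell
# Join lines into one line: each input line becomes one argument to echo.
joined=$(printf '1\n2\n3\n4\n5\n6\n7\n' | xargs)

# -n 3: at most three arguments per echo invocation, so three per output line.
grouped=$(printf '1 2 3 4 5 6 7\n' | xargs -n 3)

# -I {}: run the command once per input line, substituting {} where it appears.
labelled=$(printf 'a\nb\n' | xargs -I {} echo "item={} done")

echo "$joined"
echo "$grouped"
echo "$labelled"
```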

sort: sorting

Option description:

-n: sort numerically; -d: sort in dictionary order

-r: sort in reverse order

-k N: sort by the Nth column

E.g.:

sort -nrk 1 data.txt

sort -bd data    # ignore leading blanks such as spaces
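
A short sketch of how the options change the result on the same data. The sample rows are invented; the point is that plain sort compares text, so "10" comes before "2", while -n compares numbers.

```shell
data=$(mktemp)
printf '3 apple\n10 pear\n2 fig\n' > "$data"

numeric_desc=$(sort -nrk 1 "$data")   # numeric, descending, keyed on column 1
by_name=$(sort -k 2 "$data")          # text order, keyed on column 2
plain=$(sort "$data")                 # text order on the whole line

echo "$numeric_desc"
echo "$by_name"
echo "$plain"
```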

uniq: eliminating duplicate lines

Eliminate duplicate lines

sort unsort.txt | uniq

Count how many times each line appears in the file

sort unsort.txt | uniq -c

Find duplicate lines

sort unsort.txt | uniq -d

You can restrict which part of each line is compared: -s N skips the first N characters, and -w N compares at most N characters
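
uniq only collapses adjacent duplicates, which is why every example above sorts first. The sketch below, on invented data, also shows a common follow-up: sorting the uniq -c counts to find the most frequent line.

```shell
data=$(mktemp)
printf 'b\na\nb\nc\na\nb\n' > "$data"

unique=$(sort "$data" | uniq)      # each distinct line once
dups=$(sort "$data" | uniq -d)     # only lines that occur more than once
# Most frequent line: count, then sort the counts numerically in reverse.
top=$(sort "$data" | uniq -c | sort -nr | head -n 1)

echo "$unique"
echo "$dups"
echo "$top"
```

The count column printed by uniq -c is padded with leading spaces, so a tool such as awk is convenient if the count needs to be parsed.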

tr: character translation

General usage

echo 12345 | tr '0-9' '9876543210'    # simple encryption/decryption: each character is replaced by its counterpart

cat text | tr '\t' ' '    # convert tabs to spaces

Deleting characters with tr

cat file | tr -d '0-9'    # delete all digits

-c: use the complement of the set

cat file | tr -c '0-9\n' ' '    # replace every character that is not a digit or newline with a space

cat file | tr -d -c '0-9\n'    # delete all non-digit data, keeping only the digits

Squeezing characters with tr

tr -s squeezes runs of a repeated character into one; it is most often used to collapse extra spaces

cat file | tr -s ' '

Character classes

tr provides various character classes:

alnum: letters and digits

alpha: letters

digit: digits

space: whitespace characters

lower: lowercase letters

upper: uppercase letters

cntrl: control (non-printable) characters

print: printable characters

Usage: tr '[:class:]' '[:class:]'

E.g.: tr '[:lower:]' '[:upper:]'
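
The tr modes described in this section can be checked quickly on literal strings. This is a minimal sketch with made-up input values.

```shell
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')   # translate via character classes
digits=$(echo "tel: 555-0199" | tr -d -c '0-9\n')          # delete everything except digits and newline
squeezed=$(echo "a    b   c" | tr -s ' ')                  # squeeze runs of spaces
no_digits=$(echo "abc123def" | tr -d '0-9')                # delete the digits
encoded=$(echo 12345 | tr '0-9' '9876543210')              # map each digit to its counterpart

echo "$upper | $digits | $squeezed | $no_digits | $encoded"
```

Applying the digit mapping a second time restores the original value, which is why the section calls it an encryption/decryption conversion.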

cut: splitting text by column

Extract columns 2 and 4 of a file:

cut -f2,4 filename

Print all columns of the file except column 3:

cut -f3 --complement filename

-d specifies the delimiter:

cut -f2 -d ";" filename

Ranges accepted by cut

N-: from the Nth field to the end

-M: from the first field to the Mth

N-M: from the Nth to the Mth field

Units used by cut

-b: bytes

-c: characters

-f: fields (separated by the delimiter)

E.g.:

cut -c1-5 file    # print characters 1 through 5

cut -c-2 file    # print the first 2 characters
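
The field and character selections above can be tried on a small semicolon-separated sample (invented data). The sketch assumes GNU cut, since --complement is a GNU extension.

```shell
data=$(mktemp)
printf 'a;b;c;d\n1;2;3;4\n' > "$data"

cols_2_4=$(cut -d ';' -f 2,4 "$data")               # fields 2 and 4
from_3=$(cut -d ';' -f 3- "$data")                  # field 3 to the end
first_3_chars=$(cut -c 1-3 "$data")                 # characters 1 through 3
all_but_3=$(cut -d ';' -f 3 --complement "$data")   # everything except field 3

echo "$cols_2_4"
echo "$from_3"
echo "$first_3_chars"
echo "$all_but_3"
```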

paste: joining text by column

Join two files side by side, column by column

cat file1
1
2

cat file2
colin
book

paste file1 file2
1 colin
2 book

The default delimiter is a tab; another one can be specified with -d

paste file1 file2 -d ","

1,colin

2,book

wc: counting lines, words, and characters

wc -l file    # count lines

wc -w file    # count words

wc -c file    # count bytes (use -m to count characters)

sed: a powerful tool for text replacement

Replace the first match on each line

sed 's/text/replace_text/' file

Global replacement

sed 's/text/replace_text/g' file

By default, sed writes the result to standard output. To modify the original file directly, use -i:

sed -i 's/text/replace_text/g' file

Remove blank lines:

sed '/^$/d' file
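
The difference between the first-match and global forms, and the effect of -i, can be seen on a three-line sample file. This is a sketch with invented content; the in-place form shown is the GNU sed syntax.

```shell
f=$(mktemp)
printf 'text text\n\nmore text\n' > "$f"

first=$(sed 's/text/X/' "$f")      # only the first match on each line
global=$(sed 's/text/X/g' "$f")    # every match on each line
no_blank=$(sed '/^$/d' "$f")       # the empty line is dropped

# GNU sed: -i rewrites the file itself instead of printing the result.
sed -i 's/text/X/g' "$f"
in_place=$(cat "$f")

echo "$first"
echo "$global"
echo "$no_blank"
```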

Referencing the matched text

The matched string is referenced with the & marker.

echo this is en example | sed 's/\w\+/[&]/g'

$> [this] [is] [en] [example]

Substring match markers

The content matched by the first pair of parentheses is referenced with \1

sed 's/hello\([0-9]\)/\1/'
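
A brief sketch of both kinds of back-reference on literal input. It uses the portable pattern [a-z][a-z]* in place of the GNU-specific \w\+, which is a substitution made for this example.

```shell
# & stands for the whole match: wrap every lowercase word in brackets.
bracketed=$(echo "this is an example" | sed 's/[a-z][a-z]*/[&]/g')

# \1 stands for the first \( ... \) group: keep only the digit after "hello".
digit=$(echo "hello7 world" | sed 's/hello\([0-9]\)/\1/')

echo "$bracketed"
echo "$digit"
```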

Double-quote evaluation

sed scripts are usually wrapped in single quotes, but double quotes can be used as well; inside double quotes the shell evaluates the expression first. With single quotes, as here, $var is passed to sed literally:

sed 's/$var/HELLO/'

With double quotes, shell variables can be used in both the sed pattern and the replacement string

E.g.:

p=pattern

r=replaced

echo "line con a pattern" | sed "s/$p/$r/g"

$> line con a replaced
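
The quoting difference is easiest to see by running the same substitution both ways. In this sketch (variable names and input are illustrative), the single-quoted script reaches sed unexpanded, finds nothing to replace, and leaves the line unchanged.

```shell
p=pattern
r=replaced

# Double quotes: the shell expands $p and $r before sed sees the script.
expanded=$(echo "line with a pattern" | sed "s/$p/$r/g")

# Single quotes: sed receives the literal text s/$p/$r/g, which matches nothing here.
literal=$(echo "line with a pattern" | sed 's/$p/$r/g')

echo "$expanded"
echo "$literal"
```

One caveat with the double-quoted form: if a variable's value contains / or other characters that are special to sed, the script will break or behave differently, so it is best kept to simple values.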

Other examples

Inserting a character into a string: convert each line of text (PEKSHA) to PEK/SHA

sed 's/^.\{3\}/&\//g' file

awk: data stream processing tool

awk script structure

awk 'BEGIN { statements } { statements2 } END { statements }'

How it works

1. Execute the statements in the BEGIN block

2. Read a line from the file or stdin, execute statements2, and repeat until all input has been read

3. Execute the statements in the END block

print: printing the current line

When print is used with no arguments, it prints the current line

echo -e "line1\nline2" | awk 'BEGIN {print "start"} {print} END {print "End"}'

When the arguments to print are separated by commas, they are printed separated by spaces

echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1, var2, var3;}'

$> v1 V2 v3

Joining with "-" (placing strings next to each other concatenates them)

echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1 "-" var2 "-" var3;}'

$> v1-V2-v3

Special variables: NR NF $0 $1 $2

NR: the number of records read so far; during execution it corresponds to the current line number

NF: the number of fields in the current line

$0: the text of the current line

$1: the text of the first field

$2: the text of the second field

echo -e "line1 f2 f3\nline2\nline 3" | awk '{print NR":"$0"-"$1"-"$2}'

Print the second and third fields of each line:

awk '{print $2, $3}' file

Count the number of lines in the file:

awk 'END {print NR}' file

Sum the first field of every line:

echo -e "1\n 2\n 3\n 4\n" | awk 'BEGIN {sum = 0; print "begin";} {sum += $1;} END {print "=="; print sum}'
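
The special variables and the accumulate-then-print idiom can be combined on a small two-column sample. The data below is invented for illustration.

```shell
data=$(mktemp)
printf 'alice 10\nbob 20\ncarol 12\n' > "$data"

# The main block runs once per line; END runs after the last line.
total=$(awk '{ sum += $2 } END { print sum }' "$data")

# In END, NR still holds the number of the last record read.
lines=$(awk 'END { print NR }' "$data")

# NR and NF are available for every line as it is processed.
report=$(awk '{ print NR ": " $1 " has " NF " fields" }' "$data")

echo "$total $lines"
echo "$report"
```

awk variables start out empty (treated as 0 in arithmetic), so initializing sum in a BEGIN block is optional.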

Passing external variables

var=1000

echo | awk '{print vara}' vara=$var    # input from stdin

awk '{print vara}' vara=$var file    # input from file
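
Besides placing the assignment after the program text, awk also accepts the -v option, which is not shown in the original text. The sketch below uses both; the practical difference is that a -v assignment is already visible inside BEGIN, while an assignment operand only takes effect once awk starts processing its input operands. Variable names are illustrative.

```shell
limit=15

# -v assigns before the program starts, so the value is visible even in BEGIN.
in_begin=$(awk -v max="$limit" 'BEGIN { print max }')

# Assignment after the program text: set before the input is read, but after BEGIN.
over=$(printf '10\n20\n30\n' | awk '$1 > max { n++ } END { print n }' max="$limit")

echo "$in_begin $over"
```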

Use patterns to filter the lines awk processes

awk 'NR < 5'    # lines whose line number is less than 5

awk 'NR==1,NR==4 {print}' file    # lines 1 through 4 (a range pattern)

awk '/linux/'    # lines containing the text linux (very powerful: any regular expression can be used)

awk '!/linux/'    # lines that do not contain the text linux
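
Each of the pattern forms can be tried on a five-line sample file (invented content). A pattern with no action block prints the matching lines.

```shell
f=$(mktemp)
printf 'linux one\nmac two\nlinux three\nbsd four\nlinux five\n' > "$f"

head3=$(awk 'NR < 4' "$f")                 # lines 1 to 3
range=$(awk 'NR==2,NR==4' "$f")            # range pattern: lines 2 through 4
with=$(awk '/linux/' "$f" | wc -l)         # lines matching the regex
without=$(awk '!/linux/' "$f" | wc -l)     # lines not matching it

echo "$head3"
echo "$range"
echo "$with $without"
```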

Setting the delimiter

Use -F to set the field delimiter (the default is whitespace)

awk -F: '{print $NF}' /etc/passwd

Reading command output

Use getline to read the output of an external shell command into the variable cmdout

echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'
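
A single getline call reads only the first line of the command's output. When every line is needed, a common approach is to loop until getline stops returning a positive value, as in this sketch. The command string is a harmless stand-in, and running it from a BEGIN block removes the need for the leading echo.

```shell
count=$(awk 'BEGIN {
    cmd = "echo a; echo b; echo c"      # any shell command; this one prints three lines
    while ((cmd | getline line) > 0)    # returns 1 per line read, 0 at end of output
        n++
    close(cmd)                          # release the pipe once done
    print n
}')

first=$(awk 'BEGIN { cmd = "echo a; echo b"; cmd | getline out; close(cmd); print out }')

echo "$count $first"
```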

Using loops in awk

for (initialization; condition; increment) { statements }
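
As an illustration of awk loops (an example written for this section, not taken from the original text), the sketch below uses a C-style for loop to walk the fields of a line backwards, and the for (key in array) form to print per-value counts. The iteration order of the second form is unspecified, so its output is sorted.

```shell
# C-style loop: fields are numbered 1..NF, so count down from NF - 1.
reversed=$(echo "a b c d" | awk '{
    out = $NF
    for (i = NF - 1; i >= 1; i--)
        out = out " " $i
    print out
}')

# for (k in array): visit every index of an associative array.
counts=$(printf 'x\ny\nx\n' | awk '{ seen[$1]++ } END { for (k in seen) print k, seen[k] }' | sort)

echo "$reversed"
echo "$counts"
```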
