What are the common tools for dealing with text under Linux

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article covers the common tools for dealing with text under Linux. They are very practical, so they are collected here as a reference.

find: file lookup

Find .txt and .pdf files:

find . \( -name "*.txt" -o -name "*.pdf" \) -print

Find .txt and .pdf files with a regular expression:

find . -iregex ".*\(\.txt\|\.pdf\)$"   # -iregex: case-insensitive regex match
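
The two searches above can be tried safely in a throwaway directory. The following is a minimal sketch, assuming GNU find (whose default regex syntax uses \| for alternation); the file names and the variables demo_dir, by_name and by_regex are made up for illustration.

```shell
# Build a scratch directory with a few hypothetical files.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.pdf" "$demo_dir/c.log" "$demo_dir/D.TXT"

# Name-based match: -o means "or"; the escaped parentheses group the two tests.
by_name=$(find "$demo_dir" \( -name "*.txt" -o -name "*.pdf" \) -print | sort)

# Regex match against the whole path, ignoring case, so D.TXT is found as well.
by_regex=$(find "$demo_dir" -iregex ".*\(\.txt\|\.pdf\)$" | sort)

echo "$by_name"
echo "$by_regex"
rm -rf "$demo_dir"
```

The -name version is case-sensitive and skips D.TXT, while -iregex picks it up.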

Negation: find all files that are not .txt files:

find . ! -name "*.txt" -print

Limit the search depth; print only the files in the current directory (depth 1):

find . -maxdepth 1 -type f

Custom searches

Search by type: -type f (regular file), l (symbolic link), d (directory)

find . -type d -print   # list only directories

Search by time: -atime access time (in days; use -amin for minutes, and likewise below), -mtime modification time (content modified), -ctime change time (metadata or permissions changed)

All files accessed within the last 7 days (-7 means "less than 7 days ago"; a bare 7 would mean exactly 7 days ago):

find . -atime -7 -type f -print

Search by size: units include w (two-byte words), k, M and G. Find files larger than 2k:

find . -type f -size +2k

Find by permission:

find . -type f -perm 644 -print   # find all files whose permissions are exactly 644 (rw-r--r--)

Find by user:

find . -type f -user weber -print   # find the files owned by user weber

Follow-up actions after a match

Delete all .swp files under the current directory:

find . -type f -name "*.swp" -delete

Execute an action (the powerful -exec)

find . -type f -user root -exec chown weber {} \;   # change the owner of root-owned files under the current directory to weber

Note: {} is a special string; for each matching file, {} is replaced with that file's name. For example, copy all the files found into another directory:

find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;

Combining multiple commands. Tip: if you need to run several commands on each match, write them into a script and call that script from -exec:

-exec ./commands.sh {} \;

Delimiters of -print: by default -print terminates each file name with '\n', while -print0 terminates each name with '\0' instead, so file names containing spaces can be handled safely. List the entries in the current directory by size, largest first (hidden files included, "." itself excluded):

find . -maxdepth 1 ! -name "." -print0 | xargs -0 du -b | sort -nr | head -10 | nl
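
To see why -print0 matters, here is a small sketch that counts lines across files, one of which has a space in its name. The directory, file names and the variable total_lines are hypothetical.

```shell
demo_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$demo_dir/has space.txt"
printf 'three\n' > "$demo_dir/plain.txt"

# NUL-terminated names reach xargs intact, so "has space.txt" stays one argument.
total_lines=$(find "$demo_dir" -type f -print0 | xargs -0 cat | wc -l)

echo "$total_lines"
rm -rf "$demo_dir"
```

With plain -print and xargs, the name would be split at the space into two arguments that do not exist.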

grep: text search

grep match_pattern file   # prints the matching lines by default

Common options: -o prints only the matched part of each line, whereas -v prints only the lines that do not match; -c counts the lines in the file that contain the text:

grep -c "text" filename

-n prints the line numbers of matches; -i ignores case when searching; -l prints only the file names. Search recursively through a directory tree (a programmer's favorite for searching code):

grep "class" . -R -n

Match multiple patterns:

grep -e "class" -e "virtual" file

Make grep output file names terminated by \0 (-Z):

grep "test" file* -lZ | xargs -0 rm
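
The options above are easy to compare on a small sample. This is only an illustrative sketch; the sample text and variable names are invented.

```shell
sample=$(mktemp)
printf 'class Foo\nvirtual void run()\nint x\nClass Bar\n' > "$sample"

match_count=$(grep -c "class" "$sample")       # lines containing "class"
nocase_count=$(grep -ci "class" "$sample")     # the same count, ignoring case
only_matched=$(grep -o "virtual" "$sample")    # just the matched text, not the whole line
neither=$(grep -vc -e "class" -e "virtual" "$sample")   # lines matching neither pattern

echo "$match_count $nocase_count $only_matched $neither"
rm -f "$sample"
```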

xargs: command-line argument conversion

xargs converts input data into command-line arguments for a given command, so it can be combined with many other commands, such as grep and find. Convert multi-line output into a single line:

cat file.txt | xargs

\n is the delimiter between lines of text. Convert a single line into multiple lines of output:

cat single.txt | xargs -n 3   # -n: the number of fields shown per line

xargs options: -d defines the delimiter (by default whitespace; for multi-line input, \n); -n specifies how many arguments go on each output line; -I {} specifies the replacement string, which is substituted when xargs expands the command. Use it when the command to run needs more than one argument:

cat file.txt | xargs -I {} ./command.sh -p {} -1

-0 specifies \0 as the input delimiter. For example, count the lines of source code:

find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
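
A quick sketch of the three conversions described above, using echo as a stand-in for a real command; the inputs and variable names are illustrative.

```shell
# Multi-line input collapsed onto one line (xargs runs echo when no command is given).
one_line=$(printf '1\n2\n3\n4\n5\n6\n' | xargs)

# A single line regrouped into rows of three arguments each.
rows=$(echo "1 2 3 4 5 6" | xargs -n 3)

# -I {} substitutes each input line into the command, one invocation per line.
tagged=$(printf 'a\nb\n' | xargs -I {} echo "item-{}")

echo "$one_line"
echo "$rows"
echo "$tagged"
```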

sort: sorting

Options: -n sorts numerically, whereas -d sorts in dictionary order; -r sorts in reverse order; -k N sorts by the Nth column. For example:

sort -nrk 1 data.txt

sort -bd data   # -b ignores leading blanks such as spaces

uniq: remove duplicate lines

sort unsort.txt | uniq

Count how many times each line appears in the file:

sort unsort.txt | uniq -c

Find the duplicated lines:

sort unsort.txt | uniq -d

You can restrict which part of each line is compared: -s sets the start position (the number of characters to skip) and -w sets the number of characters to compare.
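
uniq only collapses adjacent duplicates, which is why every example above sorts first. A minimal sketch with made-up data:

```shell
data=$(printf 'pear\napple\npear\nfig\napple\npear\n')

unique=$(echo "$data" | sort | uniq)      # each distinct line once
dupes=$(echo "$data" | sort | uniq -d)    # only the lines that occur more than once

# Most frequent line: count, sort the counts numerically in reverse, keep the top one.
top=$(echo "$data" | sort | uniq -c | sort -nr | head -n 1 | awk '{ print $2 }')

echo "$unique"
echo "$dupes"
echo "$top"
```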

tr: character conversion

General usage:

echo 12345 | tr '0-9' '9876543210'   # encryption/decryption style conversion: each character is replaced by its counterpart

cat text | tr '\t' ' '   # convert tabs to spaces

Delete characters with tr:

cat file | tr -d '0-9'   # delete all digits

-c takes the complement of the set:

cat file | tr -c '0-9\n' ' '   # keep the digits in the file: every other character (except newline) becomes a space; tr needs a second set when translating

cat file | tr -d -c '0-9\n'   # delete all non-digit data

Squeeze characters with tr: tr -s squeezes runs of a repeated character in the text; it is most often used to squeeze surplus spaces.

cat file | tr -s ' '

Character classes: tr provides various character classes: alnum: letters and digits; alpha: letters; digit: digits; space: whitespace characters; lower: lowercase letters; upper: uppercase letters; cntrl: control (non-printing) characters; print: printable characters

Usage: tr [:class:] [:class:]

For example: tr '[:lower:]' '[:upper:]'
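
The tr variants above can be checked on short strings. The inputs and variable names below are only examples.

```shell
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')    # character classes
no_digits=$(echo "abc123def" | tr -d '0-9')                 # delete a set
digits_only=$(echo "tel: 555-0199" | tr -d -c '0-9\n')      # delete the complement of a set
squeezed=$(echo "a   b    c" | tr -s ' ')                   # squeeze repeated spaces

echo "$upper"
echo "$no_digits"
echo "$digits_only"
echo "$squeezed"
```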

cut: split text by column

Extract columns 2 and 4 of a file:

cut -f2,4 filename

Print all columns of the file except column 3:

cut -f3 --complement filename

-d specifies the delimiter:

cut -f2 -d ";" filename

Ranges accepted by cut: N- means from the Nth field to the end; -M means from the first field to the Mth; N-M means fields N through M.

Units used by cut: -b counts in bytes, -c in characters, -f in fields (using the delimiter).

cut -c1-5 file   # print characters 1 through 5

cut -c-2 file   # print the first 2 characters
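
A short sketch of the field and character selections above on inline data. The sample rows are invented, and --complement is assumed to be available (it is a GNU cut option).

```shell
row=$(printf 'id\tname\tcity\tzip')

cols_2_4=$(echo "$row" | cut -f2,4)                  # tab is the default delimiter
all_but_3=$(echo "$row" | cut -f3 --complement)      # GNU cut only
second=$(echo "a;b;c" | cut -d ';' -f2)              # custom delimiter
first_five=$(echo "abcdefgh" | cut -c1-5)            # character range
first_two=$(echo "abcdefgh" | cut -c-2)              # open-ended range

echo "$cols_2_4"
echo "$all_but_3"
echo "$second $first_five $first_two"
```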

paste: join text by column

Join two files side by side, column by column:

cat file1
1
2

cat file2
colin
book

paste file1 file2
1 colin
2 book

The default delimiter is a tab; use -d to specify a different one:

paste file1 file2 -d ","
1,colin
2,book
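
The same paste example as a self-contained sketch that creates its own two files; the temporary file variables f1 and f2 are hypothetical stand-ins for file1 and file2.

```shell
f1=$(mktemp)
f2=$(mktemp)
printf '1\n2\n' > "$f1"
printf 'colin\nbook\n' > "$f2"

tabbed=$(paste "$f1" "$f2")            # default: columns joined by a tab
joined=$(paste -d "," "$f1" "$f2")     # columns joined by a comma

echo "$tabbed"
echo "$joined"
rm -f "$f1" "$f2"
```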

wc: count lines, words and characters

wc -l file   # count lines

wc -w file   # count words

wc -c file   # count characters (bytes)

sed: the sharp tool for text replacement

Replace the first occurrence on each line:

sed 's/text/replace_text/' file

Global replacement:

sed 's/text/replace_text/g' file

By default sed prints the result of the replacement. To replace directly in the original file, use -i:

sed -i 's/text/replace_text/g' file

Remove blank lines:

sed '/^$/d' file

Variable conversion: the matched string is referenced with the & marker.

echo this is an example | sed 's/\w\+/[&]/g'
$> [this] [is] [an] [example]

Substring match markers: the content matched by the first pair of parentheses is referenced with \1.

sed 's/hello\([0-9]\)/\1/'

Double quotes: sed expressions are usually quoted with single quotes; double quotes can also be used, in which case the shell evaluates the expression before sed sees it:

sed 's/$var/HELLO/'

When using double quotes, shell variables can be used in the sed pattern and in the replacement string:

p=pattern
r=replaced
echo "line con a pattern" | sed "s/$p/$r/g"
$> line con a replaced

Other examples. Insert a character into a string: convert each line of text (PEKSHA) to PEK/SHA:

sed 's/^.\{3\}/&\//g' file
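
The substitutions described in this section can be verified on inline strings. This sketch sticks to portable basic regular expressions ([a-z][a-z]* in place of GNU's \w\+); the inputs and variable names are invented.

```shell
first_only=$(echo "text text" | sed 's/text/new/')                  # first occurrence only
every=$(echo "text text" | sed 's/text/new/g')                      # all occurrences
bracketed=$(echo "this is an example" | sed 's/[a-z][a-z]*/[&]/g')  # & is the whole match
digit=$(echo "hello7" | sed 's/hello\([0-9]\)/\1/')                 # \1 is the first group

# Double quotes let the shell expand $p and $r before sed runs.
p=pattern
r=replaced
expanded=$(echo "line with a pattern" | sed "s/$p/$r/g")

airports=$(echo "PEKSHA" | sed 's/^.\{3\}/&\//')                    # insert / after 3 characters

echo "$first_only | $every | $bracketed | $digit | $expanded | $airports"
```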

awk: data stream processing tool

awk script structure:

awk 'BEGIN { statements } { statements2 } END { statements }'

How it works: 1. Execute the BEGIN statement block; 2. Read a line from the file or stdin, execute statements2, and repeat until all the input has been read; 3. Execute the END statement block.

print: when print is used with no arguments, it prints the current line.

echo -e "line1\nline2" | awk 'BEGIN { print "start" } { print } END { print "End" }'

When the arguments to print are separated by commas, they are printed separated by spaces:

echo | awk '{ var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1, var2, var3; }'
$> v1 V2 v3

Concatenating with "-" (strings written side by side are concatenated, so "" joins them with nothing in between):

echo | awk '{ var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1 "-" var2 "-" var3; }'
$> v1-V2-v3

Special variables: NR, NF, $0, $1, $2. NR: the number of records, which corresponds to the current line number during execution; NF: the number of fields, which corresponds to the total number of fields in the current line during execution; $0: the text of the current line; $1: the text of the first field; $2: the text of the second field.

echo -e "line1 f2 f3\nline2\nline 3" | awk '{ print NR ":" $0 "-" $1 "-" $2 }'

Print the second and third fields of each line:

awk '{ print $2, $3 }' file

Count the number of lines in the file:

awk 'END { print NR }' file

Accumulate the first field of each line:

echo -e "1\n 2\n 3\n 4\n" | awk 'BEGIN { sum = 0; print "begin"; } { sum += $1; } END { print "="; print sum }'
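
The following sketch puts NR, NF, field references and an END block together on a tiny made-up input; the variable name report is illustrative.

```shell
report=$(printf 'a 1\nb 2 x\nc 3\n' | awk '
    # Runs once per line: add up column 2 and describe the line.
    { sum += $2; print NR ": " NF " fields, first=" $1 }
    # Runs once after the last line: NR now holds the total line count.
    END { print "lines=" NR " total=" sum }')

echo "$report"
```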

Pass external variables:

var=1000
echo | awk '{ print vara }' vara=$var   # input from stdin
awk '{ print vara }' vara=$var file   # input from file

Use a pattern to filter the lines that awk processes: awk 'NR
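
As an illustration of pattern-based filtering in general (these are not necessarily the examples the article originally showed), a pattern placed before the action, or used alone, selects which lines awk processes:

```shell
first_three=$(seq 10 | awk 'NR < 4')             # line-number condition
middle=$(seq 10 | awk 'NR==4,NR==6')             # range between two conditions

# Regex patterns: lines that contain "linux", and lines that do not.
with_linux=$(printf 'linux rocks\nbsd too\n' | awk '/linux/')
without_linux=$(printf 'linux rocks\nbsd too\n' | awk '!/linux/')

echo "$first_three"
echo "$middle"
echo "$with_linux"
echo "$without_linux"
```

When the action is omitted, awk prints every line that matches the pattern.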

Set the delimiter: use -F to set the field delimiter (the default is whitespace):

awk -F: '{ print $NF }' /etc/passwd
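
To try -F without depending on the contents of /etc/passwd, the sketch below uses a single made-up line in the same colon-separated format.

```shell
# A hypothetical passwd-style record.
passwd_line="weber:x:1000:1000:Weber:/home/weber:/bin/bash"

login_shell=$(echo "$passwd_line" | awk -F: '{ print $NF }')   # $NF is the last field
user_name=$(echo "$passwd_line" | awk -F: '{ print $1 }')
field_count=$(echo "$passwd_line" | awk -F: '{ print NF }')

echo "$user_name $login_shell $field_count"
```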

Read command output: use getline to read the output of an external shell command into the variable cmdout:

echo | awk '{ "grep root /etc/passwd" | getline cmdout; print cmdout }'

Using loops in awk:

for (i in array) { print array[i]; }

for (i …) { print $i; }

Print lines in reverse order (an implementation of the tac command):

seq 9 | awk '{ lifo[NR] = $0; lno = NR } END { for (; lno > 0; lno--) { print lifo[lno]; } }'

Implement the head and tail commands with awk:

head: awk 'NR … filename

tail: awk '{ buffer[NR] = $0; } END { for …
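
One possible way to write head and tail in awk is sketched below. This is an illustration under my own assumptions, not a reconstruction of the article's exact commands; the variable n and the array buf are hypothetical names.

```shell
# head: print while the line number is within the first n lines.
head_3=$(seq 10 | awk -v n=3 'NR <= n')

# tail: keep the last n lines in a ring buffer indexed by NR % n,
# then replay them in order once the input is exhausted.
tail_3=$(seq 10 | awk -v n=3 '
    { buf[NR % n] = $0 }
    END { for (i = NR - n + 1; i <= NR; i++) if (i > 0) print buf[i % n] }')

echo "$head_3"
echo "$tail_3"
```

The ring buffer keeps memory use at n lines regardless of input size; the i > 0 guard covers inputs shorter than n lines.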

Print a specified column. With awk:

ls -lrt | awk '{ print $6 }'

With cut:

ls -lrt | tr -s ' ' | cut -d ' ' -f6   # squeeze the spaces first, because cut splits on every single delimiter character

Print a specified region of text. By line number:

seq 100 | awk 'NR==4,NR==6 { print }'

By text: print the text between start_pattern and end_pattern:

awk '/start_pattern/, /end_pattern/' filename

For example:

seq 100 | awk '/13/,/…/'

cat /etc/passwd | awk '/mai.*mail/,/news.*news/'

Commonly used awk built-in functions:

index(string, search_string): returns the position at which search_string appears in string
sub(regex, replacement_str, string): replaces the first match of the regular expression with replacement_str
match(string, regex): checks whether the regular expression matches the string
length(string): returns the length of the string

echo | awk '{ "grep root /etc/passwd" | getline cmdout; print length(cmdout) }'

printf: similar to printf in C; formats the output:

seq 10 | awk '{ printf "->%4s\n", $1 }'

Iterate over the lines, words and characters in a file

1. Iterate over each line in the file

while loop method:

while read line; do echo $line; done < file.txt

Or, through a pipe into a subshell:

cat file.txt | (while read line; do echo $line; done)

awk method:

cat file.txt | awk '{ print }'

2. Iterate over each word in a line

for word in $line; do echo $word; done

3. Iterate over each character

${word:start_pos:num_of_chars}: extracts characters from the string (text slice); ${#word}: returns the length of the variable word

for ((i = 0; i < ${#word}; i++)); do echo ${word:i:1}; done
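
The line and word loops can be combined into one pass. The sketch below counts both on a small temporary file; the file contents and counter names are made up, and it uses only POSIX shell features (the character-slicing form above needs bash).

```shell
tmp=$(mktemp)
printf 'alpha beta\ngamma\n' > "$tmp"

line_count=0
word_count=0
# IFS= and -r keep each line exactly as written (no trimming, no backslash handling).
while IFS= read -r line; do
    line_count=$((line_count + 1))
    # Unquoted $line is split on whitespace, giving one word per iteration.
    for word in $line; do
        word_count=$((word_count + 1))
    done
done < "$tmp"

echo "$line_count lines, $word_count words"
rm -f "$tmp"
```

Redirecting the file into the loop, instead of piping cat into it, keeps the counters in the current shell so they are still set after the loop ends.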

Thank you for reading! This concludes the article on the common tools for dealing with text under Linux. I hope the content above is of some help and lets you learn something new. If you found the article useful, feel free to share it so that more people can see it.
