2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through the common tools for processing text on Linux. The editor finds them very practical and shares them here for reference; follow along to have a look.
find: file lookup
Find txt and pdf files:
find . \( -name "*.txt" -o -name "*.pdf" \) -print
Find .txt and .pdf files with a regular expression:
find . -iregex ".*\(\.txt\|\.pdf\)$"  # -iregex: case-insensitive regex match
Negation: find everything that is not a txt file:
find . ! -name "*.txt" -print
Limit the search depth: print the files in the current directory (depth 1):
find . -maxdepth 1 -type f
Custom searches
Search by type: -type f for regular files, l for symbolic links, d for directories
find . -type d -print  # list directories only
Search by time: -atime access time (in days; the minute-based variant is -amin, and likewise below), -mtime modification time (content changed), -ctime change time (metadata or permissions changed)
All files accessed within the last 7 days (-7 means less than 7 days ago; a bare 7 means exactly 7 days ago, and +7 means more than 7 days ago):
find . -atime -7 -type f -print
Search by size: units are w (two-byte words), k, M, G. Find files larger than 2k:
find . -type f -size +2k
Search by permission:
find . -type f -perm 644 -print  # find all files whose permissions are exactly 644
Search by owner:
find . -type f -user weber -print  # find the files owned by user weber
Actions after a match
Delete all swp files under the current directory:
find . -type f -name "*.swp" -delete
Run a command (the powerful -exec):
find . -type f -user root -exec chown weber {} \;  # change the owner of root's files under the current directory to weber
Note: {} is a special string; for each matching file, {} is replaced with that file's name. E.g. copy every file found into another directory:
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;
Combining several commands. Tip: if you need to run several commands on each match, write them into a script and have -exec call the script:
-exec ./commands.sh {} \;
Output delimiter: -print uses '\n' by default, so each file name is followed by a newline; -print0 uses '\0' instead, which makes it safe to handle file names that contain spaces.
Sort the entries of the current directory from largest to smallest (hidden files included, "." itself excluded):
find . -maxdepth 1 ! -name "." -print0 | xargs -0 du -b | sort -nr | head -10 | nl
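As a minimal sketch of the find options above, the following builds a throwaway directory and runs a few of the searches against it. The file names are made up for illustration, and the sketch assumes GNU find and xargs.

```shell
# Scratch directory with hypothetical sample files.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/a.txt" "$demo/b.pdf" "$demo/c.log" "$demo/sub/d.txt" "$demo/with space.txt"

# Match .txt or .pdf files at any depth.
docs=$(find "$demo" \( -name "*.txt" -o -name "*.pdf" \) -print | wc -l)

# Regular files directly inside the directory only (depth 1).
top=$(find "$demo" -maxdepth 1 -type f | wc -l)

# Everything that is not a .txt file (directories are included too).
non_txt=$(find "$demo" ! -name "*.txt" -print | wc -l)

# -print0 with xargs -0 keeps "with space.txt" as a single argument.
txt_args=$(find "$demo" -type f -name "*.txt" -print0 | xargs -0 -n 1 echo | wc -l)

echo "docs=$docs top=$top non_txt=$non_txt txt_args=$txt_args"
rm -r "$demo"
```

Without -print0 and -0, xargs would split "with space.txt" into two arguments, which is the case the '\0' delimiter is meant to avoid.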
grep: text search
grep match_pattern file  # prints the matching lines by default
Common options: -o print only the matched text VS -v print only the lines that do not match; -c count the lines that contain the text:
grep -c "text" filename
-n print the line numbers of matches; -i ignore case when searching; -l print only the file names.
Recursive search through a directory tree (a programmer's favorite for searching code):
grep "class" . -R -n
Match multiple patterns:
grep -e "class" -e "virtual" file
Have grep terminate the file names it prints with \0 (-Z):
grep "test" file* -lZ | xargs -0 rm
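The grep options above can be tried on a small sample. This is only a sketch: the file contents are invented for the example.

```shell
# Hypothetical sample file.
src=$(mktemp)
printf 'class Foo\nvirtual void run();\nint x;\nclass Bar\n' > "$src"

count=$(grep -c "class" "$src")                      # number of matching lines
numbered=$(grep -n "virtual" "$src")                 # matches prefixed with line numbers
only=$(grep -o "class [A-Za-z]*" "$src")             # just the matched text
inverted=$(grep -v -e "class" -e "virtual" "$src")   # lines matching neither pattern

printf '%s\n' "$count" "$numbered" "$only" "$inverted"
rm "$src"
```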
xargs: command-line argument conversion
xargs converts input data into command-line arguments for a given command, so it combines well with many other commands, such as grep and find.
Convert multi-line output to a single line:
cat file.txt | xargs
Here \n is the delimiter between the lines of input.
Convert a single line to multi-line output:
cat single.txt | xargs -n 3  # -n: number of fields shown per line
xargs options: -d sets the input delimiter (by default both spaces and newlines separate items); -n sets the number of arguments per output line; -I {} sets the replacement string, which is substituted when xargs expands the command, e.g. when the command to run takes several arguments:
cat file.txt | xargs -I {} ./command.sh -p {} -1
-0: use \0 as the input delimiter, e.g. count lines of code:
find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
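A short sketch of the three most common xargs conversions, using printf to stand in for the input files (the input values are arbitrary):

```shell
# Join several input lines into one line.
joined=$(printf '1\n2\n3\n4\n5\n6\n' | xargs)

# Re-split a single line into rows of three arguments each.
rows=$(printf '1 2 3 4 5 6\n' | xargs -n 3)

# -I {} substitutes each input line into the command template.
templated=$(printf 'a\nb\n' | xargs -I {} echo "item {} done")

printf '%s\n' "$joined" "$rows" "$templated"
```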
sort: sorting
Options: -n sort numerically VS -d sort in dictionary order; -r reverse the order; -k N sort by the Nth column. E.g.:
sort -nrk 1 data.txt
sort -bd data  # -b ignores leading blanks such as spaces
uniq: remove duplicate lines
sort unsort.txt | uniq
Count how many times each line appears in the file:
sort unsort.txt | uniq -c
Find the duplicated lines:
sort unsort.txt | uniq -d
You can restrict which part of each line is compared: -s N skips the first N characters; -w N compares at most N characters
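Because uniq only collapses adjacent duplicates, it is normally fed sorted input. A small sketch with made-up data, ending with a typical "most frequent line" pipeline:

```shell
data=$(printf 'pear\napple\npear\nfig\napple\npear\n')

unique=$(printf '%s\n' "$data" | sort | uniq)      # one copy of each line
dupes=$(printf '%s\n' "$data" | sort | uniq -d)    # only lines that repeat

# Most frequent line: count, sort numerically in reverse on column 1, keep the first row.
top=$(printf '%s\n' "$data" | sort | uniq -c | sort -nrk 1 | head -n 1 | awk '{print $2}')

printf '%s\n' "$unique" "$dupes" "$top"
```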
tr: character translation
General usage:
echo 12345 | tr '0-9' '9876543210'  # simple encryption/decryption: replace each character with its counterpart
cat text | tr '\t' ' '  # tabs to spaces
Deleting characters with tr:
cat file | tr -d '0-9'  # delete all digits
-c uses the complement of the set:
cat file | tr -c '0-9\n' ' '  # keep the digits: every character that is not a digit or newline becomes a space
cat file | tr -d -c '0-9\n'  # delete everything that is not a digit or newline
Squeezing characters with tr: tr -s squeezes runs of a repeated character in the text; it is most often used to collapse extra spaces.
cat file | tr -s ' '
Character classes: tr provides several character classes: alnum: letters and digits; alpha: letters; digit: digits; space: whitespace; lower: lowercase; upper: uppercase; cntrl: control (non-printable) characters; print: printable characters
Usage: tr [:class:] [:class:]
E.g.: tr '[:lower:]' '[:upper:]'
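The tr modes above, sketched on literal strings (the sample strings are invented for the example):

```shell
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')   # character classes
digits=$(echo "tel: 555-0199" | tr -d -c '0-9\n')          # delete all but digits and newline
squeezed=$(echo "too    many   spaces" | tr -s ' ')        # collapse runs of spaces
swapped=$(echo 12345 | tr '0-9' '9876543210')              # map each digit d to 9-d
restored=$(echo "$swapped" | tr '9876543210' '0-9')        # apply the inverse mapping

printf '%s\n' "$upper" "$digits" "$squeezed" "$swapped" "$restored"
```

The newline is kept in the -d -c set so that the output still ends with a line break.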
cut: split text by column
Extract columns 2 and 4 of a file:
cut -f2,4 filename
Extract every column of the file except column 3:
cut -f3 --complement filename
-d sets the delimiter:
cut -f2 -d ";" filename
Ranges cut accepts: N- from the Nth field to the end; -M from the first field to the Mth; N-M from the Nth to the Mth field
Units cut works in: -b bytes; -c characters; -f fields (split on the delimiter)
cut -c1-5 file  # print characters 1 through 5
cut -c-2 file  # print the first 2 characters
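A brief sketch of the field and character ranges, run on a semicolon-separated sample row (the data is made up; --complement assumes GNU cut):

```shell
row='a;b;c;d'

second=$(echo "$row" | cut -d ';' -f2)                   # a single field
two_four=$(echo "$row" | cut -d ';' -f2,4)               # a list of fields
from_third=$(echo "$row" | cut -d ';' -f3-)              # field 3 to the end
not_third=$(echo "$row" | cut -d ';' -f3 --complement)   # everything except field 3
first_chars=$(echo "abcdefgh" | cut -c1-5)               # characters 1 through 5

printf '%s\n' "$second" "$two_four" "$from_third" "$not_third" "$first_chars"
```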
paste: join text by column
Join two files side by side, column by column:
cat file1
1
2
cat file2
colin
book
paste file1 file2
1 colin
2 book
The default delimiter is a tab; use -d to set a different one:
paste file1 file2 -d ","
1,colin
2,book
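To reproduce the paste example end to end, the sketch below writes the two files into a temporary directory first; only the directory location is an addition.

```shell
dir=$(mktemp -d)
printf '1\n2\n' > "$dir/file1"
printf 'colin\nbook\n' > "$dir/file2"

tabbed=$(paste "$dir/file1" "$dir/file2")          # columns joined with a tab
comma=$(paste -d "," "$dir/file1" "$dir/file2")    # columns joined with a comma

printf '%s\n' "$tabbed" "$comma"
rm -r "$dir"
```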
wc: count lines, words and characters
wc -l file  # count lines
wc -w file  # count words
wc -c file  # count bytes (use -m to count characters)
sed: the text replacement workhorse
Replace the first match on each line:
sed 's/text/replace_text/' file
Global replacement:
sed 's/text/replace_text/g' file
By default sed prints the result of the replacement; to modify the original file in place, use -i:
sed -i 's/text/replace_text/g' file
Remove blank lines:
sed '/^$/d' file
Referencing the match: the matched string is referenced with the & marker.
echo this is an example | sed 's/\w\+/[&]/g'
$> [this] [is] [an] [example]
Substring references: the content of the first parenthesized group is referenced with the marker \1
sed 's/hello\([0-9]\)/\1/'
Double-quote evaluation: sed expressions are usually quoted with single quotes; double quotes also work, and with them the shell evaluates the expression (expanding variables) before sed sees it:
sed "s/$var/HELLO/"
With double quotes, shell variables can supply both the sed pattern and the replacement string:
p=pattern
r=replaced
echo "line con a pattern" | sed "s/$p/$r/g"
$> line con a replaced
Other examples. Inserting a character into a string: convert each line of text (PEKSHA) to PEK/SHA
sed 's/^.\{3\}/&\//g' file
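The sed substitutions above, sketched against literal input so the results are easy to check. The bracket expression [a-z][a-z]* is used here instead of \w\+ only to stay portable across sed implementations.

```shell
first=$(echo "text text" | sed 's/text/replace_text/')             # first match only
all=$(echo "text text" | sed 's/text/replace_text/g')              # every match
wrapped=$(echo "this is an example" | sed 's/[a-z][a-z]*/[&]/g')   # & is the matched text
digit=$(echo "hello7 world" | sed 's/hello\([0-9]\)/\1/')          # \1 is the first group

p=pattern; r=replaced
expanded=$(echo "line con a pattern" | sed "s/$p/$r/g")            # the shell expands $p and $r
split=$(echo "PEKSHA" | sed 's/^.\{3\}/&\//')                      # insert / after 3 characters

printf '%s\n' "$first" "$all" "$wrapped" "$digit" "$expanded" "$split"
```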
awk: data stream processing tool
awk script structure
awk 'BEGIN { statements } { statements2 } END { statements }'
How it works: 1. Execute the statement block in BEGIN; 2. Read a line from the file or stdin, execute statements2, and repeat until all input has been read; 3. Execute the END statement block
print prints the current line: when print is used with no arguments, the current line is printed
echo -e "line1\nline2" | awk 'BEGIN {print "start"} {print} END {print "End"}'
When print's arguments are separated by commas, they are printed separated by spaces
echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1, var2, var3;}'
$> v1 V2 v3
Concatenation: place values side by side, with a quoted string such as "-" as the joiner
echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1 "-" var2 "-" var3;}'
$> v1-V2-v3
Special variables: NR NF $0 $1 $2
NR: the number of records; during execution it corresponds to the current line number
NF: the number of fields; during execution it corresponds to the number of fields in the current line
$0: the text of the current line
$1: the text of the first field
$2: the text of the second field
echo -e "line1 f2 f3\nline2\nline 3" | awk '{print NR ":" $0 "-" $1 "-" $2}'
Print the second and third fields of each line:
awk '{print $2, $3}' file
Count the number of lines in a file:
awk 'END {print NR}' file
Sum the first field of every line:
echo -e "1\n 2\n 3\n 4\n" | awk 'BEGIN {sum = 0; print "begin";} {sum += $1;} END {print "=="; print sum}'
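A compact sketch of NR, NF and an END block, with printf supplying the input (the sample lines are arbitrary):

```shell
# Line number, field count and first field of every line.
fields=$(printf 'line1 f2 f3\nline2\n' | awk '{print NR ":" NF ":" $1}')

# Accumulate the first field and report it once all input is read.
total=$(printf '1\n2\n3\n4\n' | awk '{sum += $1} END {print sum}')

# In the END block NR holds the number of lines read.
lines=$(printf 'a\nb\nc\n' | awk 'END {print NR}')

printf '%s\n' "$fields" "$total" "$lines"
```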
Passing external variables
var=1000
echo | awk '{print vara}' vara=$var  # input from stdin
awk '{print vara}' vara=$var file  # input from file
Use patterns to filter the lines awk processes, for instance a condition on the line number NR.
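A few hedged illustrations of pattern filtering in awk. The specific conditions (NR < 3, the word "linux") are assumptions chosen for the example, not commands taken from the article.

```shell
# A condition on NR with no action prints the matching lines.
first_two=$(seq 5 | awk 'NR < 3')

# A regular expression keeps only the lines that match it.
matching=$(printf 'linux box\nmac box\n' | awk '/linux/')

# Negating the expression keeps the lines that do not match.
excluding=$(printf 'linux box\nmac box\n' | awk '!/linux/')

printf '%s\n' "$first_two" "$matching" "$excluding"
```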
Setting the delimiter: use -F to set the field delimiter (the default is whitespace)
awk -F: '{print $NF}' /etc/passwd
Reading command output: use getline to read the output of an external shell command into the variable cmdout
echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'
Using loops in awk
for (i in array) { print array[i]; }
C-style counting loops work as well, e.g.:
for (i = 1; i <= NF; i++) { print $i; }
Print lines in reverse order (an implementation of the tac command):
seq 9 | awk '{lifo[NR] = $0; lno = NR} END {for (; lno > 0; lno--) {print lifo[lno];}}'
Implementing the head and tail commands in awk
head: awk 'NR …' filename
tail: awk '{buffer[NR] = $0;} END {for …
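One possible way to sketch head, tail and tac in awk. The ring-buffer approach and the names n, buf and lifo are assumptions for illustration, not necessarily the article's own code.

```shell
# head: print the first 3 lines.
head3=$(seq 9 | awk 'NR <= 3')

# tail: keep the last n lines in a ring buffer indexed by NR % n, then replay them in order.
tail3=$(seq 9 | awk -v n=3 '{buf[NR % n] = $0}
    END {for (i = NR - n + 1; i <= NR; i++) if (i > 0) print buf[i % n]}')

# tac: store every line, then walk the array backwards.
reversed=$(seq 3 | awk '{lifo[NR] = $0} END {for (i = NR; i > 0; i--) print lifo[i]}')

printf '%s\n' "$head3" "$tail3" "$reversed"
```

The ring buffer keeps memory use bounded by n lines, whereas the tac version has to hold the whole input.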
Print a specific column
With awk:
ls -lrt | awk '{print $6}'
With cut:
ls -lrt | tr -s ' ' | cut -d ' ' -f6  # cut splits on tabs by default, so squeeze the spaces and pass -d ' '
Print a specific region of text
By line number:
seq 100 | awk 'NR==4,NR==6 {print}'
By text: print the text between start_pattern and end_pattern
awk '/start_pattern/, /end_pattern/' filename
E.g.: awk '/mai.*mail/,/news.*news/' /etc/passwd
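Both kinds of range can be checked on generated input. In this sketch the START and END markers are hypothetical stand-ins for start_pattern and end_pattern.

```shell
# Range by line number: lines 4 through 6.
by_number=$(seq 100 | awk 'NR==4,NR==6')

# Range by pattern: from the first line matching START through the next line matching END.
by_pattern=$(printf 'a\nSTART\nb\nc\nEND\nd\n' | awk '/START/,/END/')

printf '%s\n' "$by_number" "$by_pattern"
```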
Commonly used awk built-in functions
index(string, search_string): returns the position at which search_string appears in string
sub(regex, replacement_str, string): replaces the first regex match in string with replacement_str
match(string, regex): checks whether the regular expression matches the string
length(string): returns the length of the string
echo | awk '{"grep root /etc/passwd" | getline cmdout; print length(cmdout)}'
printf: similar to printf in C; formats the output
seq 10 | awk '{printf "->%4s\n", $1}'
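The built-in functions and printf can be exercised on literal strings; the inputs below are invented for the sketch.

```shell
pos=$(echo "hello world" | awk '{print index($0, "world")}')   # 1-based position of the substring
len=$(echo "hello" | awk '{print length($0)}')                  # string length
subbed=$(echo "aaa" | awk '{sub(/a/, "b"); print}')             # only the first match is replaced
found=$(echo "abc123" | awk '{print match($0, /[0-9]+/)}')      # position of the match, 0 if none
padded=$(echo 7 | awk '{printf "->%4s\n", $1}')                 # right-align in a 4-character field

printf '%s\n' "$pos" "$len" "$subbed" "$found" "$padded"
```

When sub is called without a third argument it operates on $0, the current line.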
Iterating over the lines, words and characters in a file
1. Iterate over each line in the file
while loop method:
while read line; do echo $line; done < file.txt
The same loop in a subshell fed by a pipe:
cat file.txt | (while read line; do echo $line; done)
awk method:
cat file.txt | awk '{print}'
2. Iterate over each word in a line
for word in $line; do echo $word; done
3. Iterate over each character
${word:start_pos:num_of_chars} extracts characters from the string (text slicing); ${#word} returns the length of the variable word
for ((i = 0; i < ${#word}; i++)); do echo ${word:i:1}; done
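The three levels of iteration can be combined into one sketch. It assumes bash, since the substring syntax and the C-style for loop are not part of POSIX sh, and the sample file contents are made up.

```shell
file=$(mktemp)
printf 'one two\nthree\n' > "$file"

lines=0; words=0
while read -r line; do
    lines=$((lines + 1))
    for word in $line; do      # unquoted on purpose: split the line on whitespace
        words=$((words + 1))
    done
done < "$file"

# Walk a word character by character with bash substring expansion.
word="abc"; chars=""
for ((i = 0; i < ${#word}; i++)); do
    chars="$chars${word:i:1}."
done

echo "lines=$lines words=$words chars=$chars"
rm "$file"
```

Redirecting the file into the loop, rather than piping cat into it, keeps the loop in the current shell, so the counters are still set after the loop ends.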
Thank you for reading! That concludes this article on the common tools for processing text on Linux. I hope it is of some help and lets you learn something new. If you found the article useful, feel free to share it so that more people can see it.