
Text processing tools and regular expressions

Updated 2025-03-26. From: SLTechnology News&Howtos


File viewing

cat

nl

tac

rev

Common options for cat

-E: display $ at the end of each line

-n: number every displayed line

-A: show all control characters

-b: number only non-blank lines

-s: squeeze consecutive blank lines into one

Example:

cat -E: display $ at the end of each line

cat -A: show all control characters

cat -n: number every displayed line, including blank lines

cat -b: number only non-blank lines

cat -s: squeeze consecutive blank lines into one (adjacent blank lines become a single blank line)

tac

Display the file's lines in reverse order, last line first

nl

Number the lines; same effect as cat -b

rev

Reverse the characters within each line of the file
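The options above can be tried on a small throwaway file. This is a minimal sketch, assuming GNU coreutils and util-linux (cat -A, tac, and rev are not available everywhere); the file created with mktemp and the variable names are only for illustration.

```shell
# Create a small sample file containing two consecutive blank lines.
sample=$(mktemp)
printf 'alpha\n\n\nbeta\n' > "$sample"

cat -n "$sample"    # numbers every line, including the blank ones
cat -b "$sample"    # numbers only the non-blank lines
cat -s "$sample"    # squeezes the two blank lines into one
cat -A "$sample"    # marks each line end with $ (GNU cat)

# Capture a few results so they can be inspected afterwards.
squeezed_lines=$(cat -s "$sample" | wc -l | tr -d ' ')   # lines left after squeezing
reversed_order=$(tac "$sample" | head -n 1)              # last line comes first
reversed_chars=$(printf 'abc\n' | rev)                   # characters reversed within the line
numbered=$(nl "$sample" | grep -c '[0-9]')               # nl numbers only non-blank lines

rm -f "$sample"
```

Here cat -s leaves three lines, tac prints beta first, and rev turns abc into cba.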

View the contents of non-text files

hexdump

od

xxd
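As a rough illustration of dumping raw bytes, the sketch below uses od only, because it ships with coreutils; hexdump -C and xxd produce comparable dumps but may not be installed on a minimal system. The sample string is an assumption chosen for the example.

```shell
# Dump the bytes of a short string that contains a tab and a newline.
printf 'A\tB\n' | od -c          # one column per byte, control characters shown as escapes
printf 'A\tB\n' | od -An -tx1    # hex bytes only, without the offset column

# Keep a compact hex string for comparison.
hex=$(printf 'A\tB\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"
```

The four bytes are 41 (A), 09 (tab), 42 (B), and 0a (newline).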

View file contents page by page

more

less

Example:

more: view a file one page at a time

more -d: display prompts for paging and quitting

less: view a file or STDIN output page by page

Display the beginning or end of a text

head

tail

tailf

head examples

The first ten lines are displayed by default

head -c N: display the first N bytes of the text

Example: take the first ten bytes of the /etc/passwd file

head -n N (the n can be omitted, as in head -N): display the first N lines of the text

Example: take the first ten lines of the /etc/passwd file

tail examples

The last ten lines of the text are displayed by default

tail -n N: display the last N lines of the file

tail -f: follow the file and display newly appended content

tail -F: follow the file by name, so tracking continues if the file is renamed or recreated
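A short sketch of the non-interactive head and tail options, run against a generated file of 20 numbered lines rather than /etc/passwd so the results are predictable. The temporary file and variable names are assumptions for illustration; tail -f and tail -F are left out because they run until interrupted.

```shell
sample=$(mktemp)
seq 1 20 > "$sample"     # lines 1..20, one number per line

default_count=$(head "$sample" | wc -l | tr -d ' ')       # head prints 10 lines by default
third_line=$(head -n 3 "$sample" | tail -n 1)             # last of the first three lines
second_to_last=$(tail -n 2 "$sample" | head -n 1)         # first of the last two lines
first_bytes=$(printf 'abcdefghij\n' | head -c 4)          # first four bytes of the input

rm -f "$sample"
```

Combining head and tail in this way is a common idiom for picking out a single line by position.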

Practice

Extract the local IPv4 address from the output of the ifconfig "network interface name" command.

cut: extract text by column

-d DELIMITER: specify the field delimiter; the default is TAB

-f FIELDS:

#: the #th field

#,#[,#]: multiple discrete fields, for example 1,3,6

#-#: a range of consecutive fields, for example 1-6

Mixed use: 1-3,7

-c: cut by character position

--output-delimiter=STRING: specify the output delimiter

Display the specified columns of a file or of STDIN data

cut -d: -f1 /etc/passwd

cat /etc/passwd | cut -d: -f7

cut -c2-5 /usr/share/dict/words

Example

cut -d -f example: take columns 1, 3, and 4 with a colon as the delimiter

cut -c: cut by character position
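The field and character selections above can be checked on a single passwd-style line. This is a sketch under assumptions: the user alice and her entry are made up for the example, and --output-delimiter is a GNU cut option.

```shell
# A made-up /etc/passwd-style entry with seven colon-separated fields.
line='alice:x:1001:1001:Alice Example:/home/alice:/bin/bash'

name=$(printf '%s\n' "$line" | cut -d: -f1)          # field 1 only
picked=$(printf '%s\n' "$line" | cut -d: -f1,3,4)    # discrete fields 1, 3, and 4
ranged=$(printf '%s\n' "$line" | cut -d: -f1-3,7)    # range 1-3 mixed with field 7
spaced=$(printf '%s\n' "$line" | cut -d: -f1,7 --output-delimiter=' ')   # GNU-only option
chars=$(printf '%s\n' "$line" | cut -c2-5)           # characters 2 through 5

printf '%s\n' "$name" "$picked" "$ranged" "$spaced" "$chars"
```

Note that cut keeps the input delimiter in its output unless --output-delimiter is given.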

Practice

Extract the IP address

Extract the version number

Extract the disk space utilization

Find the permissions of /tmp and display them in numeric form

paste: merge files

-d DELIMITER: specify the delimiter; the default is TAB

-s: join all lines of each file into a single line

Example

paste -s:

Example: merge the a.log and b.log files, displaying each file's lines on one line
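A minimal sketch of both paste modes. Two temporary files stand in for a.log and b.log; their contents are assumptions chosen to make the output easy to read.

```shell
a=$(mktemp)
b=$(mktemp)
printf '1\n2\n3\n' > "$a"
printf 'x\ny\nz\n' > "$b"

# Default mode: line N of each file is joined side by side.
side_by_side=$(paste -d: "$a" "$b")

# -s mode: each file is collapsed into one line of its own.
serial=$(paste -s -d, "$a" "$b")

printf '%s\n' "$side_by_side" "$serial"
rm -f "$a" "$b"
```

The first command prints 1:x, 2:y, 3:z on three lines; the second prints 1,2,3 and then x,y,z.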

Wc , a tool for analyzing text

Text data statistics

Sort 

Organize the text

Diff and patch

Compare files

.

wc

Counts the total number of lines, words, bytes, and characters in a file

It can count data from files or from STDIN

wc story.txt

39 237 1901 story.txt

The columns are: line count, word count, byte count

Common options

-l: count lines only

-w: count words only

-c: count bytes only

-m: count characters only

-L: display the length of the longest line in the file

Example

wc -l: display only the number of lines in the file

wc -w: count only the total number of words

wc -L: display the length of the longest line in the a.log file

wc -m: count only the total number of characters

wc -c: count only the total number of bytes
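The counts can be verified by hand on a three-line file. This sketch assumes GNU wc (the -L option is a GNU extension) and ASCII-only content, so the byte and character counts agree; the file contents are made up for the example.

```shell
f=$(mktemp)
printf 'one two\nthree\nfour five six\n' > "$f"

wc "$f"    # lines, words, bytes, then the file name

# Reading from STDIN avoids the trailing file name in the output.
lines=$(wc -l < "$f" | tr -d ' ')
words=$(wc -w < "$f" | tr -d ' ')
bytes=$(wc -c < "$f" | tr -d ' ')
chars=$(wc -m < "$f" | tr -d ' ')
longest=$(wc -L < "$f" | tr -d ' ')    # length of "four five six"

rm -f "$f"
```

With multibyte text (for example UTF-8 Chinese), -c and -m would report different numbers.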

sort: text sorting

Writes the sorted text to STDOUT without changing the original file

Common options

-r: sort in reverse (descending) order

-R: random sort

-n: sort by numeric value

-f: fold case, ignoring the case of characters when comparing

-u: (unique) remove duplicate lines from the output

-t c: use c as the field delimiter

-k #: sort by field #, with fields separated by the character c; can be used multiple times

Example

sort -nr: sort numbers from largest to smallest

sort -R: random sort

Example: randomly sort the numbers from 1 to 55

sort -u: remove duplicate lines

Example: remove the duplicate lines of the a.log file from the output
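The sketch below exercises the main sort options on made-up data. The passwd-style entries (root, daemon, alice, bob) are assumptions for illustration, and sort -R is a GNU option.

```shell
users=$(mktemp)
cat > "$users" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1001:1001:Alice:/home/alice:/bin/bash
bob:x:1000:1000:Bob:/home/bob:/bin/sh
EOF

# Sort on field 3 (the UID) numerically, largest first, then keep name, UID, and shell.
top=$(sort -t: -k3,3nr "$users" | head -n 1 | cut -d: -f1,3,7)

# Numeric ordering versus the default character-by-character ordering.
numeric=$(printf '10\n9\n100\n' | sort -n | paste -sd' ' -)
lexical=$(printf '10\n9\n100\n' | sort | paste -sd' ' -)

# -u drops duplicate lines; -R shuffles but keeps every line.
unique=$(printf 'b\na\nb\na\n' | sort -u | paste -sd' ' -)
shuffled_count=$(seq 1 55 | sort -R | wc -l | tr -d ' ')

rm -f "$users"
```

Without -n, the line 9 sorts after 100 because the comparison proceeds character by character.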

Practice

Find the highest partition space utilization percentage

Find the user name, UID, and shell type of the user with the largest UID

uniq: remove adjacent duplicate lines from the input

Common options

-c: display the number of times each line is repeated

-d: display only the duplicated lines

-u: display only the lines that are not repeated

Note: only consecutive, identical lines count as duplicates.

Often used together with the sort command

sort userlist.txt | uniq -c

Example

Example: view the a.log file without showing adjacent duplicate lines

uniq -c: display the number of times each line is repeated

Example: check how many times each line of the a.log file is repeated

uniq -d: display only adjacent duplicate lines

Example: check the duplicate lines in the a.log file

uniq -u: display only the lines that are not repeated

Example: look at the lines in the a.log file that are not duplicated
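Because uniq only compares neighbouring lines, its result depends on whether the input was sorted first. A small sketch with made-up input shows the difference; awk is used here only to strip the padding that uniq -c puts in front of each count.

```shell
# Input: a, a, b, a (the last "a" is not adjacent to the first two).
adjacent_only=$(printf 'a\na\nb\na\n' | uniq | paste -sd' ' -)
sorted_unique=$(printf 'a\na\nb\na\n' | sort | uniq | paste -sd' ' -)

duplicated=$(printf 'a\na\nb\na\n' | sort | uniq -d | paste -sd' ' -)
singles=$(printf 'a\na\nb\na\n' | sort | uniq -u | paste -sd' ' -)

# uniq -c prints "count line"; reformat it as line=count.
counts=$(printf 'a\na\nb\na\n' | sort | uniq -c | awk '{print $2 "=" $1}' | paste -sd' ' -)

printf '%s\n' "$adjacent_only" "$sorted_unique" "$duplicated" "$singles" "$counts"
```

Without the sort step, the trailing a survives because it is not next to the other two.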

Practice

Count the IP addresses in the access log, and take out the top three with the most visits
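One possible approach to this exercise is sketched below. The log format is an assumption (client IP as the first space-separated field), and the entries use documentation-range addresses invented for the example; a real access log may need a different delimiter or field number.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - "GET / HTTP/1.1" 200
192.0.2.20 - - "GET /a HTTP/1.1" 200
192.0.2.10 - - "GET /b HTTP/1.1" 404
198.51.100.7 - - "GET / HTTP/1.1" 200
192.0.2.10 - - "GET /c HTTP/1.1" 200
192.0.2.20 - - "GET /d HTTP/1.1" 200
203.0.113.5 - - "GET / HTTP/1.1" 200
192.0.2.10 - - "GET /e HTTP/1.1" 200
192.0.2.20 - - "GET /f HTTP/1.1" 200
198.51.100.7 - - "GET /g HTTP/1.1" 200
EOF

# Extract the IP column, group identical IPs, count them, and rank by count.
top3=$(cut -d' ' -f1 "$log" | sort | uniq -c | sort -nr | head -n 3)
printf '%s\n' "$top3"

# Keep only the addresses, in ranked order.
top_ips=$(printf '%s\n' "$top3" | awk '{print $2}' | paste -sd' ' -)

rm -f "$log"
```

The first sort is what makes uniq -c count correctly; the second sort ranks the counts.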

diff: compare the differences between two files

The output of the diff command can be saved in a file, known as a "patch"

-u: output a "unified" diff format file, which is best suited for patch files

patch

Copies the changes recorded in a patch file into other files (use with caution)

-b: automatically back up the files being changed

diff -u foo.conf foo2.conf > foo.patch

patch -b foo.conf foo.patch

Example

Example: view the differences between the a.log and aa.log files

diff -u: show the differences in more detail, in unified format
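The diff and patch round trip can be sketched in a scratch directory. The contents of foo.conf and foo2.conf are made up for illustration, and the patch step is guarded because patch is not installed on every system; the .orig backup name is what GNU patch uses by default.

```shell
dir=$(mktemp -d)
printf 'port=80\nhost=localhost\n' > "$dir/foo.conf"
printf 'port=8080\nhost=localhost\n' > "$dir/foo2.conf"

# diff exits with status 1 when the files differ, so do not treat that as an error.
diff -u "$dir/foo.conf" "$dir/foo2.conf" > "$dir/foo.patch" || true
added=$(grep -c '^+port=8080$' "$dir/foo.patch")    # the new line is marked with +

patched=no
backup=no
if command -v patch >/dev/null 2>&1; then
    # Apply the patch to foo.conf; -b keeps the previous version as a backup.
    patch -b "$dir/foo.conf" "$dir/foo.patch" >/dev/null
    cmp -s "$dir/foo.conf" "$dir/foo2.conf" && patched=yes
    [ -f "$dir/foo.conf.orig" ] && backup=yes
fi

rm -rf "$dir"
```

After patching, foo.conf has the same content as foo2.conf and the earlier version remains available in the backup file.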

The three musketeers of text processing

grep: text filtering tool (based on a pattern)

sed: stream editor, a text editing tool

awk: text report generator; implemented on Linux as gawk

grep

Function: a text search tool that checks the target text line by line against the pattern specified by the user and prints the matching lines

Pattern: a filter condition written with regular expression characters and text characters

Common options

--color=auto: highlight the matching text in color

-m #: stop after # matches

-v: display the lines that do not match the pattern

-i: ignore character case

-n: display the line number of each match

-c: count the number of matching lines

-o: display only the matching string

-q: quiet mode; output nothing

-A #: after; also display the # lines after each match

-B #: before; also display the # lines before each match

-C #: context; also display the # lines before and after each match

-e: implements a logical OR between multiple patterns, for example grep -e 'cat' -e 'dog' file

-w: match whole words only

-E: use ERE

-F: equivalent to fgrep; does not support regular expressions

-f file: read the patterns from the given file

Example

Find the lines that contain root in /etc/passwd

grep -m

Example: filter the first two lines containing bash in /etc/passwd

grep -v

Example: display the lines in the /etc/passwd file that do not match bash

grep -i: ignore case

grep -n

Example: display the lines in the /etc/passwd file that match root, with their line numbers

grep -c

Example: count the number of lines matching root in the /etc/passwd file

grep -o

Example: display only the bash strings matched in the /etc/passwd file

grep -q

Example: no information is output

grep -A

Example: display the line matching root and the three lines after it

grep -B

Example: display the line matching root and the three lines before it

grep -C

Example: display the line matching root with the three lines before and after it

grep -e

Example: display the lines in the /etc/passwd file that match root or bash

grep -f

Example: display the lines that match any pattern listed in the greo.log file
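The grep options can be tried on a small passwd-style file so that the results do not depend on the local /etc/passwd. The four entries below are made up for illustration.

```shell
f=$(mktemp)
cat > "$f" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1001:1001:Alice:/home/alice:/bin/bash
bob:x:1000:1000:Bob:/home/bob:/bin/sh
EOF

count_bash=$(grep -c 'bash' "$f")                                 # number of matching lines
first_bash=$(grep -m 1 'bash' "$f" | cut -d: -f1)                 # stop after the first match
not_bash=$(grep -v 'bash' "$f" | cut -d: -f1 | paste -sd' ' -)    # lines that do not match
line_no=$(grep -n 'alice' "$f" | cut -d: -f1)                     # line number of the match
only_match=$(grep -o 'bash' "$f" | paste -sd' ' -)                # matched text only
either=$(grep -c -e 'root' -e 'bob' "$f")                         # root OR bob
ignore_case=$(grep -ci 'ALICE' "$f")                              # case-insensitive match
with_after=$(grep -A 1 'daemon' "$f" | wc -l | tr -d ' ')         # match plus one following line

# -q prints nothing; only the exit status says whether a match was found.
if grep -q 'root' "$f"; then found=yes; else found=no; fi

rm -f "$f"
```

The -q form is the usual choice inside if statements in scripts, where only the exit status matters.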

Regular expression

A pattern written with special characters and text characters; some of them (metacharacters) do not stand for their literal meaning but instead express control or wildcard functions.

They are divided into two categories

Basic regular expressions: BRE; used by grep, vim

Extended regular expressions: ERE; used by grep -E, egrep, nginx

Practice

Extract the value of the highest partition utilization

Example

Example: search for lines ending in bash
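Both the end-of-line example and the partition exercise can be sketched with basic regular expressions. The df-style text below is an invented sample, not real df output, so the column layout is an assumption.

```shell
# $ anchors the pattern to the end of the line, so "bashful" does not match.
ends_in_bash=$(printf '%s\n' 'root:/bin/bash' 'bob:/bin/sh' 'bashful:/bin/zsh' | grep 'bash$')

# Invented df-style output for the exercise.
usage='Filesystem Use% Mounted
/dev/sda1 42% /
/dev/sda2 7% /boot
tmpfs 91% /run'

# Keep only "digits followed by %", drop the % sign, and take the largest number.
max=$(printf '%s\n' "$usage" | grep -o '[0-9][0-9]*%' | tr -d '%' | sort -nr | head -n 1)

printf '%s\n' "$ends_in_bash" "$max"
```

In BRE the pattern [0-9][0-9]* stands in for "one or more digits", which ERE writes as [0-9]+.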

egrep and extended regular expressions

egrep = grep -E

Metacharacters of extended regular expressions:

Character matching:

. any single character

[] any single character within the specified range

[^] any single character outside the specified range

Repetition matching:

* match the preceding character any number of times

? 0 or 1 time

+ one or more times

{m} match exactly m times

{m,n} at least m and at most n times

Position anchors:

^ beginning of the line

$ end of the line

\<, \b beginning of a word

\>, \b end of a word

Grouping:

()

Back-references: \1, \2, ...

Or:

a|b a or b

C|cat C or cat

(C|c)at Cat or cat
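The metacharacters can be checked against a fixed word list. This sketch assumes GNU grep: back-references and the word anchors in ERE are GNU extensions rather than part of the POSIX definition. The helper function input and its word list are made up for the example.

```shell
# Hypothetical helper that prints the same test lines each time.
input() {
    printf '%s\n' 'cat' 'Cat' 'bat' 'color' 'colour' 'ab' 'aab' 'aaab' 'hello hello' 'the cat sat'
}

alternation=$(input | grep -E '^(C|c)at$' | paste -sd' ' -)    # grouping plus "or"
optional=$(input | grep -E '^colou?r$' | paste -sd' ' -)       # ? means 0 or 1 time
bounded=$(input | grep -E '^a{2,3}b$' | paste -sd' ' -)        # {m,n} repetition
one_or_more=$(input | grep -cE '^a+b$')                        # + means one or more times
backref=$(input | grep -E '(hello) \1')                        # \1 repeats the first group
whole_word=$(input | grep -cE '\<cat\>')                       # word anchors on both sides

printf '%s\n' "$alternation" "$optional" "$bounded" "$one_or_more" "$backref" "$whole_word"
```

Anchoring with ^ and $ keeps longer lines such as "the cat sat" out of the first result, while the word anchors let it count in the last one.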
