2025-01-20 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
grep command
NAME
grep, egrep, fgrep - print lines that match a given pattern
SYNOPSIS
grep [options] PATTERN [FILE...]
grep [options] [-e PATTERN | -f FILE] [FILE...]
DESCRIPTION
grep searches the input files named FILE (or standard input, if no file name is given or if the file name is -) for lines that contain a match for the given PATTERN. By default, grep prints the matching lines.
Two variant programs, egrep and fgrep, are also available. egrep is the same as grep -E. fgrep is the same as grep -F.
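The basic invocations described above can be sketched as follows. The file name words.txt and its contents are made up for this illustration only.

```shell
# Hypothetical sample file, created only for this illustration.
tmp=$(mktemp -d)
printf 'alpha\nbeta\na.c\nabc\n' > "$tmp/words.txt"

# Search a named file for a pattern.
plain=$(grep 'bet' "$tmp/words.txt")

# Search standard input by giving - as the file name.
from_stdin=$(printf 'one\ntwo\n' | grep 'tw' -)

# grep -E (same as egrep): | means alternation.
extended=$(grep -E 'alpha|beta' "$tmp/words.txt")

# grep -F (same as fgrep): the dot is a literal character here,
# so the line abc is not selected.
fixed=$(grep -F 'a.c' "$tmp/words.txt")

printf '%s\n' "$plain" "$from_stdin" "$extended" "$fixed"
rm -rf "$tmp"
```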
OPTIONS
-A NUM, --after-context=NUM
Print NUM lines of trailing context after each matching line. A line containing -- is printed between contiguous groups of matches.
-a, --text
Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
-B NUM, --before-context=NUM
Print NUM lines of leading context before each matching line. A line containing -- is printed between contiguous groups of matches.
-C NUM, --context=NUM
Print NUM lines of context before and after each matching line. A line containing -- is printed between contiguous groups of matches.
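A small sketch of the three context options, assuming a made-up nine-line file ctx.txt in which the word hit appears on lines 3 and 8. Because the two groups of output are not contiguous, grep separates them with a -- line.

```shell
# Hypothetical input: "hit" appears on lines 3 and 8.
tmp=$(mktemp -d)
printf '1\n2\nhit\n4\n5\n6\n7\nhit\n9\n' > "$tmp/ctx.txt"

# One line of trailing context after each match.
after=$(grep -A 1 'hit' "$tmp/ctx.txt")

# One line of leading context before each match.
before=$(grep -B 1 'hit' "$tmp/ctx.txt")

# One line of context on both sides.
both=$(grep -C 1 'hit' "$tmp/ctx.txt")

printf '%s\n' "$after"
rm -rf "$tmp"
```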
-b, --byte-offset
Print the byte offset within the input file before each line of output.
--binary-files=TYPE
If the first few bytes of a file indicate that the file contains binary data, assume that the file is of type TYPE. By default, TYPE is binary, and grep normally outputs either a one-line message saying that a binary file matches, or no message if there is no match. If TYPE is without-match, grep assumes that a binary file does not match; this is equivalent to the -I option. If TYPE is text, grep processes a binary file as if it were text; this is equivalent to the -a option. Warning: grep --binary-files=text might output binary garbage, which can have nasty side effects if the output is a terminal and the terminal driver interprets some of it as commands.
--colour[=WHEN], --color[=WHEN]
Highlight the matching text with the marker specified in the GREP_COLOR environment variable. WHEN may be `never', `always', or `auto'.
-c, --count
Suppress normal output; instead print a count of matching lines for each input file. With the -v, --invert-match option (see below), count non-matching lines.
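As a quick illustration of -c with and without -v, assuming a made-up three-line file log.txt:

```shell
# Hypothetical log file with two lines that contain "error".
tmp=$(mktemp -d)
printf 'error: a\nok\nerror: b\n' > "$tmp/log.txt"

# Count of lines that match.
matching=$(grep -c 'error' "$tmp/log.txt")

# With -v, count of lines that do not match.
non_matching=$(grep -c -v 'error' "$tmp/log.txt")

printf 'matching=%s non_matching=%s\n' "$matching" "$non_matching"
rm -rf "$tmp"
```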
-D ACTION, --devices=ACTION
If an input file is a device, FIFO, or socket, use ACTION to process it. By default, ACTION is read, which means that devices are read just as if they were ordinary files. If ACTION is skip, devices are silently skipped.
-d ACTION, --directories=ACTION
If an input file is a directory, use ACTION to process it. By default, ACTION is read, which means that directories are read just as if they were ordinary files. If ACTION is skip, directories are silently skipped. If ACTION is recurse, grep reads all files under each directory, recursively; this is equivalent to the -r option.
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (see below).
-e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern; useful to protect patterns beginning with -.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
-P, --perl-regexp
Interpret PATTERN as a Perl regular expression.
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. An empty file contains zero patterns and therefore matches nothing.
-G, --basic-regexp
Interpret PATTERN as a basic regular expression (see below). This is the default.
-H, --with-filename
Print the file name for each match.
-h, --no-filename
Suppress the prefixing of file names on output when multiple files are searched.
--help
Output a brief help message.
-I
Process a binary file as if it did not contain matching data; this is equivalent to the --binary-files=without-match option.
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files.
-L, --files-without-match
Suppress normal output; instead print the name of each input file from which no output would normally have been printed. The scanning of each file stops on the first match.
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning of each file stops on the first match.
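The two file-listing options are complementary. A minimal sketch, assuming two made-up files a.txt and b.txt of which only the first contains TODO:

```shell
# Hypothetical files: only a.txt contains the text TODO.
tmp=$(mktemp -d)
printf 'TODO: fix\n' > "$tmp/a.txt"
printf 'done\n' > "$tmp/b.txt"

# -l prints the names of files that contain a match.
with=$(cd "$tmp" && grep -l 'TODO' a.txt b.txt)

# -L prints the names of files that contain no match.
without=$(cd "$tmp" && grep -L 'TODO' a.txt b.txt)

printf 'with=%s without=%s\n' "$with" "$without"
rm -rf "$tmp"
```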
-m NUM, --max-count=NUM
Stop reading a file after NUM matching lines. If the input is standard input from a regular file, and NUM matching lines are output, grep ensures that the standard input is positioned just after the last matching line before exiting, regardless of the presence of trailing context lines. This enables a calling process to resume a search. When grep stops after NUM matching lines, it outputs any trailing context lines. When the -c or --count option is also used, grep does not output a count greater than NUM. When the -v or --invert-match option is also used, grep stops after outputting NUM non-matching lines.
--mmap
If possible, use the mmap(2) system call to read input, instead of the default read(2) system call. In some situations, --mmap yields better performance. However, --mmap can cause undefined behavior (including core dumps) if an input file changes size while grep is operating, or if an I/O error occurs.
-n, --line-number
Prefix each line of output with the line number within its input file.
-o, --only-matching
Show only the part of a matching line that matches PATTERN.
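The difference between -n and -o is easiest to see side by side. This is a small sketch with a made-up file ids.txt:

```shell
# Hypothetical input with id=NUMBER fields on lines 1 and 3.
tmp=$(mktemp -d)
printf 'id=17 name=x\nno ids here\nid=42 id=7\n' > "$tmp/ids.txt"

# -n prefixes each selected line with its line number.
numbered=$(grep -n 'id=' "$tmp/ids.txt")

# -o prints each matched part on its own line, not the whole line.
only=$(grep -o 'id=[0-9]*' "$tmp/ids.txt")

printf '%s\n' "$numbered" "$only"
rm -rf "$tmp"
```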
--label=LABEL
Display input actually coming from standard input as input coming from file LABEL. This is especially useful for tools like zgrep, for example: gzip -cd foo.gz | grep --label=foo something
--line-buffered
Use line buffering; this can incur a performance penalty.
-q, --quiet, --silent
Quiet; do not write anything to standard output. Exit immediately with zero status if any match is found, even if an error was detected. Also see the -s or --no-messages option.
-R, -r, --recursive
Read all files under each directory, recursively; this is equivalent to the -d recurse option.
--include=PATTERN
Recurse in directories, searching only files that match PATTERN.
--exclude=PATTERN
Recurse in directories, skipping files that match PATTERN.
-s, --no-messages
Suppress error messages about nonexistent or unreadable files. Portability note: unlike GNU grep, traditional grep did not conform to POSIX.2, because traditional grep lacked a -q option and its -s option behaved like GNU grep's -q option. Shell scripts intended to be portable to traditional grep should avoid both -q and -s and should redirect output to /dev/null instead.
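The portability advice above can be sketched like this: rather than relying on -q or -s, the script discards both output streams and tests only the exit status. The file passwd.sample is a made-up stand-in.

```shell
# Hypothetical sample file standing in for a real data file.
tmp=$(mktemp -d)
printf 'root:x:0:0\n' > "$tmp/passwd.sample"

# Portable "quiet" test: throw away stdout and stderr, keep the status.
if grep 'root' "$tmp/passwd.sample" > /dev/null 2>&1; then
    found=yes
else
    found=no
fi

if grep 'nobody' "$tmp/passwd.sample" > /dev/null 2>&1; then
    missing=yes
else
    missing=no
fi

printf 'found=%s missing=%s\n' "$found" "$missing"
rm -rf "$tmp"
```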
-U, --binary
Treat the file(s) as binary. By default, under MS-DOS and MS-Windows, grep guesses the file type by looking at the contents of the first 32 kB read from the file. If grep decides the file is a text file, it strips the CR characters from the original file contents (to make regular expressions with ^ and $ work correctly). Specifying -U overrules this guesswork, causing all files to be read and passed to the matching mechanism verbatim; if the file is a text file with CR/LF pairs at the end of each line, this will cause some regular expressions to fail. This option has no effect on platforms other than MS-DOS and MS-Windows.
-u, --unix-byte-offsets
Report Unix-style byte offsets. This switch causes grep to report byte offsets as if the file were a Unix-style text file, that is, with CR characters stripped off. This produces results identical to running grep on a Unix machine. This option has no effect unless the -b option is also used; it has no effect on platforms other than MS-DOS and MS-Windows.
-V, --version
Print the version number of grep to standard error. This version number should be included in all bug reports (see below).
-v, --invert-match
Invert the sense of matching, to select non-matching lines.
-w, --word-regexp
Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line or be preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore.
-x, --line-regexp
Select only those matches that exactly match the whole line.
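A short sketch contrasting -w and -x on the same made-up input; the file name animals.txt is only an assumption for the example.

```shell
# Hypothetical input: "cat" as a whole line, inside longer words, and as a word.
tmp=$(mktemp -d)
printf 'cat\ncatalog\nmy cat sat\nbobcat\n' > "$tmp/animals.txt"

# -w: the match must be delimited by non-word characters or line edges.
word=$(grep -w 'cat' "$tmp/animals.txt")

# -x: the match must cover the entire line.
line=$(grep -x 'cat' "$tmp/animals.txt")

printf '%s\n' "$word" "$line"
rm -rf "$tmp"
```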
-y
Obsolete synonym for -i.
-Z, --null
Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. It can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.
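A minimal sketch of pairing NUL-terminated file names with xargs -0, assuming a made-up file whose name contains a space. The long form --null is used here; it is the same option as -Z in GNU grep.

```shell
# Hypothetical files; one name contains a space on purpose.
tmp=$(mktemp -d)
printf 'needle\n' > "$tmp/with space.txt"
printf 'hay\n' > "$tmp/plain.txt"

# -l lists matching files, --null ends each name with a NUL byte,
# and xargs -0 splits on NUL so the space in the name is preserved.
matched=$(grep -l --null 'needle' "$tmp"/*.txt | xargs -0 -n 1 basename)

printf '%s\n' "$matched"
rm -rf "$tmp"
```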
REGULAR EXPRESSIONS
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.
grep understands two different versions of regular expression syntax: "basic" and "extended". In GNU grep, there is no difference in available functionality between the two syntaxes. In other implementations, basic regular expressions are less powerful. The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards.
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash.
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^, then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
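To make a range expression behave the traditional way regardless of the user's locale, LC_ALL can be set for the single grep invocation. A small sketch with made-up input:

```shell
# In the C locale, [a-d] means exactly a, b, c, d, so the uppercase B
# and the letter e are not selected.
ranged=$(printf 'a\nB\nc\ne\n' | LC_ALL=C grep '^[a-d]$')

printf '%s\n' "$ranged"
```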
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self-explanatory, and they are [:alnum:] (letters and digits), [:alpha:] (letters), [:cntrl:] (control characters), [:digit:] (digits), [:graph:] (visible characters, excluding space), [:lower:] (lowercase letters), [:print:] (printable characters), [:punct:] (punctuation), [:space:] (whitespace), [:upper:] (uppercase letters), and [:xdigit:] (hexadecimal digits). For example, [[:alnum:]] means [0-9A-Za-z], except that the latter form depends on the C locale and the ASCII character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.)
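The double brackets are the part that is easy to get wrong: the inner [:digit:] is the class name and the outer [ ] is the bracket expression. A brief sketch with made-up input:

```shell
# Lines that consist only of digits.
all_digits=$(printf 'abc\n123\na1\n' | grep '^[[:digit:]]*$')

# A letter immediately followed by a digit.
letter_digit=$(printf 'abc\n123\na1\n' | grep '[[:alpha:]][[:digit:]]')

printf '%s\n' "$all_digits" "$letter_digit"
```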
Most metacharacters lose their special meaning inside lists. To include a literal ], place it first in the list. Similarly, to include a literal ^, place it anywhere but first. Finally, to include a literal -, place it last.
The period . matches any single character. The symbol \w is a synonym for [_[:alnum:]] and \W is a synonym for [^_[:alnum:]].
The caret ^ and the dollar sign $ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols \< and \> respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it is not at the edge of a word.
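A short sketch of line anchors versus word anchors on made-up input. Note that \< and \> are extensions and are assumed here to be available, as they are in GNU grep.

```shell
# Hypothetical three-line input.
input='start here
restart
the start'

# ^ anchors the match to the beginning of the line.
begins=$(printf '%s\n' "$input" | grep '^start')

# \< and \> anchor the match to word edges, so "restart" is excluded.
whole=$(printf '%s\n' "$input" | grep '\<start\>')

printf '%s\n' "$begins" "$whole"
```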
A regular expression may be followed by one of several repetition operators:
? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
{n} The preceding item is matched exactly n times.
{n,} The preceding item is matched n or more times.
{n,m} The preceding item is matched at least n times, but not more than m times.
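The repetition operators listed above can be compared on one made-up input in which the letter b is repeated zero to three times. The patterns use grep -E, since the description applies to extended regular expressions.

```shell
# Hypothetical input: a, then zero to three b characters, then c.
input='ac
abc
abbc
abbbc'

optional=$(printf '%s\n' "$input" | grep -E '^ab?c$')       # zero or one b
one_or_more=$(printf '%s\n' "$input" | grep -E '^ab+c$')    # at least one b
exactly_two=$(printf '%s\n' "$input" | grep -E '^ab{2}c$')  # exactly two b
two_to_three=$(printf '%s\n' "$input" | grep -E '^ab{2,3}c$')

printf '%s\n' "$optional" "$one_or_more" "$exactly_two" "$two_to_three"
```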
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions.
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression.
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules.
The backreference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression.
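As an illustration of a backreference, the pattern below selects four-character lines whose second half repeats the first half. The input is made up, and backreference support in grep -E is assumed as provided by GNU grep.

```shell
# (..) captures two characters; \1 must then match the same two again.
doubled=$(printf 'abab\nabcd\nxyxy\n' | grep -E '^(..)\1$')

printf '%s\n' "$doubled"
```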
In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
Traditional egrep did not support the { metacharacter, and some egrep implementations support \{ instead, so portable scripts should avoid { in egrep patterns and should use [{] to match a literal {.
GNU egrep attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the shell command egrep '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
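The difference between the two syntaxes can be sketched with the + operator and a literal brace. The input lines are made up, and the backslashed \+ form is assumed to behave as in GNU grep.

```shell
input='a+b
aab'

# Basic syntax: a bare + is an ordinary character.
bre_literal=$(printf '%s\n' "$input" | grep 'a+b')

# Basic syntax: \+ is the repetition operator.
bre_repeat=$(printf '%s\n' "$input" | grep 'a\+b')

# Extended syntax: a bare + is the repetition operator.
ere_repeat=$(printf '%s\n' "$input" | grep -E 'a+b')

# Portable way to match a literal { in extended syntax.
brace=$(printf 'f{1}\nf1\n' | grep -E 'f[{]1')

printf '%s\n' "$bre_literal" "$bre_repeat" "$ere_repeat" "$brace"
```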
ENVIRONMENT VARIABLES
grep's behavior is affected by the following environment variables.
A locale LC_foo is specified by examining the three environment variables LC_ALL, LC_foo, and LANG, in that order. The first of these variables that is set specifies the locale. For example, if LC_ALL is not set, but LC_MESSAGES is set to pt_BR, then Brazilian Portuguese is used for the LC_MESSAGES locale. The C locale is used if none of these environment variables are set, or if the locale catalog is not installed, or if grep was not compiled with national language support (NLS).
GREP_OPTIONS
This variable specifies default options to be placed in front of any explicit options. For example, if GREP_OPTIONS is '--binary-files=without-match --directories=skip', grep behaves as if the two options --binary-files=without-match and --directories=skip had been specified before any explicit options. Option specifications are separated by whitespace. A backslash escapes the next character, so it can be used to specify an option containing whitespace or a backslash.
GREP_COLOR
Specifies the marker for highlighting.
LC_ALL, LC_COLLATE, LANG
These variables specify the LC_COLLATE locale, which determines the collating sequence used to interpret range expressions like [a-z].
LC_ALL, LC_CTYPE, LANG
These variables specify the LC_CTYPE locale, which determines the type of characters, for example, which characters are whitespace.
LC_ALL, LC_MESSAGES, LANG
These variables specify the LC_MESSAGES locale, which determines the language that grep uses for messages. The default C locale uses American English messages.
POSIXLY_CORRECT
If set, grep behaves as POSIX.2 requires; otherwise, grep behaves more like other GNU programs. POSIX.2 requires that options that follow file names be treated as file names; by default, such options are permuted to the front of the operand list and are treated as options. Also, POSIX.2 requires that unrecognized options be diagnosed as "illegal", but since they are not really against the law the default is to diagnose them as "invalid". POSIXLY_CORRECT also disables _N_GNU_nonoption_argv_flags_, described below.
_N_GNU_nonoption_argv_flags_
(Here N is grep's numeric process ID.) If the ith character of this environment variable's value is 1, do not consider the ith operand of grep to be an option, even if it appears to be one. A shell can put this variable in the environment for each command it runs, specifying which operands are the results of file name wildcard expansion and therefore should not be treated as options. This behavior is available only with the GNU C library, and only when POSIXLY_CORRECT is not set.
DIAGNOSTICS
Normally, the exit status is 0 if selected lines are found and 1 otherwise. But the exit status is 2 if an error occurred, unless the -q or --quiet or --silent option is used and a selected line is found.
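The three exit statuses can be observed directly. In this sketch the path does-not-exist is a deliberately missing, made-up file name used to provoke an error.

```shell
tmp=$(mktemp -d)

# A line is selected: status 0.
st_match=0
printf 'x\n' | grep 'x' > /dev/null || st_match=$?

# No line is selected: status 1.
st_none=0
printf 'x\n' | grep 'y' > /dev/null || st_none=$?

# An error occurs (the file cannot be opened): status 2.
st_error=0
grep 'x' "$tmp/does-not-exist" > /dev/null 2>&1 || st_error=$?

printf 'match=%s none=%s error=%s\n' "$st_match" "$st_none" "$st_error"
rm -rf "$tmp"
```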
BUGS
Email bug reports to bug-gnu-utils@gnu.org. Be sure to include the word "grep" somewhere in the "Subject:" field.
Large repetition counts in the {n,m} construct may cause grep to use lots of memory. In addition, certain other obscure regular expressions require exponential time and space, and may cause grep to run out of memory.
Backreferences are very slow, and may require exponential time.