How does Linux filter text or strings in a file

Updated: 2025-01-18 | From: SLTechnology News&Howtos (Shulou)

Shulou (Shulou.com), 06/01

This article explains how to filter text or strings in a file on Linux. It is a common problem in practice, so the examples below walk through how to handle it using regular expressions and awk.

What is a regular expression?

A regular expression is a string that describes a pattern of characters. One of its most important uses is filtering the output of a command or the contents of a file; it is also used to edit text or parts of a configuration file, and so on.

The characteristics of regular expressions

Regular expressions are composed of the following:

Ordinary characters, such as space, underscore (_), A-Z, a-z, 0-9.

Metacharacters, which are expanded to ordinary characters. They include:

(.) matches any single character except a newline.

(*) matches zero or more occurrences of the character immediately before it.

[character(s)] matches any one of the characters listed in the set; you can use a hyphen (-) to specify a range of characters, such as [a-f], [1-5], etc.

(^) matches the beginning of a line in the file.

($) matches the end of a line in the file.

(\) is the escape character.

To filter text, you need a text filtering tool such as awk. awk can also be regarded as a programming language in its own right, but for the purposes of this guide I will treat it as a simple command-line filtering tool.

The general syntax of awk is as follows:

# awk 'script' filename

Here, 'script' is a set of commands that awk understands and runs against filename.

awk works by reading a line from the file, making a copy of it, and executing the script on that copy. This is repeated for every line in the file.

The 'script' takes the form '/pattern/ action', where pattern is a regular expression and action is what awk does when it finds the pattern in a line.
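The pattern/action form can be tried without touching any system file. The following is a minimal sketch, assuming a made-up three-line sample file (the host names and addresses are hypothetical); the pattern selects lines containing "local" and the action prints only the second field.

```shell
# Hypothetical sample data standing in for a real file such as /etc/hosts.
sample=$(mktemp)
printf '%s\n' \
  '127.0.0.1 localhost' \
  '192.168.0.10 fileserver' \
  'fe00::0 ip6-localnet' > "$sample"

# pattern: /local/   action: print the second whitespace-separated field.
out=$(awk '/local/ {print $2}' "$sample")
printf '%s\n' "$out"
# prints:
# localhost
# ip6-localnet

rm -f "$sample"
```

If the action is left out, awk prints the whole matching line, which is what the {print} action in the examples in this guide does explicitly.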

How to use the awk filtering tool in Linux

In the following example, we will focus on the metacharacters discussed earlier.

A simple example of using awk:

The following example prints all lines in the file /etc/hosts because no pattern is specified.

# awk '//{print}' /etc/hosts

Awk prints all lines in the file

Using awk with a pattern

In the following example, the pattern localhost is specified, so awk matches the lines in /etc/hosts that contain localhost.

# awk '/localhost/{print}' /etc/hosts

awk prints the lines in the file that match the pattern

Using the wildcard (.) in an awk pattern

In the following example, (.) makes the pattern match strings containing loc, such as localhost and localnet.

The regular expression here means: match l, then any single character, then c.

# awk '/l.c/{print}' /etc/hosts

Use awk to print strings that match patterns in a file
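To make the effect of (.) reproducible, here is a small sketch that runs the same pattern against a hypothetical sample file instead of /etc/hosts (all entries are invented for illustration). Note that "lc-backup" is not matched, because (.) requires exactly one character between l and c.

```shell
# Hypothetical sample data; not the contents of any real /etc/hosts.
sample=$(mktemp)
printf '%s\n' \
  '127.0.0.1 localhost' \
  'fe00::0 ip6-localnet' \
  '10.0.0.5 mailhost' \
  '10.0.0.6 lc-backup' > "$sample"

# l, then any single character, then c.
out=$(awk '/l.c/ {print}' "$sample")
printf '%s\n' "$out"
# prints:
# 127.0.0.1 localhost
# fe00::0 ip6-localnet

rm -f "$sample"
```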

Using (*) in an awk pattern

In the following example, lines containing strings such as localhost, localnet, and capable are matched. Because (*) also allows zero occurrences of l, the pattern l*c matches any line that contains a c.

# awk '/l*c/{print}' /etc/hosts

Use awk to match strings in a file
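To see which part of a line the pattern l*c actually matches, the sketch below uses awk's built-in match() function, which sets RSTART and RLENGTH to the position and length of the match. The four input words are hypothetical test data.

```shell
# Hypothetical input words, one per line.
# match() returns 0 (false) when the pattern is not found, so "lines" is skipped.
out=$(printf '%s\n' localhost capable llc lines |
  awk 'match($0, /l*c/) { print $0 ": " substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$out"
# prints:
# localhost: c
# capable: c
# llc: llc
```

In "localhost" the l is not directly followed by c, so the match is the single c with zero preceding l characters; in "llc" the match takes both l characters, since (*) takes as many as it can.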

You may also notice that (*) tries to find the longest match it can.

Let's look at an example that demonstrates this. Take the regular expression t.*t, which matches a string that begins with t and ends with t (t*t on its own only means zero or more t characters followed by a t), and apply it to the following line:

this is tecmint, where you get the best good tutorials, how to's, guides, tecmint.

With the pattern /t.*t/, the possible matches are:

this is t

this is tecmint

this is tecmint, where you get t

this is tecmint, where you get the best good t

this is tecmint, where you get the best good tutorials, how t

this is tecmint, where you get the best good tutorials, how to's, guides, t

this is tecmint, where you get the best good tutorials, how to's, guides, tecmint

The wildcard (*) in /t.*t/ causes awk to choose the last, longest option:

this is tecmint, where you get the best good tutorials, how to's, guides, tecmint

Using awk with a set [character(s)]

Taking the set [al1] as an example, awk will match all strings in the file /etc/hosts that contain the character a, l, or 1.

# awk '/[al1]/{print}' /etc/hosts

Use awk to print matching characters in a file

The next example matches strings that start with K or k, followed by T.

# awk '/[Kk]T/{print}' /etc/hosts

Use awk to print matching characters in a file
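A typical /etc/hosts may contain no line with KT at all, so the set [Kk]T is easier to see on invented data. This is a minimal sketch with four hypothetical lines; only those where K or k is immediately followed by an uppercase T are printed.

```shell
# Hypothetical input lines chosen to show what [Kk]T does and does not match.
out=$(printf '%s\n' 'KT tower' 'kT value' 'kt lower' 'TK reversed' |
  awk '/[Kk]T/ {print}')
printf '%s\n' "$out"
# prints:
# KT tower
# kT value
```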

Specify characters in a range

Character ranges that awk understands:

[0-9] represents a single digit

[a-z] represents a single lowercase letter

[A-Z] represents a single uppercase letter

[a-zA-Z] represents a single letter

[a-zA-Z0-9] represents a single letter or digit

Let's look at the following example:

# awk '/[0-9]/{print}' /etc/hosts

Use awk to print matching numbers in a file

The above example prints every line of /etc/hosts that contains at least one digit [0-9].
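The same range can be checked on a few hypothetical lines, some with digits and some without. This is only a sketch on made-up input; a line is printed as soon as it contains one character from the range.

```shell
# Hypothetical input: two lines contain a digit, two do not.
out=$(printf '%s\n' 'localhost' 'ip6-loopback' 'no digits here' '127.0.0.1' |
  awk '/[0-9]/ {print}')
printf '%s\n' "$out"
# prints:
# ip6-loopback
# 127.0.0.1
```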

Using awk with the metacharacter (^)

The following examples match all lines that begin with the given pattern:

# awk '/^fe/{print}' /etc/hosts

# awk '/^ff/{print}' /etc/hosts

Use awk to print lines that match the pattern

Using awk with the metacharacter ($)

The following examples match all lines that end with the given pattern:

# awk '/ab$/{print}' /etc/hosts

# awk '/ost$/{print}' /etc/hosts

# awk '/rs$/{print}' /etc/hosts

Use awk to print strings that match the pattern
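The two anchors can be compared side by side. In this sketch the sample file is hypothetical; its last line deliberately contains both "fe" and "localhost" in positions where neither anchor should match.

```shell
# Hypothetical sample data standing in for /etc/hosts.
sample=$(mktemp)
printf '%s\n' \
  '127.0.0.1 localhost' \
  'fe00::0 ip6-localnet' \
  'ff00::0 ip6-mcastprefix' \
  '::1 ip6-localhost loopback-fe' > "$sample"

# ^ anchors the pattern to the start of the line.
starts=$(awk '/^fe/ {print}' "$sample")
# $ anchors the pattern to the end of the line.
ends=$(awk '/ost$/ {print}' "$sample")

printf 'starts with fe: %s\n' "$starts"   # fe00::0 ip6-localnet
printf 'ends with ost:  %s\n' "$ends"     # 127.0.0.1 localhost

rm -f "$sample"
```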

Using awk with the escape character (\)

It lets you treat the character that follows the escape character as a literal, that is, take it at face value.

In the following example, the first command prints all the lines in the file. The second command tries to match the line containing $25.00, but no escape character is used, so nothing is printed.

The third command is correct because the escape character is used so that $ is read as a literal character (rather than a metacharacter).

# awk '//{print}' deals.txt

# awk '/$25.00/{print}' deals.txt

# awk '/\$25.00/{print}' deals.txt
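The contents of deals.txt are not shown in this article, so the sketch below assumes a made-up three-line file. It also escapes the dot, an addition not in the commands above, so that (.) is taken literally instead of matching any character.

```shell
# Hypothetical stand-in for deals.txt; the real file's contents are not shown.
deals=$(mktemp)
printf '%s\n' 'TV $25.00' 'Radio $12.50' 'Phone 25.00' > "$deals"

# \$ matches a literal dollar sign, \. matches a literal dot.
out=$(awk '/\$25\.00/ {print}' "$deals")
printf '%s\n' "$out"
# prints:
# TV $25.00

rm -f "$deals"
```

The awk program is wrapped in single quotes so that the shell passes the backslashes and the $ to awk unchanged.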
