How to use parse, a Python text-parsing library that does not require regular expressions

Updated 2025-04-14. From: SLTechnology News&Howtos


This article explains how to use parse, a Python text-parsing library that lets you extract data from strings without writing regular expressions. Many people are unsure how to use it in day-to-day work, so the sections below collect simple, practical ways of working with it.

1. Real case

Let's start with a recent real-world case where I used parse.

Below is a flow table entry from OVS (Open vSwitch). I need to collect how much traffic and how many packets flow through a virtual machine's network port, that is, the values of n_bytes and n_packets for each in_port.

cookie=0x9816da8e872d717d, duration=298506.364s, table=0, n_packets=480, n_bytes=20160, priority=10,ip,in_port="tapbbdf080b-c2" actions=NORMAL

What would you do?

Split on commas first, then split each piece on the equals sign to pull out the values?

You can give it a try. The code will probably turn out the way I imagine it: without a trace of elegance.

Let me show you how I did it.

The solution relies on a third-party package called parse, which you need to install yourself:

$ python -m pip install parse

As this case shows, parse is very effective for strings that follow a regular, predictable format.

2. Results of parse

parse has only two possible results:

1. No match: parse returns None.

>>> parse("halo", "hello") is None
True

2. A match: parse returns a Result instance. The pattern must match the whole string, so the first call below returns None and prints nothing.

>>> parse("hello", "hello world")
>>> parse("hello", "hello")
<Result () {}>

If the parsing pattern uses anonymous fields, that is, fields without a name, the Result behaves like a list:

>>> profile = parse("I am {}, {} years old, {}", "I am Jack, 27 years old, male")
>>> profile
<Result ('Jack', '27', 'male') {}>
>>> profile[0]
'Jack'
>>> profile[1]
'27'
>>> profile[2]
'male'

If the parsing pattern gives the fields names, the Result behaves like a dictionary:

>>> profile = parse("I am {name}, {age} years old, {gender}", "I am Jack, 27 years old, male")
>>> profile
<Result () {'name': 'Jack', 'age': '27', 'gender': 'male'}>
>>> profile['name']
'Jack'
>>> profile['age']
'27'
>>> profile['gender']
'male'

3. Reuse pattern

Like re, parse lets you compile a pattern once and reuse it.

>>> from parse import compile
>>> pattern = compile("I am {}, {} years old, {}")
>>> pattern.parse("I am Jack, 27 years old, male")
<Result ('Jack', '27', 'male') {}>
>>> pattern.parse("I am Tom, 26 years old, male")
<Result ('Tom', '26', 'male') {}>

4. Type conversion

In the examples above, parse returns the age as '27', which is a string. Is there a way to convert a field to the type we want at extraction time?

Yes: add a type specifier to the field.

>>> from parse import parse
>>> profile = parse("I am {name}, {age:d} years old, {gender}", "I am Jack, 27 years old, male")
>>> profile
<Result () {'name': 'Jack', 'age': 27, 'gender': 'male'}>
>>> type(profile["age"])
<class 'int'>

Are there other formats besides integers?

There are many built-in types. For example, matching a date and time:

>>> parse('Meet at {:tg}', 'Meet at 1/2/2011 11:00 PM')
<Result (datetime.datetime(2011, 2, 1, 23, 0),) {}>

For more types, please refer to the official documentation:

Type | Characters matched | Output
l | Letters (ASCII) | str
w | Letters, numbers and underscore | str
W | Not letters, numbers and underscore | str
s | Whitespace | str
S | Non-whitespace | str
d | Digits (effectively integer numbers) | int
D | Non-digit | str
n | Numbers with thousands separators (, or .) | int
% | Percentage (converted to value/100.0) | float
f | Fixed-point numbers | float
F | Decimal numbers | Decimal
e | Floating-point numbers with exponent, e.g. 1.1e-10, NAN (all case insensitive) | float
g | General number format (either d, f or e) | float
b | Binary numbers | int
o | Octal numbers | int
x | Hexadecimal numbers (lower and upper case) | int
ti | ISO 8601 format date/time, e.g. 1972-01-20T10:21:36Z ("T" and "Z" optional) | datetime
te | RFC2822 e-mail format date/time, e.g. Mon, 20 Jan 1972 10:21:36 +1000 | datetime
tg | Global (day/month) format date/time, e.g. 20/1/1972 10:21:36 AM +1:00 | datetime
ta | US (month/day) format date/time, e.g. 1/20/1972 10:21:36 PM +10:30 | datetime
tc | ctime() format date/time, e.g. Sun Sep 16 01:03:52 1973 | datetime
th | HTTP log format date/time, e.g. 21/Nov/2011:00:07:11 +0000 | datetime
ts | Linux system log format date/time, e.g. Nov 9 03:37:44 | datetime
tt | Time, e.g. 10:21:36 PM -5:30 | time

5. Remove spaces during extraction

Remove spaces on both sides

>>> parse('hello {}, hello python', 'hello   world   , hello python')
<Result ('  world   ',) {}>
>>> parse('hello {:^}, hello python', 'hello   world   , hello python')
<Result ('world',) {}>

Remove spaces on the left

>>> parse('hello {:>}, hello python', 'hello   world   , hello python')
<Result ('world   ',) {}>

Remove spaces on the right

>>> parse('hello {:
