How to use the extract() and extractall() methods in Pandas

2025-02-25 Update From: SLTechnology News&Howtos


This article explains how to use the Series.str.extract() and Series.str.extractall() methods in Pandas. It lists the parameters and return types of each method and then walks through a short interactive session that shows how the results differ.

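Series.str.extract(pat, flags=0, expand=True)

Parameters:

pat: string, a regular expression pattern with at least one capture group

flags: integer, flags from the re module (for example re.IGNORECASE); the default 0 means no flags

expand: Boolean, whether to always return a DataFrame with one column per capture group. Current pandas releases default to True; older releases used expand=None.

Returns:

DataFrame, or a Series/Index when expand=False and the pattern has a single capture group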
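The parameter list above can be tried out with a small sketch. The sample data and the names codes, pattern, as_frame and as_series are made up for illustration and are not from the original article; the example assumes a reasonably recent pandas release.

```python
import re

import pandas as pd

# Hypothetical sample data, not taken from the article.
codes = pd.Series(["AB12", "cd34", "none"])

# One named capture group. With re.IGNORECASE, [a-z] also matches upper case.
pattern = r"(?P<prefix>[a-z]+)\d+"

# expand=True: the result is a DataFrame, even for a single group.
as_frame = codes.str.extract(pattern, flags=re.IGNORECASE, expand=True)

# expand=False with a single group: the result is a Series.
as_series = codes.str.extract(pattern, flags=re.IGNORECASE, expand=False)

print(type(as_frame).__name__)
print(type(as_series).__name__)
print(as_series.tolist())
```

The last element, "none", contains no digits, so the pattern does not match and the corresponding entry is NaN in both results.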

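Series.str.extractall(pat, flags=0)

Parameters:

pat: string, a regular expression pattern with at least one capture group

flags: integer, flags from the re module; the default 0 means no flags

Return value:

DataFrame with one row per match and one column per capture group. The row index is a MultiIndex whose last level, named "match", numbers the matches found in each string.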
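Because extractall() returns one row per match, the result usually needs one more step before it lines up with the original Series again. The following is a minimal sketch of two common follow-ups, assuming a recent pandas release; the variable names and sample data are illustrative only.

```python
import pandas as pd

# Hypothetical sample data; "zz" contains no letter-digit pair.
s = pd.Series(["a1a2", "b1", "zz"], index=["A", "B", "C"])
pattern = r"(?P<letter>[a-z])(?P<digit>[0-9])"

# One row per match; strings without any match produce no rows at all.
all_matches = s.str.extractall(pattern)

# Count how many matches each original row produced.
matches_per_row = all_matches.groupby(level=0).size()

# Keep only the first match of each row by selecting match == 0.
first_matches = all_matches.xs(0, level="match")

print(all_matches)
print(matches_per_row)
print(first_matches)
```

Note that row "C" is missing from all three results, whereas extract() would keep it and fill it with NaN.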

# If the pattern has more than one capture group, a DataFrame is returned. Rows that do not match are filled with NaN.

In [32]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'([ab])(\d)', expand=False)
Out[32]:
     0    1
0    a    1
1    b    2
2  NaN  NaN

# Named capture groups in the regular expression are used as column names. Otherwise the group numbers are used.

In [33]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'(?P<letter>[ab])(?P<digit>\d)', expand=False)
Out[33]:
  letter digit
0      a     1
1      b     2
2    NaN   NaN

# With expand=True, a pattern with a single capture group returns a DataFrame.

In [35]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)', expand=True)
Out[35]:
     0
0    1
1    2
2  NaN

# With expand=False, a pattern with a single capture group returns a Series.

In [36]: pd.Series(['a1', 'b2', 'c3']).str.extract(r'[ab](\d)', expand=False)
Out[36]:
0      1
1      2
2    NaN
dtype: object

# With expand=True on an Index, a pattern with a single capture group returns a DataFrame.

In [37]: s = pd.Series(["a1", "b2", "c3"], ["A11", "B22", "C33"])

In [38]: s
Out[38]:
A11    a1
B22    b2
C33    c3
dtype: object

In [39]: s.index.str.extract("(?P<letter>[a-zA-Z])", expand=True)
Out[39]:
  letter
0      A
1      B
2      C

# With expand=False on an Index, a pattern with a single capture group returns an Index.

In [40]: s.index.str.extract("(?P<letter>[a-zA-Z])", expand=False)
Out[40]: Index(['A', 'B', 'C'], dtype='object', name='letter')

# The table below shows what Index and Series return when expand=False.

          1 group    >1 group
Index     Index      ValueError
Series    Series     DataFrame

# Extracting all matches: extract() returns only the first match in each string.

In [42]: s = pd.Series(["a1a2", "b1", "c1"], index=["A", "B", "C"])

In [43]: s
Out[43]:
A    a1a2
B      b1
C      c1
dtype: object

In [44]: two_groups = '(?P<letter>[a-z])(?P<digit>[0-9])'

In [45]: s.str.extract(two_groups, expand=True)
Out[45]:
  letter digit
A      a     1
B      b     1
C      c     1

# extractall() returns every match, one row per match.

In [46]: s.str.extractall(two_groups)
Out[46]:
        letter digit
  match
A 0          a     1
  1          a     2
B 0          b     1
C 0          c     1

That covers how to use the extract() and extractall() methods in Pandas: extract() keeps one row per input string and returns only the first match, while extractall() returns a row for every match.
