
What are the practical skills of Python data processing

2025-01-18 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

Today I would like to share some practical Python data processing skills. The content is detailed and the logic is clear. Many people are probably not very familiar with these techniques, so I am sharing this article for your reference. I hope you get something out of it.

The Pandas version I use is shown below; the same snippet also imports the Pandas library.

>>> import pandas as pd
>>> pd.__version__
'0.25.1'

Before you begin, make sure the interpreter's working directory is the one that contains the dataset:

>>> import os
>>> os.chdir('...')  # the directory where my dataset is located
>>> os.listdir()  # confirm that the dataset already exists in this directory
['drinksbycountry.csv', 'IMDB-Movie-Data.csv', 'movietweetings', 'titanic_eda_data.csv']
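As an alternative to eyeballing the `os.listdir()` output, the check can be made explicit. The following is a minimal sketch, not the author's method: `missing_files` is a hypothetical helper, and the demo uses a temporary directory as a stand-in for the real dataset folder.

```python
import tempfile
from pathlib import Path


def missing_files(directory, names):
    """Return the entries of `names` that do not exist in `directory`."""
    present = {p.name for p in Path(directory).iterdir()}
    return [name for name in names if name not in present]


# Demo on a throwaway directory (a stand-in for the real dataset folder).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "IMDB-Movie-Data.csv").touch()
    wanted = ["IMDB-Movie-Data.csv", "titanic_eda_data.csv"]
    missing = missing_files(tmp, wanted)

print(missing)  # ['titanic_eda_data.csv']
```

An empty list means every expected file is present, so `pd.read_csv` can be called with a bare file name.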

With the preparation done, let's get started with the data processing skills.

1 Removing a column with Pandas

Import data

>>> df = pd.read_csv("IMDB-Movie-Data.csv")
>>> df.head(1)  # import and display the first row
   Rank                    Title                    Genre  ...   Votes  Revenue (Millions)  Metascore
0     1  Guardians of the Galaxy  Action,Adventure,Sci-Fi  ...  757074              333.13       76.0

[1 rows x 12 columns]

Use the pop method to remove the specified column:

>>> meta = df.pop("Title").to_frame()  # remove the Title column
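To make the behaviour of `pop` concrete, here is a small self-contained sketch. The three-row CSV text and the names `df_demo` and `meta_demo` are made up for illustration and are not the real IMDB file.

```python
import io

import pandas as pd

# Hypothetical three-row sample, not the real IMDB file.
csv_text = """Rank,Title,Genre
1,Guardians of the Galaxy,"Action,Adventure,Sci-Fi"
2,Prometheus,"Adventure,Mystery,Sci-Fi"
3,Split,"Horror,Thriller"
"""
df_demo = pd.read_csv(io.StringIO(csv_text))

# pop removes the column from df_demo in place and returns it as a Series;
# to_frame turns that Series into a one-column DataFrame.
meta_demo = df_demo.pop("Title").to_frame()

print(list(df_demo.columns))    # ['Rank', 'Genre']
print(list(meta_demo.columns))  # ['Title']
```

Because `pop` modifies the DataFrame in place, there is no need to reassign `df_demo` afterwards.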

Confirm that it has been removed:

>>> df.head(1)  # df now has 11 columns
   Rank                    Genre  ...  Revenue (Millions)  Metascore
0     1  Action,Adventure,Sci-Fi  ...              333.13       76.0

[1 rows x 11 columns]

2 Count the number of words in the title

The pop call returned the removed column, which is stored in meta. Display the first three rows of meta:

>>> meta.head(3)
                     Title
0  Guardians of the Galaxy
1               Prometheus
2                    Split

A title is made up of words separated by spaces, so the number of words equals the number of spaces plus one.
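The counting rule can be tried out on a few made-up titles first. This is only a sketch under the assumption that words are separated by single spaces; the `titles` Series is hypothetical, and the whitespace-splitting variant is an alternative, not something the article uses.

```python
import pandas as pd

# Hypothetical titles; the last one contains a doubled space on purpose.
titles = pd.Series(["Guardians of the Galaxy", "Prometheus", "La  La Land"])

# Count the spaces in each title and add one.
by_spaces = titles.str.count(" ") + 1

# Alternative: split on runs of whitespace and count the pieces.
by_split = titles.str.split().str.len()

print(by_spaces.tolist())  # [4, 1, 4]
print(by_split.tolist())   # [4, 1, 3]
```

The two approaches agree on clean titles; they differ only when a title contains repeated or stray spaces, where counting spaces overestimates.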

# .str.count(" ") + 1 gives the number of words
>>> meta["words_count"] = meta["Title"].str.count(" ") + 1
>>> meta.head(3)  # the words_count column holds the number of words
                     Title  words_count
0  Guardians of the Galaxy            4
1               Prometheus            1
2                    Split            1

3 Genre frequency statistics

The frequency of the movie Genre is counted below.

>>> vc = df["Genre"].value_counts()

The top 5 movie genres are shown below. The most frequent is Action,Adventure,Sci-Fi with 50 occurrences, followed by Drama with 48:

>>> vc.head()
Action,Adventure,Sci-Fi    50
Drama                      48
Comedy                     31
Name: Genre, dtype: int64
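How `value_counts` orders its result is easier to see on a tiny example. The `genres` Series below is hypothetical data, not taken from the IMDB file.

```python
import pandas as pd

# Hypothetical genre column with repeated values.
genres = pd.Series(
    ["Drama", "Comedy", "Drama", "Action,Adventure,Sci-Fi", "Drama", "Comedy"]
)

# value_counts returns a Series indexed by the distinct values,
# sorted from most to least frequent.
counts = genres.value_counts()
print(counts.index[0], counts.iloc[0])  # Drama 3

# normalize=True returns relative frequencies instead of raw counts.
shares = genres.value_counts(normalize=True)
print(shares["Drama"])  # 0.5
```

Since the result is already sorted in descending order, `head()` on it directly gives the most frequent values.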

Show the pie chart of Top5:

>>> import matplotlib.pyplot as plt
>>> vc[:5].plot(kind='pie')
>>> plt.show()
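When no display is available (for example on a server), the chart can be written to a file instead of shown in a window. The sketch below assumes the non-interactive Agg backend is acceptable; the `top` Series holds made-up counts standing in for `vc[:5]`, and the output file name is arbitrary.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up counts standing in for vc[:5].
top = pd.Series({"A": 5, "B": 3, "C": 2}, name="Genre")

# autopct is forwarded to matplotlib's pie and prints each slice's percentage.
ax = top.plot(kind="pie", autopct="%1.0f%%")
ax.set_ylabel("")  # drop the default y-label (the Series name)

out_path = os.path.join(tempfile.mkdtemp(), "genre_pie.png")
ax.figure.savefig(out_path)
plt.close(ax.figure)

print(os.path.exists(out_path))  # True
```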

That is all for this article on practical Python data processing skills. Thank you for reading!
