2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
Today I would like to share some practical Python data processing tips. The content is detailed and the logic is clear. Many people are probably not very familiar with these techniques, so I am sharing this article for reference. I hope you get something out of it.
The Pandas version I use is shown below; the snippet also imports the Pandas library.
>>> import pandas as pd
>>> pd.__version__
'0.25.1'
Make sure that the interpreter and the dataset are in the same directory before you begin:
>>> import os
>>> os.chdir('...')  # this is the directory where my dataset is located
>>> os.listdir()  # confirm that the dataset already exists in this directory
['drinksbycountry.csv', 'IMDB-Movie-Data.csv', 'movietweetings', 'titanic_eda_data.csv']
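As an alternative to inspecting the listing by eye, the check can be scripted. The following is a minimal sketch, not part of the original workflow: the helper name `missing_files` is hypothetical, and the two file names are taken from the listing above.

```python
from pathlib import Path


def missing_files(names, directory="."):
    """Return the entries from `names` that do not exist in `directory`."""
    base = Path(directory)
    return [name for name in names if not (base / name).exists()]


# File names taken from the directory listing above.
expected = ["IMDB-Movie-Data.csv", "titanic_eda_data.csv"]

missing = missing_files(expected)
if missing:
    print("Not found in", Path.cwd(), "->", missing)
else:
    print("All datasets found in", Path.cwd())
```

An empty result means every expected file is visible from the current working directory, so `pd.read_csv` can be called with the bare file name.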
With the preparation done, let's start on the data processing tips.
1 Removing a column with Pandas
Import data
>>> df = pd.read_csv("IMDB-Movie-Data.csv")
>>> df.head(1)  # import and display the first row
   Rank                    Title                    Genre  ...   Votes  Revenue (Millions)  Metascore
0     1  Guardians of the Galaxy  Action,Adventure,Sci-Fi  ...  757074              333.13       76.0

[1 rows x 12 columns]
Use the pop method to remove the specified column:
>>> meta = df.pop("Title").to_frame()  # remove the Title column
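To make the two steps in that line explicit, here is a small self-contained sketch on made-up data (the frame `toy` and its values are invented for illustration and are not the IMDB dataset). `pop` removes the column in place and returns it as a Series; `to_frame` then wraps that Series in a one-column DataFrame.

```python
import pandas as pd

# Toy data, made up for illustration; not the IMDB dataset.
toy = pd.DataFrame({
    "Rank": [1, 2],
    "Title": ["First Movie", "Second Movie"],
    "Metascore": [70.0, 60.0],
})

popped = toy.pop("Title")      # removes the column in place and returns it as a Series
meta_toy = popped.to_frame()   # wrap the Series in a one-column DataFrame

print(list(toy.columns))       # ['Rank', 'Metascore']
print(type(popped).__name__)   # Series
print(list(meta_toy.columns))  # ['Title']
```

Because `pop` modifies the frame it is called on, no reassignment is needed for the column to disappear from the original frame.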
Confirm that it has been removed:
>>> df.head(1)  # df now has 11 columns
   Rank                    Genre  ...  Revenue (Millions)  Metascore
0     1  Action,Adventure,Sci-Fi  ...              333.13       76.0

[1 rows x 11 columns]

2 Counting the number of words in the title
The pop call returned meta. Display its first three rows:
>>> meta.head(3)
                     Title
0  Guardians of the Galaxy
1               Prometheus
2                    Split
The title is made up of words, separated by spaces.
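When words are separated by single spaces, the word count equals the number of spaces plus one. The sketch below illustrates this on the three titles shown above and, as an assumption beyond the original approach, compares it with a split-based count that also tolerates repeated spaces.

```python
import pandas as pd

titles = pd.Series(["Guardians of the Galaxy", "Prometheus", "Split"], name="Title")

# Number of spaces plus one gives the word count for single-spaced titles.
by_spaces = titles.str.count(" ") + 1

# Splitting on whitespace and taking the list length also tolerates repeated spaces.
by_split = titles.str.split().str.len()

print(by_spaces.tolist())  # [4, 1, 1]
print(by_split.tolist())   # [4, 1, 1]
```

The two counts agree here; they would differ only for titles containing consecutive, leading, or trailing spaces, where the space-counting version overcounts.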
>>> # .str.count(" ") + 1 gives the number of words
>>> meta["words_count"] = meta["Title"].str.count(" ") + 1
>>> meta.head(3)  # the words_count column holds the number of words
                     Title  words_count
0  Guardians of the Galaxy            4
1               Prometheus            1
2                    Split            1

3 Genre frequency statistics
Next, count how often each Genre value occurs.
>>> vc = df["Genre"].value_counts()
The most frequent Genre values are shown below. The most frequent is Action,Adventure,Sci-Fi with 50 occurrences, followed by Drama with 48:
>>> vc.head()
Action,Adventure,Sci-Fi    50
Drama                      48
Comedy                     31
Name: Genre, dtype: int64
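For readers without the dataset at hand, the behaviour of `value_counts` can be seen on a small made-up Series (the labels below are invented for illustration and are not the IMDB results). The `normalize=True` variant, which the article does not use, returns shares instead of counts.

```python
import pandas as pd

# Made-up genre labels for illustration; not the IMDB data.
genres = pd.Series(["Drama", "Comedy", "Drama", "Action", "Drama", "Comedy"])

counts = genres.value_counts()                # sorted by frequency, highest first
shares = genres.value_counts(normalize=True)  # relative frequencies instead of counts

print(counts.index[0], counts.iloc[0])  # Drama 3
print(round(shares["Drama"], 2))        # 0.5
```

Since the result is sorted in descending order, `head()` on it directly yields the most frequent values.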
Plot the top five as a pie chart:
>>> import matplotlib.pyplot as plt
>>> vc[:5].plot(kind='pie')
>>> plt.show()
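The same plot can be produced in a script that runs without a display. This is a sketch under assumptions: the counts are made up, the `Agg` backend and the output file name `genre_top5.png` are choices made here for illustration, and `autopct` is an optional matplotlib pie argument that the article does not use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up counts for illustration; not the IMDB results.
vc_toy = pd.Series(
    {"Drama": 5, "Comedy": 4, "Action": 3, "Horror": 2, "Thriller": 1, "Western": 1}
)

top5 = vc_toy.head(5)                          # same rows as vc_toy[:5]
ax = top5.plot(kind="pie", autopct="%1.1f%%")  # label each wedge with its share
ax.set_ylabel("")                              # drop the y-axis label
ax.get_figure().savefig("genre_top5.png")      # hypothetical output file name
```

With an interactive backend, `plt.show()` can be used in place of the `savefig` call, as in the session above.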
That is all for this article on practical Python data processing tips. Thank you for reading!