Shulou (Shulou.com), 06/01 report. Updated 2025-01-27. From: SLTechnology News&Howtos, Development.
This article walks through example Python code for data analysis and machine learning. The content is simple and clear, and easy to learn and follow.
Why choose Python for data analysis?
Python is a dynamic, object-oriented scripting language that is also simple and easy to understand. It is easy to get started with and its code is readable: a well-written Python program reads almost like an article in English. This quality, sometimes described as "executable pseudocode", lets you focus on the task you want to accomplish rather than on the syntax of the language.
In addition, Python is open source and has many excellent libraries that can be used for data analysis and other areas. Python also works well with Hadoop, a widely used open source big data platform. For data analysts who want to move into big data analysis roles, learning Python is therefore a cost-effective investment.
These advantages have made Python one of the most popular programming languages, and many companies at home and abroad already use it, such as YouTube, Google, and Aliyun.
Simple and useful Python data analysis and machine learning code
After a month of studying data analysis and machine learning with Python, I have summed up some experience and also collected some excellent blogs; if you are interested, see my favorites. Without further ado, let's get to the point.
Data analysis is roughly divided into three parts: data processing, model building and model testing. This article mainly explains how to deal with data.
To analyze data, we first need to learn pandas, Python's data analysis library. Below are some basic, simple operations. The library is imported as follows:
import pandas as pd
Reading a CSV file with pandas
df = pd.read_csv("xxx.csv")
# Print the first five rows of the file
print(df.head())
# Print the entire contents of the CSV
print(df)
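As a minimal sketch of the read step, the example below parses a small in-memory CSV through `io.StringIO` instead of a file on disk, so it runs without `xxx.csv`. The column names and values are invented for illustration only.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file such as xxx.csv
csv_text = "id,price,label\n1,10.5,BENIGN\n2,11.0,ATTACK\n3,9.8,BENIGN\n"

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head(2)    # first two rows; head() with no argument returns five
n_rows, n_cols = df.shape  # (number of rows, number of columns)

print(first_rows)
print(n_rows, n_cols)
```

Checking `df.shape` and `df.head()` right after loading is a cheap way to confirm that the file was parsed with the expected separator and header.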
Viewing selected columns of a CSV
pd.read_csv("file_name.csv", usecols=[0, 1, 2, 3])
# Simpler: select one column of a loaded DataFrame by name
df["attribute column name"]
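The two ways of selecting columns can be sketched as follows, again with an invented in-memory CSV. `usecols` accepts either column positions or column names and limits what is parsed, while indexing with a name picks a column from a DataFrame that is already loaded.

```python
import io

import pandas as pd

# Hypothetical CSV content; the column names are assumptions for this sketch
csv_text = "id,price,volume,label\n1,10.5,100,BENIGN\n2,11.0,150,ATTACK\n"

# Read only the first and fourth columns, selected by position
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 3])

# Read only the named columns
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["price", "volume"])

# Select a single column from a fully loaded DataFrame; the result is a Series
df = pd.read_csv(io.StringIO(csv_text))
prices = df["price"]

print(by_position.columns.tolist())
print(by_name.columns.tolist())
print(prices.tolist())
```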
Deleting columns from CSV data with pandas
droplabels = ["x_cat4", "x_cat5", "x_cat8", "x_cat9"]
data = df.drop(droplabels, axis=1)
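A small sketch of dropping columns, using a made-up DataFrame. Only the `x_cat*` names echo the snippet above; the other columns are assumptions. Note that `drop` returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical data with two categorical columns we want to discard
df = pd.DataFrame({
    "x_cat4": [1, 2],
    "x_cat5": [3, 4],
    "x_num1": [0.5, 0.7],
    "label": ["BENIGN", "ATTACK"],
})

droplabels = ["x_cat4", "x_cat5"]

# columns=... is equivalent to df.drop(droplabels, axis=1)
data = df.drop(columns=droplabels)

# errors="ignore" skips labels that are not present instead of raising KeyError
same = data.drop(columns=["x_cat8"], errors="ignore")

print(data.columns.tolist())
```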
Cleaning NaN data with pandas
# Drop the rows that contain NaN values; returns the non-empty data with its index
df.dropna()
"""
dropna(axis=0, how="any", thresh=None)
The how parameter can be "any" or "all"; "all" drops a row (column) only if every element in it is NA.
thresh is an integer, e.g. thresh=3 keeps a row only if it has at least three non-NA values.
"""
data.fillna(0)  # replace NaN with 0
print(data.fillna(data.mean()))  # fill missing data with the mean of each feature column
print(data.fillna(data.median()))  # fill missing data with the median of each feature column
print(data.bfill())  # fill a missing value with the next valid value after it (fillna(method="bfill") in older pandas)
print(data.ffill())  # fill a missing value with the last valid value before it (fillna(method="pad") in older pandas)
# Reference blog: https://blog.csdn.net/qq_21840201/article/details/81008566
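To make the difference between the `dropna` options and the fill strategies concrete, here is a minimal sketch on an invented four-row table. It uses `bfill()` and `ffill()`, which behave like the older `fillna(method=...)` form.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has one valid value, row 1 has none,
# row 2 is complete, row 3 has two valid values
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [np.nan, np.nan, 6.0, 8.0],
    "c": [np.nan, np.nan, 3.0, 4.0],
})

rows_any = data.dropna()              # keep only rows without any NaN
rows_all = data.dropna(how="all")     # drop only rows where every value is NaN
rows_thresh = data.dropna(thresh=2)   # keep rows with at least two non-NaN values

mean_filled = data.fillna(data.mean())  # per-column mean
back_filled = data.bfill()              # take the next valid value below
forward_filled = data.ffill()           # take the last valid value above

print(rows_thresh)
print(mean_filled)
```

Backward and forward filling can leave NaN at the end or start of a column when there is no neighbor to copy from, so it is worth checking `isna().sum()` afterwards.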
Changing the data of a CSV file with pandas
import numpy as np
# Change a column's values and type
df = df[df["ups and downs"] != "None"]
df["ups and downs"] = df["ups and downs"].astype(np.float64)
df = pd.DataFrame(a, dtype="float")  # data type conversion; a is any array-like of numeric data
# Reference link: http://www.45fan.com/article.php?aid=19070771581800099094144284
# To read and change all the data by traversal, refer to the following
for i in df.index:
    df.loc[i, "id1"] = 1  # .loc avoids the chained assignment df["id1"][i] = 1
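The filter, convert, and assign steps can be sketched end to end as below. The column name `change` and the sample values are assumptions made for this example. Assigning to the whole column at once is usually preferable to looping over `df.index`, and `.loc` is the reliable way to set a single cell.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a numeric column stored as strings, with a "None" placeholder
df = pd.DataFrame({"change": ["1.5", "None", "-0.3"], "id1": [7, 8, 9]})

# Keep only rows whose value is not the placeholder string;
# copy() makes the result an independent DataFrame
df = df[df["change"] != "None"].copy()

# Convert the remaining strings to 64-bit floats
df["change"] = df["change"].astype(np.float64)

# Set every value in a column in one step instead of iterating over rows
df["id1"] = 1

# Label-based assignment to one cell (first remaining row)
df.loc[df.index[0], "id1"] = 5

print(df)
```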
Usage of pandas iloc
X = df.iloc[:, df.columns != "label"]  # take all columns except label
df.iloc[:3, :2]  # with .iloc, select only the first three rows and the first two columns
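A short sketch of splitting features from the label and slicing by position, on an invented DataFrame. For the boolean column mask this example uses `.loc`, the label-based counterpart of `.iloc`; that substitution is a choice made for this sketch, not part of the original snippet.

```python
import pandas as pd

# Hypothetical feature table with a label column
df = pd.DataFrame({
    "f1": [1, 2, 3, 4],
    "f2": [5, 6, 7, 8],
    "label": ["BENIGN", "ATTACK", "BENIGN", "BENIGN"],
})

# Feature matrix: every column except "label", selected with a boolean mask
X = df.loc[:, df.columns != "label"]
y = df["label"]

# Positional slicing: first three rows, first two columns
top_left = df.iloc[:3, :2]

# A single integer returns one row as a Series; negative indices count from the end
last_row = df.iloc[-1]

print(X.columns.tolist())
print(top_left)
```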
Counting the number of elements in a column
benign_count = len(data[data.label == "BENIGN"])  # number of rows labeled BENIGN
len(df)  # total number of rows
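As an illustration, the sketch below counts labels in three equivalent ways on made-up data. `value_counts()` is handy when the counts for every distinct value are needed at once.

```python
import pandas as pd

# Hypothetical label column
data = pd.DataFrame({"label": ["BENIGN", "ATTACK", "BENIGN", "BENIGN"]})

# Filter the rows, then count them
benign_count = len(data[data.label == "BENIGN"])

# Sum the boolean mask directly (True counts as 1)
benign_count_from_mask = int((data["label"] == "BENIGN").sum())

# Count every distinct value in one call
counts = data["label"].value_counts()

total_rows = len(data)

print(benign_count, total_rows)
print(counts)
```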
Saving files with pandas
# df is the data to be saved, and xxx.csv is the file it is saved to
df.to_csv("xxx.csv", index=False, sep=",")
Thank you for reading. That concludes "Python data analysis and machine learning example code analysis". After studying this article, you should have a better understanding of these basic pandas operations; the specific usage still needs to be verified in practice.