Shulou (Shulou.com) 06/01 report. Updated 2025-01-19, from SLTechnology News&Howtos.
This article shows how to use pandas_profiling in Python to generate an exploratory data analysis report from a pandas DataFrame.
Full picture of the analysis report
What is exploratory data analysis?
Readers familiar with pandas probably know its describe() and info() methods, which give an overall view of a dataset: describe() reports summary statistics such as the mean and standard deviation, and info() reports column types and non-null counts. This first look at the data is what is called exploratory data analysis (EDA).
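As a point of comparison, here is a minimal sketch of the two pandas calls mentioned above. The small DataFrame is made up for illustration and is not the dataset used later in this article.

```python
import io

import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "age": [22, 38, 26, 35, None],
    "fare": [7.25, 71.28, 7.92, 53.10, 8.05],
    "sex": ["male", "female", "female", "female", "male"],
})

# describe() summarizes numeric columns by default:
# count, mean, std, min, quartiles, max.
summary = df.describe()
print(summary)

# info() prints column dtypes and non-null counts; it writes to a
# buffer, so capture it in a StringIO to keep the text.
buf = io.StringIO()
df.info(buf=buf)
info_text = buf.getvalue()
print(info_text)
```

Note that describe() skips the text column "sex" unless include="all" is passed, and that the missing age is excluded from the count.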
Introduction to pandas_profiling
If you want a fuller picture of the data with less effort, I recommend a Python library: pandas_profiling, which can generate an EDA report with a single line of code.
pandas_profiling works on the pandas DataFrame data type and makes exploratory data analysis quick and simple.
For each column of the dataset, pandas_profiling provides the following statistics:
1. Summary: data type, unique values, missing values, memory size
2. Quantile statistics: minimum, maximum, median, Q1, Q3, range, interquartile range
3. Descriptive statistics: mean, mode, standard deviation, median absolute deviation, coefficient of variation, kurtosis, skewness
4. Most frequent values, histogram
5. Correlation visualization: highlighting of strongly correlated variables, with Spearman and Pearson correlation matrices shown as color-scale heatmaps
The report can also be exported as HTML, which makes it easy to view and share.
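To make the statistics in the list above concrete, the following sketch computes several of them directly with pandas on made-up values. It only illustrates what the terms mean; it is not how pandas_profiling computes them internally, and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical values with one large outlier.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0], name="x")

# Quantile statistics.
q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
stats = {
    "min": values.min(),
    "max": values.max(),
    "median": median,
    "range": values.max() - values.min(),
    "iqr": q3 - q1,  # interquartile range
    # Descriptive statistics.
    "mean": values.mean(),
    "std": values.std(),
    "cv": values.std() / values.mean(),  # coefficient of variation
    "mad": (values - median).abs().median(),  # median absolute deviation
    "skewness": values.skew(),
    "kurtosis": values.kurt(),
}
for name, value in stats.items():
    print(f"{name}: {value:.3f}")

# Correlation: y is a monotonic but nonlinear function of x, so the
# rank-based Spearman coefficient is 1 while Pearson is lower.
pair = pd.DataFrame({"x": values, "y": values ** 3})
pearson = pair.corr(method="pearson").loc["x", "y"]
spearman = pair.corr(method="spearman").loc["x", "y"]
print(f"pearson: {pearson:.3f}, spearman: {spearman:.3f}")
```

The gap between the two coefficients is one reason a report shows both: Pearson measures linear association, while Spearman only looks at rank order.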
pandas_profiling installation
pandas_profiling can be installed with pip, with conda, or from downloaded source files.
I use pip here, by typing:
    pip install pandas-profiling
The code in this article was run in a Jupyter notebook.
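One caveat when installing today: newer releases of this project are published under the name ydata-profiling, which is imported as ydata_profiling. The sketch below uses only the standard library to check which of the two module names is importable in the current environment. The helper name installed_profiling_module is hypothetical and not part of either package.

```python
from importlib.util import find_spec


def installed_profiling_module():
    """Return the first importable profiling module name, or None."""
    # Prefer the newer name, then fall back to the one used in this article.
    for name in ("ydata_profiling", "pandas_profiling"):
        if find_spec(name) is not None:
            return name
    return None


print(installed_profiling_module())
```

If this prints None, the install step above has not taken effect in the environment the notebook is running in.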
How to use pandas_profiling
1. Load the dataset
I use the classic Titanic data set here:
    # Import the related libraries
    import seaborn as sns
    import pandas as pd
    import pandas_profiling as pp
    import matplotlib.pyplot as plt

    # Load the Titanic dataset
    data = sns.load_dataset('titanic')
    data.head()
Output:
2. Use pandas_profiling to generate the data exploration report
    report = pp.ProfileReport(data)
    report
Output report:
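On larger datasets the full report can take a while to build. As a hedged sketch, the call can be wrapped in a small helper that passes the title and minimal arguments accepted by ProfileReport; minimal=True turns off the more expensive computations. The helper name build_profile and its defaults are assumptions for illustration, not part of the library.

```python
import pandas as pd


def build_profile(df: pd.DataFrame, title: str = "Data profile", minimal: bool = True):
    """Build a profiling report for df (hypothetical helper)."""
    # Imported here so that defining the helper does not itself
    # require pandas_profiling to be installed.
    import pandas_profiling as pp

    # title sets the report heading; minimal=True disables the
    # costlier parts of the analysis.
    return pp.ProfileReport(df, title=title, minimal=minimal)
```

In the notebook from this article it could be called as build_profile(data, title="Titanic"), with the result displayed or exported like any other report.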
3. Export to an HTML file
    report.to_file('report.html')
That covers the basics of using pandas_profiling in Python: load a DataFrame, build a ProfileReport, and export it to HTML. Thank you for reading!