How to explore data with one line of Python

2025-04-02 Update From: SLTechnology News&Howtos

This article looks at how to explore a dataset with a single line of Python. The editor found the technique quite practical and is sharing it here in the hope that you will take something useful away from it. Without further ado, let's take a look.

The simple pandas route

Anyone who works with data in Python will be familiar with the pandas package. pandas is the go-to package for most data in row-and-column format. If you don't have pandas yet, install it from your terminal with pip:

pip install pandas

Now, let's see what the default methods in the Pandas package can do:

For newcomers who are not sure what is going on, here is an explanation:

Every pandas DataFrame has a .describe() method that returns the output above. By default, however, categorical variables are left out of this summary. In the example above, the "method" column is omitted entirely from the output.
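To make the behaviour concrete, here is a minimal sketch on a small made-up DataFrame (the column names and values are assumptions for illustration, not the article's dataset). It shows that the default summary skips the string column, and that passing include="all" is one way to bring it back.

```python
import pandas as pd

# Hypothetical sample data; "method" is a categorical (string) column.
df = pd.DataFrame({
    "method": ["Transit", "Radial Velocity", "Transit", "Imaging"],
    "number": [1, 2, 1, 1],
    "mass": [7.1, 2.2, None, 19.4],
})

# Default behaviour: only the numeric columns are summarised.
numeric_summary = df.describe()

# include="all" also reports count/unique/top/freq for non-numeric columns.
full_summary = df.describe(include="all")

print(numeric_summary.columns.tolist())
print(full_summary.loc[["count", "unique", "top", "freq"], "method"])
```

Even with include="all", the summary for a categorical column is limited to a count, the number of distinct values, and the most frequent value with its frequency.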

Let's see if we can solve this problem.

Pandas Profiling

What if I told you that Python could produce the following statistics in just three lines of code? In fact, if you don't count the imports, a single line is enough.

Essentials: type, unique values, missing values

Quantile statistics: e.g. minimum, Q1, median, Q3, maximum, range, interquartile range

Descriptive statistics: e.g. mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness

Most frequent values

Histogram

Correlations: Spearman, Pearson and Kendall matrices, with highly correlated variables highlighted

Missing values: matrix, count, heatmap and dendrogram of missing values

(Feature list taken directly from the Pandas Profiling GitHub page)
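For a sense of how much work the list above represents, here is a rough sketch that computes a handful of those statistics by hand with plain pandas. The sample data is made up, and this is not how Pandas Profiling computes them internally; it only illustrates what each item means.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "method": ["Transit", "Radial Velocity", "Transit", "Imaging", "Transit"],
    "distance": [10.0, 20.0, 30.0, 40.0, None],
    "year": [2008, 2009, 2010, 2011, 2012],
})

# Essentials: type, unique values, missing values (one row per column).
essentials = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "unique": df.nunique(),
    "missing": df.isna().sum(),
})

numeric = df.select_dtypes(include="number")

# Quantile statistics: minimum, Q1, median, Q3, maximum, interquartile range.
quantiles = numeric.quantile([0, 0.25, 0.5, 0.75, 1.0])
iqr = quantiles.loc[0.75] - quantiles.loc[0.25]

# A few of the descriptive statistics.
descriptive = pd.DataFrame({
    "mean": numeric.mean(),
    "std": numeric.std(),
    "skewness": numeric.skew(),
    "kurtosis": numeric.kurt(),
})

# Most frequent values of the categorical column.
top_methods = df["method"].value_counts()

# Pearson correlation matrix; corr() also accepts method="spearman" or "kendall".
pearson = numeric.corr()

print(essentials)
print(descriptive)
```

Each of these is a separate call that has to be written, checked and formatted, which is the effort a profiling report is meant to save.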

Well, the Pandas Profiling package does exactly that. To install it, use pip in the terminal:

pip install pandas_profiling

Experienced data analysts may, at first glance, dismiss this kind of report as superficial or "flashy," but it is certainly useful for quickly getting a first impression of the data:

The first thing we see is an overview, which provides some very high-level statistics about the data and variables, as well as warnings about high correlations between variables, high skewness, etc.
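The idea behind such warnings can be sketched with plain pandas. The data and both thresholds below are hypothetical values chosen for illustration; they are not the rules or defaults that Pandas Profiling itself applies.

```python
import pandas as pd

# Hypothetical data and thresholds, chosen only for illustration.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],  # nearly a linear function of "a"
    "c": [1.0, 1.0, 1.0, 2.0, 2.0, 50.0],  # one large outlier, so right-skewed
})
CORR_THRESHOLD = 0.9
SKEW_THRESHOLD = 1.0

# Flag each pair of columns whose absolute Pearson correlation is high.
corr = df.corr().abs()
cols = list(corr.columns)
high_corr = [
    (x, y)
    for i, x in enumerate(cols)
    for y in cols[i + 1:]
    if corr.loc[x, y] > CORR_THRESHOLD
]

# Flag columns whose sample skewness is large in absolute value.
skewness = df.skew()
high_skew = skewness[skewness.abs() > SKEW_THRESHOLD].index.tolist()

print("Highly correlated pairs:", high_corr)
print("Highly skewed columns:", high_skew)
```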

But that is not all. Scrolling down, we see that the report has multiple sections. Screenshots alone cannot do justice to the output of this one-line program, so I made a GIF:

I strongly recommend that you explore the package's features for yourself. After all, this is just one line of code, and this package may be useful for future data analysis.

import pandas as pd
import pandas_profiling  # importing this adds .profile_report() to DataFrames
pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/planets.csv').profile_report()

That is how to explore a dataset with one line of Python. Some of these points may well come up in your day-to-day work, and I hope this article helps you learn more.
