How to Implement Exploratory Data Analysis with Sweetviz in Python

Updated 2025-01-19. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/02

This article shows how to carry out exploratory data analysis (EDA) with Sweetviz in Python. I hope you find it useful.

Sweetviz is an open source Python library that generates attractive, high-density visualizations to kick-start EDA with just a few lines of code. The output is an HTML report.

A report can break the data down column by column (for example by gender and age) and can also summarize each column with statistics such as the mode, maximum, and minimum.

Sweetviz automatically detects numerical and text columns, then analyzes, visualizes, compares, and summarizes them, which makes it a good helper for exploratory data analysis.

1. Preparation

Open a terminal in any of the following ways, then enter the command below to install the dependency:

1. On Windows, open Cmd (Start > Run > cmd).

2. On macOS, open Terminal (press Command+Space and type Terminal).

3. If you use VSCode or PyCharm, you can use the Terminal panel at the bottom of the window.
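
Whichever terminal you use, the package has to end up in the same Python interpreter that will later run your script. As a rough sketch (the helper name install_hint is made up for illustration and is not part of Sweetviz), the following standard-library snippet reports whether sweetviz is already importable and otherwise prints an install command tied to the current interpreter:

```python
import importlib.util
import sys


def install_hint(package: str = "sweetviz") -> str:
    """Say whether `package` is importable, or how to install it."""
    if importlib.util.find_spec(package) is not None:
        return f"{package} is already available to {sys.executable}"
    # Running pip through the interpreter avoids installing into another Python.
    return f'Run: "{sys.executable}" -m pip install {package}'


print(install_hint())
```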

    pip install sweetviz

2. Basic usage of sweetviz

The core of Sweetviz is a single line of code that creates a report object (here my_dataframe is a pandas DataFrame, a tabular data structure):

    import pandas as pd
    import sweetviz as sv

    # Read the data
    my_dataframe = pd.read_csv('../ImpartData/iris.csv')
    # Analyze the data
    my_report = sv.analyze(my_dataframe)
    # Generate the report
    my_report.show_html()

When execution finishes, an HTML report file (SWEETVIZ_REPORT.html by default) is generated in the current folder.

Open the HTML file, for example by double-clicking it, to see the analysis report.
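
If you do not have the iris CSV file at hand, the same steps can be tried on a small table built in memory. The sketch below is only an illustration: the column values are made up, the helper names make_demo_frame and build_report are hypothetical, and the output file name demo_report.html is an arbitrary choice. The Sweetviz import sits inside build_report so that the data-preparation part runs even before the library is installed.

```python
import pandas as pd


def make_demo_frame() -> pd.DataFrame:
    """Build a tiny, made-up table so that no CSV file is needed."""
    return pd.DataFrame(
        {
            "sepal_length": [5.1, 4.9, 6.3, 5.8, 7.1, 6.5],
            "sepal_width": [3.5, 3.0, 3.3, 2.7, 3.0, 2.8],
            "species": ["setosa", "setosa", "virginica",
                        "virginica", "virginica", "versicolor"],
        }
    )


def build_report(df: pd.DataFrame, out_path: str = "demo_report.html") -> str:
    """Analyze `df` with Sweetviz and write the HTML report to `out_path`."""
    import sweetviz as sv  # requires: pip install sweetviz

    report = sv.analyze(df)
    # open_browser=False only writes the file instead of opening a browser tab.
    report.show_html(filepath=out_path, open_browser=False)
    return out_path


demo = make_demo_frame()
print(demo.dtypes)
# With sweetviz installed: build_report(demo)
```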

Three functions are available for analyzing data: in addition to the analyze function used above, there are also compare and compare_intra.

The first is the analyze function:

    analyze(source: Union[pd.DataFrame, Tuple[pd.DataFrame, str]],
            target_feat: str = None,
            feat_cfg: FeatureConfig = None,
            pairwise_analysis: str = 'auto')

It can be configured with the following four parameters:

source: the pandas DataFrame to analyze, or a [DataFrame, "name"] pair that also gives it a display name.

target_feat: the name (a string) of the column to mark as the target feature.

feat_cfg: a FeatureConfig object listing features to skip or to force to a particular data type.

pairwise_analysis: correlation (pairwise) analysis can take a long time. The default is 'auto'; if the analysis takes longer than you can tolerate, set this parameter to 'on' or 'off' to decide explicitly whether correlations are analyzed.
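
To make the four parameters concrete, here is a hedged sketch of a call that sets all of them. The helper names pick_pairwise_mode and analyze_with_options are hypothetical, and the 50-column threshold is an arbitrary value chosen for the example, not a Sweetviz default.

```python
from typing import List, Optional


def pick_pairwise_mode(n_columns: int, max_columns: int = 50) -> str:
    """Return 'on' for narrow tables and 'off' for wide ones.

    The threshold is an illustrative assumption; pairwise (correlation)
    analysis gets slower as the number of columns grows.
    """
    return "on" if n_columns <= max_columns else "off"


def analyze_with_options(df, target: Optional[str], skip: List[str],
                         force_text: List[str]):
    """Call sv.analyze with target_feat, feat_cfg and pairwise_analysis set."""
    import sweetviz as sv  # requires: pip install sweetviz

    # FeatureConfig lists columns to skip or to force to a given type.
    feature_config = sv.FeatureConfig(skip=skip, force_text=force_text)
    return sv.analyze(
        df,
        target_feat=target,
        feat_cfg=feature_config,
        pairwise_analysis=pick_pairwise_mode(df.shape[1]),
    )


print(pick_pairwise_mode(12))   # on
print(pick_pairwise_mode(300))  # off
```

With a Titanic-style table, a call might look like analyze_with_options(my_dataframe, "Survived", skip=["PassengerId"], force_text=["Age"]); those column names are assumed for illustration.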

compare(): comparing two datasets

    my_report = sv.compare([my_dataframe, "Training Data"], [test_df, "Test Data"], "Survived", feature_config)

To compare two datasets, use the compare() function. It takes the same parameters as analyze(), except that a second parameter is inserted for the DataFrame to compare against. The [dataframe, "name"] format is recommended so that the base and comparison DataFrames are easy to tell apart in the report (for example, [my_df, "Train"] is better than my_df).
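
When only one table is available, a comparison can still be tried by splitting it in two. This is a minimal sketch, assuming a random 80/20 split is acceptable for your data; split_frame and compare_split are hypothetical helper names, and the toy column "x" is made up.

```python
import pandas as pd


def split_frame(df: pd.DataFrame, train_frac: float = 0.8, seed: int = 0):
    """Randomly split `df` into two non-overlapping parts."""
    train = df.sample(frac=train_frac, random_state=seed)
    test = df.drop(train.index)
    return train, test


def compare_split(df: pd.DataFrame, target=None):
    """Compare the two parts with sv.compare, naming each one."""
    import sweetviz as sv  # requires: pip install sweetviz

    train, test = split_frame(df)
    return sv.compare([train, "Train"], [test, "Test"], target)


toy = pd.DataFrame({"x": range(10)})
train_part, test_part = split_frame(toy)
print(len(train_part), len(test_part))  # 8 2
```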

compare_intra(): comparing subsets within one dataset

    my_report = sv.compare_intra(my_dataframe, my_dataframe["Sex"] == "male", ["Male", "Female"], feature_config)

Use this function to compare two subsets of the same dataset that are defined by the values in one column.

For example, it can compare the "male" and "female" rows under the gender column ("Sex" in the call above).
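
The second argument of compare_intra is a boolean condition with one True/False value per row: rows where it is True form the first named group and the remaining rows form the second. The sketch below builds such a condition on a made-up table (the values are illustrative, and compare_by_sex is a hypothetical helper):

```python
import pandas as pd

people = pd.DataFrame(
    {
        "Sex": ["male", "female", "female", "male", "female"],
        "Age": [22, 38, 26, 35, 54],
    }
)

# One True/False value per row; True rows form the first named group.
is_male = people["Sex"] == "male"
print(int(is_male.sum()), int((~is_male).sum()))  # 2 3


def compare_by_sex(df: pd.DataFrame):
    """Compare the male rows of `df` against all other rows."""
    import sweetviz as sv  # requires: pip install sweetviz

    return sv.compare_intra(df, df["Sex"] == "male", ["Male", "Female"])
```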

3. Adjust report layout

Once you have created your report object, simply pass it to one of the two show functions:

1. show_html(): show_html(filepath='SWEETVIZ_REPORT.html', open_browser=True, layout='widescreen', scale=None)

show_html(…) creates the HTML report and saves it to filepath, which by default is SWEETVIZ_REPORT.html in the current folder. It has the following parameters:

layout: either 'widescreen' or 'vertical'. In the widescreen layout, details appear on the right side of the screen as the mouse moves over each feature. The newer vertical layout (available from version 2.0) is more compact horizontally and expands each detail area when you click it.

scale: a floating-point number (for example scale=0.8) or None, used to scale the entire report.

open_browser: whether to open a web browser automatically to display the report. Set it to False if you do not need this.
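
As a sketch of how these options combine, the helper below checks the layout and scale values before forwarding them to show_html. The helper name save_report and its validation rules are assumptions made for this example; Sweetviz itself does not require such a wrapper. The report argument is any report object returned by analyze(), compare() or compare_intra().

```python
from typing import Optional

ALLOWED_LAYOUTS = ("widescreen", "vertical")


def save_report(report, path: str = "SWEETVIZ_REPORT.html",
                layout: str = "vertical", scale: Optional[float] = None,
                open_browser: bool = False) -> str:
    """Validate the options, then write the HTML report to `path`."""
    if layout not in ALLOWED_LAYOUTS:
        raise ValueError(f"layout must be one of {ALLOWED_LAYOUTS}")
    if scale is not None and scale <= 0:
        raise ValueError("scale must be a positive number or None")
    report.show_html(filepath=path, open_browser=open_browser,
                     layout=layout, scale=scale)
    return path
```

For example, save_report(my_report, "iris.html", scale=0.8) would write a vertical report at 80% scale without opening a browser.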

2. show_notebook(): show_notebook(w=None, h=None, scale=None, layout='widescreen', filepath=None)

It will embed an IFRAME element to display the report in the notebook (for example, Jupyter, Google Colab, and so on).

Because a notebook is usually a more constrained environment, it can be a good idea to set custom width, height, and scale values (w, h, scale). The options are:

w (width): sets the width of the report output window, either as a percentage string (w="100%") or as a number of pixels (w=900).

h (height): sets the height of the report output window, either as a number of pixels (h=700) or as h="full" to stretch the window to the height of all the features.

scale: same as for show_html above.

layout: same as for show_html above.

filepath: an optional path for an output HTML report.
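
A script that may run both inside and outside a notebook can choose between the two show functions at run time. The following is a minimal sketch under stated assumptions: checking for the ipykernel module is only a heuristic for detecting Jupyter-style environments, display_report is a hypothetical helper, and the sizes (w="100%", h=700, scale=0.8) are example values.

```python
import sys
from typing import Optional


def display_report(report, in_notebook: Optional[bool] = None) -> str:
    """Show `report` inline in a notebook, otherwise write an HTML file."""
    if in_notebook is None:
        # Heuristic: Jupyter-style kernels load the ipykernel module.
        in_notebook = "ipykernel" in sys.modules
    if in_notebook:
        report.show_notebook(w="100%", h=700, scale=0.8)
        return "notebook"
    report.show_html(open_browser=False)
    return "html"
```

Passing in_notebook=True or in_notebook=False explicitly overrides the heuristic when it guesses wrong.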

After reading this article, you should have a basic understanding of how to implement exploratory data analysis with Sweetviz in Python. Thank you for reading!
