
How to use Python to analyze the heater data of the whole network

Updated: 2025-04-03. From: SLTechnology News&Howtos (shulou)


Shulou (Shulou.com), 06/01

This article explains how to use Python to analyze heater product data from across the whole network. The method introduced here is simple, fast and practical.

Using Python to analyze heater data across the whole network

We use Python to collect Taobao product data for the search keywords radiator, heater and wall-mounted stove, and then analyze that data.

Read data

First import the acquired data.

    # import toolkits
    import numpy as np
    import pandas as pd
    from pyecharts.charts import Bar, Pie, Map, Page
    from pyecharts import options as opts
    import jieba

    # read data (file name as rendered in this article)
    df_all = pd.read_csv('../data/export data.csv')
    df_all.head()

    df_all.shape  # (13212, 7)

Data cleaning and finishing

Here we need to clean the dataset for subsequent analysis and visualization. The main work is as follows:

Delete duplicate records

goods_price column: extract the numeric value

purchase_num column: extract the numeric value

Calculate sales: sales_volume = goods_price * purchase_num

Delete redundant columns

The code is implemented as follows:

    df = df_all.copy()

    # remove duplicate rows
    df.drop_duplicates(inplace=True)
    df.shape  # (6849, 7)

    # keep only records whose purchase_num mentions payments.
    # 'people pay' and 'million' are this article's renderings of the site text;
    # the raw data likely uses the original Chinese strings (the 10000 multiplier
    # below suggests a "ten thousand" unit rather than a million).
    df = df[df['purchase_num'].str.contains('people pay', na=False)].copy()

    # goods_price column: extract the number and convert to float
    df['goods_price'] = df['goods_price'].str.extract(r'(\d+\.{0,1}\d*)', expand=False)
    df['goods_price'] = df['goods_price'].astype('float')

    # purchase_num column: extract the number and the unit multiplier
    df['num'] = df['purchase_num'].str.extract(r'(\d+\.{0,1}\d*)', expand=False)
    df['num'] = df['num'].astype('float')
    df['unit'] = [10000 if 'million' in i else 1 for i in df['purchase_num']]

    # calculate the number of purchases
    df['purchase_num'] = df['num'] * df['unit']

    # calculate sales
    df['sales_volume'] = df['goods_price'] * df['purchase_num']

    # extract the province field (first token of location)
    df['province_name'] = df['location'].astype('str').str.split(' ').apply(lambda x: x[0])

    # remove the redundant columns
    df.drop(['num', 'unit', 'detail_url'], axis=1, inplace=True)

    # reset index
    df = df.reset_index(drop=True)
    df.head()
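To make the two extraction steps easier to follow, here is a minimal sketch on a few made-up rows. The sample strings and the name `sample` are assumptions for illustration only; the `'million'` token simply mirrors the text as it appears in this article, and real Taobao fields are probably in Chinese.

```python
import pandas as pd

# Made-up sample rows (hypothetical); the tokens mirror the strings used in the article's code.
sample = pd.DataFrame({
    'goods_price': ['¥89.00', '¥1299', '¥45.5'],
    'purchase_num': ['1.5 million+ people pay', '300+ people pay', '20 people pay'],
})

# One capture group: digits, an optional dot, then optional digits
num_pattern = r'(\d+\.{0,1}\d*)'

# First number found in each string, converted to float
sample['goods_price'] = sample['goods_price'].str.extract(num_pattern, expand=False).astype('float')
num = sample['purchase_num'].str.extract(num_pattern, expand=False).astype('float')

# Multiplier: 10000 when the unit token is present, otherwise 1
unit = [10000 if 'million' in s else 1 for s in sample['purchase_num']]

sample['purchase_num'] = num * unit
sample['sales_volume'] = sample['goods_price'] * sample['purchase_num']
print(sample)
```

Passing `expand=False` makes `str.extract` return a Series rather than a one-column DataFrame, which keeps the column assignment unambiguous.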

You can see "heater" >

Next, we look at the top 10 stores by monthly sales.

Top 10 stores by monthly sales

Among the top 10 stores by sales volume, the Kerrilli flagship store ranks first, followed by Chunshang Electric Appliance franchise Store and SUNING in second and third place. Midea, TCL and other brands are also in the top ten.

    # calculate the top 10 stores
    shop_top10 = df.groupby('shop_name')['purchase_num'].sum().sort_values(ascending=False).head(10)

Top 10 provinces in China by sales volume

Where are these heaters made? The analysis shows that Zhejiang is the largest heater-producing province, ranking first in sales volume by place of origin. Guangdong comes second, followed by Hunan, Jiangsu and Shandong in third, fourth and fifth place.
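As a rough illustration of how such a ranking comes about, the sketch below derives a province from a location string and sums purchases per province. The rows are made up, and the assumption that `location` holds "province city" separated by a space is inferred from the split used in the cleaning step.

```python
import pandas as pd

# Made-up rows (hypothetical); 'location' is assumed to be "province city".
toy = pd.DataFrame({
    'location': ['Zhejiang Ningbo', 'Guangdong Foshan', 'Zhejiang Hangzhou', 'Hunan Changsha'],
    'purchase_num': [500.0, 300.0, 200.0, 100.0],
})

# Province is the first token of the location string
toy['province_name'] = toy['location'].astype('str').str.split(' ').str[0]

# Sum purchases per province, largest first
province_rank = (
    toy.groupby('province_name')['purchase_num']
    .sum()
    .sort_values(ascending=False)
)
print(province_rank.head(10))
```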

    # calculate the top 10 provinces by sales volume
    province_top10 = df.groupby('province_name')['purchase_num'].sum().sort_values(ascending=False).head(10)

Proportion of goods in different price ranges

How much do the heaters cost? The analysis shows that products priced under 100 yuan account for 34.76% of all goods, followed by products priced at 200-500 yuan, which account for 22.09%.
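The article does not show how the price ranges were computed. One plausible way, sketched here on made-up prices, is to bin `goods_price` with `pd.cut` and take normalized counts. The bin edges and labels are assumptions; only "under 100" and "200-500" are named in the text.

```python
import numpy as np
import pandas as pd

# Made-up prices in yuan (hypothetical)
prices = pd.Series([59.0, 89.0, 150.0, 320.0, 480.0, 799.0, 1500.0, 99.0])

# Assumed bin edges and labels, not taken from the article
bins = [0, 100, 200, 500, 1000, np.inf]
labels = ['0-100', '100-200', '200-500', '500-1000', '1000+']

# right=False makes each bin include its lower edge: [0, 100), [100, 200), ...
price_cut = pd.cut(prices, bins=bins, labels=labels, right=False)

# Share of products in each price range, in percent
share = (price_cut.value_counts(normalize=True).sort_index() * 100).round(2)
print(share)
```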

Percentage of sales in different price ranges

In terms of sales volume, heating products priced under 100 yuan and between 100 and 200 yuan also sell best, accounting for 37.49% and 35.92% of network-wide sales volume respectively.
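A share of sales volume per price range could be computed along the following lines: group `purchase_num` by price bin and divide by the total. This is a sketch on made-up rows, with assumed bin edges, not the author's actual code.

```python
import numpy as np
import pandas as pd

# Made-up rows (hypothetical): price in yuan and number of purchases
toy = pd.DataFrame({
    'goods_price': [59.0, 150.0, 320.0, 1200.0],
    'purchase_num': [400.0, 350.0, 200.0, 50.0],
})

# Assumed bin edges and labels, not taken from the article
bins = [0, 100, 200, 500, 1000, np.inf]
labels = ['0-100', '100-200', '200-500', '500-1000', '1000+']
toy['price_cut'] = pd.cut(toy['goods_price'], bins=bins, labels=labels, right=False)

# Total purchases per price range; observed=False keeps empty ranges as 0
volume_by_range = toy.groupby('price_cut', observed=False)['purchase_num'].sum()

# Each range's share of total sales volume, in percent
volume_share = (volume_by_range / volume_by_range.sum() * 100).round(2)
print(volume_share)
```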

At this point, you should have a deeper understanding of how to use Python to analyze network-wide heater data. Try it out in practice.
