Updated 2025-04-07. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use Python charts to visualize the sales data of a cosmetics company. The method is simple, fast, and practical. Let's walk through it.
Business analysis process
1. Scenario (diagnosing the status quo)
Object: user; sales
Focus: find the growth factors that affect sales
Goal: find the problem & propose a solution
2. Requirements breakdown
Analyze sales trends and find goods or regions that affect revenue growth
Monthly sales trend chart (overall)
Comparison of commodity sales (level 1, level 2, find the lowest and highest)
Comparison of regional sales (drill down: regions, provinces, find the lowest and highest)
Explore the sales situation of different commodities and put forward strategic suggestions for the commodity sales of enterprises.
The proportion of sales of each product in different months
Product correlation analysis
Analysis of user characteristics, purchase frequency, retention rate, etc.
Purchase frequency distribution
Repurchase rate (number of repeat buyers, i.e. users who purchased on at least two different days, divided by the total number of users)
Cohort analysis (by month)
3. Code implementation
Get data (excel)
The data covers a cosmetics company's daily order details from January 2019 to September 2019, plus the company's product information. It consists of two sheets: a sales order table and a commodity information table. The sales order table holds the details of each order; one order corresponds to one sale, and an order can contain multiple items.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
import warnings

# SimHei lets matplotlib render Chinese labels
mpl.rcParams['font.family'] = 'SimHei'
warnings.filterwarnings("ignore")

# read_excel has no `encoding` parameter in current pandas, so it is omitted here
data = pd.read_excel('C:/Users/cherich/Desktop/daily chemical.xlsx')
data.head()

# Second sheet: the commodity information table
data_info = pd.read_excel('C:/Users/cherich/Desktop/daily chemical.xlsx', sheet_name='commodity information table')
data_info
Data cleaning and processing
data = data.dropna()

# Order quantity ends with a unit character. The exact character is lost in this
# copy of the article, so any trailing non-digit character is stripped instead.
data['order quantity'] = data['order quantity'].apply(lambda x: str(x)[:-1] if not str(x)[-1].isdigit() else x)
data['order quantity'] = data['order quantity'].astype(int)

# Order unit price ends with the character '元' (yuan)
data['order unit price'] = data['order unit price'].apply(lambda x: str(x)[:-1] if str(x)[-1] == '元' else x)
data['order unit price'] = data['order unit price'].astype(int)

# Some dates contain a special character, e.g. 2019#3#11
def proess_date(df):
    pos = str(df).find('#')
    if pos != -1:
        df = str(df).split('#')
        return df[0] + '-' + df[1] + '-' + df[2]
    else:
        return df

data['order date'] = data['order date'].apply(proess_date)
# Other dates use the characters '年' (year) and '月' (month) as separators
data['order date'] = data['order date'].apply(lambda x: str(x).replace('年', '-').replace('月', '-') if '年' in str(x) else x)
data['order date'] = pd.to_datetime(data['order date'])
# data.info()

# Drop duplicate rows
data = data[data.duplicated() == False]
data['province'].nunique()

# Extract the two-digit month from the date
data['month'] = data['order date'].apply(lambda x: str(x).split('-')[1])
data
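The cleaning steps above can also be written as small reusable helpers. This is a minimal sketch, not the author's code: the function names `strip_unit` and `normalize_date` are hypothetical, and the sample inputs (a price such as '120元', dates such as '2019#3#11' or '2019年3月11日') are assumed forms of the raw values.

```python
import re
from datetime import date

def strip_unit(value):
    """Return the leading integer of a value such as '120元' (hypothetical helper)."""
    match = re.match(r"\s*(\d+)", str(value))
    if match is None:
        raise ValueError(f"no leading digits in {value!r}")
    return int(match.group(1))

def normalize_date(value):
    """Parse '2019#3#11', '2019年3月11日' or '2019-3-11' into a date (hypothetical helper)."""
    # Take the first three digit runs as year, month, day, whatever the separator is
    parts = re.findall(r"\d+", str(value))
    if len(parts) < 3:
        raise ValueError(f"cannot parse a date from {value!r}")
    year, month, day = (int(p) for p in parts[:3])
    return date(year, month, day)

print(strip_unit("120元"))
print(normalize_date("2019#3#11"))
```

Either helper could be passed to `Series.apply` in place of the lambdas. Raising on unparseable input, rather than returning the value unchanged, makes bad rows visible before `astype(int)` or `pd.to_datetime` fails later.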
Data visualization
# Merge the data from the two tables
total_data = pd.merge(data, data_info, on='item number', how='left')
total_data
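A left merge silently keeps orders whose item number has no match in the commodity table, leaving NaN in the product columns. The sketch below shows one way to check for that using the `indicator` and `validate` options of `pd.merge`. The toy frames and their values are made up for illustration; only the column names follow the article.

```python
import pandas as pd

# Toy stand-ins for the order table and the commodity information table
orders = pd.DataFrame({
    "order code": ["D001", "D002", "D003"],
    "item number": ["X1", "X2", "X9"],  # X9 has no match below
    "amount": [100, 250, 80],
})
items = pd.DataFrame({
    "item number": ["X1", "X2"],
    "commodity subcategory": ["facial mask", "toner"],
})

# validate raises if an item number appears twice in the commodity table;
# indicator adds a `_merge` column marking where each row came from
merged = pd.merge(orders, items, on="item number", how="left",
                  validate="many_to_one", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
print(len(merged), "rows,", len(unmatched), "without product info")
```

Inspecting `unmatched` before calling `dropna()` shows how many orders would be discarded for lack of product information.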
groups = data.groupby('month')
x = [each[0] for each in groups]
y = [each[1].amount.sum() for each in groups]
z = [each[1].amount.count() for each in groups]
# Monthly averages over the nine months of data
money_mean = data.amount.sum() / 9
order_mean = data.amount.count() / 9

plt.figure(figsize=(18, 10), dpi=80)
plt.subplot(221)
plt.plot(x, y, linewidth=2)
plt.axvspan('07', '08', color='#EE7621', alpha=0.3)
plt.axhline(money_mean, color='#EE7621', linestyle='--', linewidth=1)
plt.title("Monthly sales trend", color='#4A708B', fontsize=24)
plt.ylabel("amount / million", fontsize=16)

plt.subplot(222)
plt.plot(x, z, linewidth=2, color='#EE7621')
plt.axvline('07', color='#4A708B', linestyle='--', linewidth=1)
plt.axhline(order_mean, color='#4A708B', linestyle='--', linewidth=1)
plt.title("Monthly order trend", color='#4A708B', fontsize=24)
plt.ylabel("orders", fontsize=16)
plt.show()
The chart shows that, overall, sales and order counts rose sharply from April and stayed above the monthly average; from August they trended downward, back to roughly the average level.
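The monthly totals, order counts, and the comparison against the average can also be computed in one grouped aggregation. This is a sketch on made-up numbers, assuming the same `month` and `amount` columns; it takes the mean over the months actually present instead of dividing by a hard-coded 9.

```python
import pandas as pd

# Toy order rows; values are illustrative only
orders = pd.DataFrame({
    "month": ["01", "01", "02", "02", "02"],
    "amount": [100, 200, 50, 150, 200],
})

# Named aggregation: one row per month with total sales and order count
monthly = orders.groupby("month")["amount"].agg(total="sum", orders="count")
# Flag the months whose sales exceed the average monthly sales
monthly["above_average"] = monthly["total"] > monthly["total"].mean()
print(monthly)
```

`monthly.index`, `monthly["total"]`, and `monthly["orders"]` can then be passed to `plt.plot` in place of the `x`, `y`, and `z` lists.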
groups_category = total_data.groupby(['month', 'commodity category'])
category1 = []
category2 = []
for i, j in groups_category:
    # i is the (month, category) key, j the matching rows
    if i[1] == 'makeup':
        category1.append(j.amount.sum())
    else:
        category2.append(j.amount.sum())

labels = x
xticks = np.arange(len(labels))
width = 0.5
p = np.arange(len(labels))
fig, ax = plt.subplots(figsize=(18, 8))
rects1 = ax.bar(p - width/2, category1, width, label='makeup', color='#FFEC8B')
rects2 = ax.bar(p + width/2, category2, width, label='skincare products', color='#4A708B')
ax.set_ylabel('sales / 100 million')
ax.set_title('Monthly sales comparison of skincare products and makeup (top-level category)')
ax.set_xticks(xticks)
ax.set_xticklabels(labels)
ax.legend()
plt.show()
The chart shows that skincare products sell significantly better than makeup in every month. May through August is the peak season for skincare products, whereas makeup sales change little over the period.
groups_categorys = total_data.groupby('commodity subcategory')
x = [each[0] for each in groups_categorys]
y = [each[1].amount.sum() for each in groups_categorys]
fig = plt.figure(figsize=(18, 8), dpi=80)
plt.title('Comparison of sales by subcategory', color='#4A708B', fontsize=24)
plt.ylabel('sales (yuan)', fontsize=15)
# The original color list is garbled in this copy; only these three hex values
# are recoverable, so they are cycled across the bars.
colors = ['#6699cc', '#4A708B', '#FFEC8B']
for i, group_name in enumerate(groups_categorys):
    # Bar height is the total amount for the subcategory
    lin1 = plt.bar(group_name[0], group_name[1].amount.sum(), width=0.8, color=colors[i % len(colors)])
    for rect in lin1:
        height = rect.get_height()
        plt.text(rect.get_x() + rect.get_width() / 2, height + 1, int(height), ha="center", fontsize=12)
plt.xticks(fontsize=15)
plt.grid()
plt.show()
The chart shows that facial masks rank first in sales, followed by face cream and toner. Loose powder and eye shadow have the lowest sales.
total_data = total_data.dropna()
# Fix mistyped region names. The raw values are Chinese: "Men's District" (男区) is a
# typo for South District (南区). The second replace fixed a spelling variant of
# West District that the translation renders identically on both sides.
total_data['region'] = total_data['region'].apply(lambda x: str(x).replace("Men's District", 'South District').replace('West District', 'West District'))

# category_names is defined here so that this block runs on its own
category_names = list(total_data['commodity subcategory'].unique())
groups_area = total_data.groupby(['region', 'commodity subcategory'])
results = {}
for i, j in groups_area:
    money = int(j.amount.sum())
    if i[0] not in results:
        # Start every region with zero sales for each subcategory
        results[i[0]] = {cate: 0 for cate in category_names}
    # The source hard-codes 'lipstick' for the first group; the group key is used instead
    results[i[0]][i[1]] = money
results = {key_data: list(values_data.values()) for key_data, values_data in results.items()}

def survey1(results, category_names):
    labels = list(results.keys())
    data = np.array(list(results.values()))
    data_cum = data.cumsum(axis=1)
    category_colors = plt.get_cmap('RdYlGn')(np.linspace(0.15, 0.85, data.shape[1]))
    fig, ax = plt.subplots(figsize=(25, 8))
    ax.invert_yaxis()
    ax.xaxis.set_visible(False)
    ax.set_xlim(0, np.sum(data, axis=1).max())
    for i, (colname, color) in enumerate(zip(category_names, category_colors)):
        widths = data[:, i]
        starts = data_cum[:, i] - widths
        ax.barh(labels, widths, left=starts, height=0.5, label=colname, color=color)
        xcenters = starts + widths / 2
        r, g, b, _ = color
        text_color = 'white' if r * g * b < 0.5 else 'darkgrey'
        for y, (x, c) in enumerate(zip(xcenters, widths)):
            ax.text(x, y, str(int(c)), ha='center', va='center', color=text_color)
    ax.legend(ncol=len(category_names), bbox_to_anchor=(0, 1), loc='lower left', fontsize='small')
    return fig, ax

survey1(results, category_names)
plt.show()
The chart shows that the Eastern region accounts for about 35% of the market, while the Western region has the smallest share.

# Original column names in this block: 所在省份 (province), 商品小类 (commodity subcategory), 金额 (amount)
area_names = list(total_data['commodity subcategory'].unique())
groups_priv = total_data.groupby(['province', 'commodity subcategory'])
results = {}
for i, j in groups_priv:
    money = int(j.amount.sum())
    if i[0] not in results:
        results[i[0]] = {cate: 0 for cate in area_names}
    results[i[0]][i[1]] = money
results = {key_data: list(values_data.values()) for key_data, values_data in results.items()}

def survey2(results, category_names):
    labels = list(results.keys())
    data = np.array(list(results.values()))
    data_cum = data.cumsum(axis=1)
    category_colors = plt.get_cmap('RdYlGn')(np.linspace(0.15, 0.85, data.shape[1]))
    fig, ax = plt.subplots(figsize=(25, 20))
    ax.invert_yaxis()
    ax.xaxis.set_visible(False)
    ax.set_xlim(0, np.sum(data, axis=1).max())
    for i, (colname, color) in enumerate(zip(category_names, category_colors)):
        widths = data[:, i]
        starts = data_cum[:, i] - widths
        ax.barh(labels, widths, left=starts, height=0.5, label=colname, color=color)
    ax.legend(ncol=len(category_names), bbox_to_anchor=(0, 1), loc='lower left', fontsize='small')
    return fig, ax

survey2(results, area_names)
plt.show()
The chart shows that Jiangsu ranks first in sales, followed by Guangdong, while Ningxia, Inner Mongolia, and Hainan have the lowest sales.
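The nested-dictionary loops that prepare data for the stacked bar charts can be replaced by a single pivot table. This is an alternative sketch rather than the article's approach; the toy rows are invented, and only the column names follow the article.

```python
import pandas as pd

# Toy sales rows; values are illustrative only
sales = pd.DataFrame({
    "region": ["East", "East", "West", "East", "West"],
    "commodity subcategory": ["facial mask", "toner", "facial mask",
                              "facial mask", "eye shadow"],
    "amount": [100, 50, 30, 20, 10],
})

# One row per region, one column per subcategory, missing combinations filled with 0
table = sales.pivot_table(index="region", columns="commodity subcategory",
                          values="amount", aggfunc="sum", fill_value=0)

# Same shape the survey functions expect: {label: [value per category]}
results = {region: row.tolist() for region, row in table.iterrows()}
category_names = list(table.columns)
print(category_names)
print(results)
```

Because the pivot fills absent region and subcategory pairs itself, no per-key initialization is needed, and the column order of `table` always matches `category_names`.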
import numpy as np
import matplotlib.pyplot as plt

category_names = list(total_data['commodity subcategory'].unique())
groups_small_category = total_data.groupby(['month', 'commodity subcategory'])
results = {}
for i, j in groups_small_category:
    money = int(j.amount.sum())
    if i[0] not in results:
        # Start every month with zero sales for each subcategory
        results[i[0]] = {cate: 0 for cate in category_names}
    # The source hard-codes 'lipstick' for the first group; the group key is used instead
    results[i[0]][i[1]] = money
results = {key_data: list(values_data.values()) for key_data, values_data in results.items()}

def survey(results, category_names):
    labels = list(results.keys())
    data = np.array(list(results.values()))
    data_cum = data.cumsum(axis=1)
    category_colors = plt.get_cmap('RdYlGn')(np.linspace(0.15, 0.85, data.shape[1]))
    fig, ax = plt.subplots(figsize=(25, 8))
    ax.invert_yaxis()
    ax.xaxis.set_visible(False)
    ax.set_xlim(0, np.sum(data, axis=1).max())
    for i, (colname, color) in enumerate(zip(category_names, category_colors)):
        widths = data[:, i]
        starts = data_cum[:, i] - widths
        ax.barh(labels, widths, left=starts, height=0.5, label=colname, color=color)
        # Value labels are commented out in the source for this chart
    ax.legend(ncol=len(category_names), bbox_to_anchor=(0, 1), loc='lower left', fontsize='small')
    return fig, ax

survey(results, category_names)
plt.show()
The chart shows that eye cream, toner, and facial masks are in greatest demand from April through August, while foundation, sunscreen, primer, mascara, and loose powder are in greatest demand from January through March.

# Original column names in this block: 客户编码 (customer code), 订单编码 (order code)
data_user_buy = total_data.groupby('customer code')['order code'].count()
data_user_buy
plt.figure(figsize=(10, 4), dpi=80)
plt.hist(data_user_buy, color='#FFEC8B')
plt.title('Distribution of user purchase counts', fontsize=16)
plt.xlabel('Number of purchases')
plt.ylabel('Number of users')
plt.show()
The chart shows that most users made between 10 and 35 purchases, and very few made more than 80.
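# Number of distinct purchase days per customer
date_rebuy = total_data.groupby('customer code')['order date'].apply(lambda x: len(x.unique())).rename('rebuy_count')
date_rebuy
# Repurchase rate: share of customers who bought on at least two different days
print('Repurchase rate:', round(date_rebuy[date_rebuy >= 2].count() / date_rebuy.count(), 4))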
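The repurchase rate defined earlier (customers with purchases on at least two different days, divided by all customers) can be checked on a small example. This sketch uses invented customers and dates, and `nunique()` in place of the `apply` with `len(x.unique())`.

```python
import pandas as pd

# Toy orders: A buys on two different days, B and C on one day each
orders = pd.DataFrame({
    "customer code": ["A", "A", "A", "B", "C", "C"],
    "order date": pd.to_datetime(["2019-01-05", "2019-01-05", "2019-02-10",
                                  "2019-03-01", "2019-04-02", "2019-04-02"]),
})

# Distinct purchase days per customer
days_per_customer = orders.groupby("customer code")["order date"].nunique()
# The mean of a boolean Series is the share of True values
repurchase_rate = (days_per_customer >= 2).mean()
print("Repurchase rate:", round(repurchase_rate, 4))
```

Note that customer C has two order rows on the same day and is therefore not counted as a repeat buyer, which matches the day-based definition.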
# Year-month tag, e.g. '2019-03'
total_data['time tag'] = total_data['order date'].astype(str).str[:7]
# Drop an outlier date
total_data = total_data[total_data['time tag'] != '2050-06']
total_data['time tag'].value_counts().sort_index()
total_data = total_data.sort_values(by='time tag')
month_lst = total_data['time tag'].unique()
final = pd.DataFrame()
final

# Loop over the time tags
for i in range(len(month_lst) - 1):
    # Build a list as long as the number of months, to keep a uniform format below
    count = [0] * len(month_lst)
    # Filter the current month's orders and group them by customer
    target_month = total_data.loc[total_data['time tag'] == month_lst[i], :]
    target_users = target_month.groupby('customer code')['amount'].sum().reset_index()
    # For the first month, skip the check (there is no history to test new customers against)
    if i == 0:
        new_target_users = target_month.groupby('customer code')['amount'].sum().reset_index()
    else:
        # Otherwise, find the earlier orders
        history = total_data.loc[total_data['time tag'].isin(month_lst[:i]), :]
        # Keep only the customers who do not appear in the history
        new_target_users = target_users.loc[target_users['customer code'].isin(history['customer code']) == False, :]
    # The number of new customers in the current month goes in the first slot
    count[0] = len(new_target_users)
    # Loop over the following months and compute retention
    for j, ct in zip(range(i + 1, len(month_lst)), range(1, len(month_lst))):
        # Orders in the following month
        next_month = total_data.loc[total_data['time tag'] == month_lst[j], :]
        next_users = next_month.groupby('customer code')['amount'].sum().reset_index()
        # Number of this cohort's customers who bought again in that month
        isin = new_target_users['customer code'].isin(next_users['customer code']).sum()
        count[ct] = isin
    # Transpose into one row per cohort
    result = pd.DataFrame({month_lst[i]: count}).T
    # Append to the running table
    final = pd.concat([final, result])

final.columns = ['new this month', '+1 month', '+2 months', '+3 months', '+4 months', '+5 months', '+6 months', '+7 months', '+8 months']
# Convert counts to retention rates, keeping the raw cohort size in the first column
result = final.divide(final['new this month'], axis=0).iloc[:]
result['new this month'] = final['new this month']
result.round(2)
That concludes the walkthrough. You should now have a clearer picture of how to use Python charts to visualize a cosmetics company's sales data; the best way to consolidate it is to try it out in practice.