
The simplest method of row-column conversion in Excel

2025-03-31 Update. From: SLTechnology News&Howtos (shulou)


Shulou (Shulou.com), 06/02

Problem description

Excel tables come up often at work. When editing one, you may find that it has too many columns and too few rows, so you want to swap its rows and columns to make it easier to print. Or you may want to do further statistical analysis and find the current layout inconvenient. In both cases, row-column conversion is needed.

The crosstab-style Excel table below is a common format that is easy to fill in and view:

  

However, for further statistical analysis this format is inconvenient, and the table needs to be converted into a detail table in the following format:

  

Obviously, doing this by hand is tedious. It is manageable when there is little data, but with a lot of data it takes so much time that it is simply a disaster.

Let's take this table as an example and go through several common solutions.

Method 1: Excel PivotTable

Excel supports row-column conversion through its PivotTable feature, as shown below:

  

Obviously, this is not the format we want. Excel's PivotTable can handle row-column conversion for simple layouts, but once the layout is slightly more complex, the result is often unsatisfactory.

Method 2: programming language

The problem can be solved by writing a program, and the idea is simple:

Load the Excel file and the required worksheet.

Read the row that holds the account names and convert it into a string array.

Read the column that holds the account codes and convert it into a string array.

Group by account code and build a table using the account name array.

Traverse all the detail values and fill each one into the row of the table that matches its account name.

This produces the required detail table.
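
The steps above can be sketched in plain Python without any Excel library. This is only an illustration under assumptions: the worksheet is taken to be already loaded into a list of rows, and the function name, account codes, column names, and numbers are all hypothetical.

```python
from typing import List, Sequence, Tuple


def wide_to_long(header: Sequence[str],
                 rows: Sequence[Sequence[object]],
                 n_key_cols: int = 2) -> List[Tuple[object, ...]]:
    """Turn one wide row per account into one long row per (account, column name)."""
    names = list(header[n_key_cols:])        # the column titles to unpivot
    long_rows = []
    for row in rows:
        keys = tuple(row[:n_key_cols])       # account code, account details
        for name, value in zip(names, row[n_key_cols:]):
            if value is None:                # skip empty cells
                continue
            long_rows.append(keys + (name, value))
    return long_rows


# Hypothetical sample data standing in for the loaded worksheet.
header = ["account code", "account details", "Beijing", "Chengdu"]
rows = [
    ["1001", "cash", 10.0, 20.0],
    ["1002", "bank deposit", 30.0, None],
]
long_table = wide_to_long(header, rows)
for record in long_table:
    print(record)
```

Each output record carries the two key values, the title of the column the value came from, and the value itself, which is the shape of the detail table described above.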

If this is implemented in Java, a rough estimate is at least 200 lines of code, and more development effort still if the result must be written to an Excel file. Excel does provide VBA, but anyone who has used it knows how troublesome it is. What about other languages? Python is said to handle row-column conversion (pivot is included in the pandas package) with far less code than Java. Let's try it:

    import pandas as pd

    # Read the first sheet; the header is on row index 3
    df = pd.read_excel("D:\\excel\\pandas.xlsx", sheet_name=0, header=3)
    cols = df.columns.values.tolist()  # get the header information
    # Remove the first two columns and keep only the columns to be converted
    cols.remove('account code')
    cols.remove('account details')
    # Build a list to collect the converted tables
    frames = []
    for col in cols:
        df1 = df.pivot_table(index=['account code', 'account details'], values=[col])
        df1.rename(columns={col: 'value'}, inplace=True)
        df1[3] = col
        # Append the converted data to frames
        frames.append(df1)
    # concat joins tables with the same fields end to end
    result = pd.concat(frames)
    result.rename(columns={3: 'account name'}, inplace=True)
    result.to_excel('D:\\excel\\pandas_n.xlsx', sheet_name='account details')
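
As a point of comparison, pandas also provides `DataFrame.melt`, which unpivots wide columns in one call without a loop. The sketch below is not the author's script: it uses a small hypothetical in-memory table in place of the Excel file, and the column names and values are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the worksheet that would be read with pd.read_excel.
df = pd.DataFrame({
    "account code": ["1001", "1002"],
    "account details": ["cash", "bank deposit"],
    "Beijing": [10.0, 30.0],
    "Chengdu": [20.0, 40.0],
})

# Keep the two key columns and stack every other column into name/value pairs.
result = df.melt(
    id_vars=["account code", "account details"],
    var_name="account name",
    value_name="value",
)
print(result)
```

Unlike `pivot_table`, `melt` does not aggregate, so duplicate key rows are kept as they are rather than averaged.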

The result is not bad, and it really is simple. This is the Excel file generated by Python:

  

However, there is one small problem. The format of this Excel file is a little unusual: to use pandas' pivot on it, the "account code" and "account details" titles must first be moved into the same row as the headers of the columns to be converted, as shown below. Otherwise the code needs special handling for them. Since it is only one row, doing it by hand is easier than writing code.

  

In any case, this small flaw in handling details does not detract from Python's convenience. Python lives up to its reputation: even with a loop, the whole script is only about 10 lines.

Could it be any easier?

Yes, it could!

Method 3: aggregator (esProc) programming

Let's take a look at the aggregator code:

        A                                                                      B
    1   =file("D:/excel/details.xlsx").importxls@t(…)                          // load the Excel file
    2   >A1.delete(A1.select(_1=="account code"))                              // clear the non-data row
    3   >A1.rename(_1:account code,_2:account details)                         // rename column 1 to account code and column 2 to account details
    4   =A1.fname().to(3,).concat(",")                                         // join the column names from column 3 onward into a comma-separated string
    5   =A1.pivot@r(account code,account details;account name,value;${A4})     // convert columns to rows with the pivot function
    6   =file("D:/excel/details2.xlsx").exportxls@t(A5;"account details")      // save the converted data as an xlsx file

The code is simple. Let's list the intermediate result of each step:

A1: Loads worksheet 1 of the Excel file and extracts the specified range of data (rows 3 to 40). The option @t means the first row is treated as the title row. The resulting table is:

  

A2: Deletes the non-data rows

  

A3: Renames the columns

  

A4: Joins the column names from column 3 onward into a single string, separated by ","

  

A5: The pivot function converts the columns listed in A4 into rows and places their data in the "value" column.

  

A6: Saves the converted data as a separate xlsx file

  

The aggregator script has only six lines and contains no loops or conditionals, and unlike the Python version it does not require editing this seemingly "messy" table by hand first. Python works column by column, looping over the table repeatedly in an "N"-shaped zigzag, while the aggregator works row by row and converts everything in a single pass. For this kind of data handling, the aggregator is more polished in its treatment of details and closer to how users habitually work. Its development environment is also easy to debug: the intermediate result of every step is visible, which makes errors easy to spot and development more convenient. For this routine data processing task, the aggregator comes out ahead of Python.

Advantage summary

Here is my own experience of the differences between Python and the aggregator on this problem:

1. Multi-column conversion

For scenarios where multiple columns must be converted and stacked into one "long" column, Python has to build a table for each data column, add a column recording the current column name, append the table to a list, and finally concatenate the list, dropping the repeated headers of all but the first table.

The aggregator is simpler: you just list the columns you want to convert. Compared with the more tedious Python approach, the aggregator saves at least a few brain cells.

2. Name change

In the Python script, you cannot simply change the name of a column to be converted, for example with cols[0] = 'Tianjin'. After that change, Python can no longer find the column, because the data still carries the name from before the modification. It would be nice if it at least raised a clear exception.

In the aggregator, by contrast, renaming is convenient, for example: >A1.rename(_1:account code, _2:account details, 4 Chengdu: Chengdu)
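
On the Python side, the rename does work when it is applied to the DataFrame itself rather than only to the list of names. The following is a minimal sketch with hypothetical data, not part of the original script: editing `cols` alone leaves the DataFrame untouched, while `DataFrame.rename` keeps the data and the name list in sync.

```python
import pandas as pd

# Hypothetical one-row table standing in for the worksheet.
df = pd.DataFrame({
    "account code": ["1001"],
    "account details": ["cash"],
    "Chengdu": [20.0],
})

cols = df.columns.tolist()[2:]
cols[0] = "Tianjin"                               # changes only the Python list
list_only_rename_works = "Tianjin" in df.columns  # the DataFrame still says "Chengdu"

# Renaming the DataFrame column keeps the data and the name list in sync.
renamed = df.rename(columns={"Chengdu": "Tianjin"})
cols = renamed.columns.tolist()[2:]
print(list_only_rename_works, cols)
```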

3. Empty header values

When Python reads the header row of the Excel table, the first two cells are empty (they correspond to "account code" and "account details" in the original file), and these empty values drop out of the header list cols. This pitfall is well hidden; embarrassingly, I did not notice that two columns had gone missing.

The aggregator recognizes the empty titles and automatically assigns them the names _1 and _2, so the two columns can still be found when processing the data.
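
A similar safeguard can be sketched in plain Python: before using a header row, replace every empty title with a positional placeholder. The `_1`, `_2` naming imitates what the aggregator is described as doing above; the function name and the sample header are hypothetical.

```python
from typing import List, Optional, Sequence


def fill_blank_headers(header: Sequence[Optional[str]]) -> List[str]:
    """Replace empty or missing titles with _<position>, counting from 1."""
    filled = []
    for position, title in enumerate(header, start=1):
        if title is None or not str(title).strip():
            filled.append(f"_{position}")
        else:
            filled.append(str(title).strip())
    return filled


# Hypothetical header row whose first two cells are empty.
raw_header = [None, "", "Beijing", "Chengdu"]
header = fill_blank_headers(raw_header)
print(header)
```

With every column guaranteed a name, later steps can refer to the key columns explicitly instead of silently losing them.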

4. Grid-style programming

The aggregator uses a grid in which each cell, such as A1, is automatically bound to the object computed in it. This is convenient and distinctive; Python can only sigh.
