Talking about the realization of transposition function 12/23 Update SLTechnology News&Howtos

Talking about the realization of transposition function

2025-12-23 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/03 Report--

The scope of application of the static transpose function pivot and unpivot provided by SQL is very limited. In order to realize some complex transpose functions with SQL, we often encounter the problem of too complex sentences, and lack of a standard solution. On the other hand, the SPL language of aggregator can make up for the deficiency of SQL because of its flexible syntax and rich function library.

1. Row transfer column 1. Pivot of the database

Pivot is not a feature that has existed from the beginning, and it is only supported by newer versions of products from major large database manufacturers, such as Oralce 11g or SqlServer2005.

As you can guess from the name, this function is to convert the row to the column, that is, to use the value in the row as the column name. However, the rows and columns of the database are different from ordinary tables, and it is impossible to directly correct the X axis and Y axis. The reason is that the column of the database is unique (that is, the column name cannot be repeated), while the dynamic data is stored in the row, which can be repeated if not as the primary key. Therefore, the practical application of pivot basically follows after the grouping aggregation operation, after each row of data in the column (usually the dimension) used for transposition is processed into a non-repeated value, and then the value of each row is expanded as the column name.

From the perspective of specific applications, the role of pivot is to subdivide the aggregation results of a column into more specific column aggregation results, in order to achieve a more intuitive visual effect.

Is the concept boring and not easy to understand? Let's use a specific example to illustrate, for example, there is a student score table in the Oracle database:

If you want to count the highest scores for each class, the traditional practice is:

Select class, subject, max (grade) highest score from student transcript group by class, subject

The above results can be said to be very bad: first, in the "class" column, the repetition of Class one and Class two makes it easy for people to misread the line; second, in the "subjects" column, Chinese, math and English are all put together. However, in fact, the highest scores of these three subjects do not have any comparative significance.

In fact, we should prefer to see the following results:

In this result, the three subjects in this column are separated as a separate column, which not only reduces useless repetition, but also defines the mutual independence of the highest scores of each subject, which seems to be much clearer.

It can be said that pivot was born for this purpose, and in order to achieve the above results, the query is now written as follows:

Select * from (select class, subject, grade from student transcript) pivot (max (grade) for subject in (highest score in maths as, highest score in English as, highest score in Chinese as))

2. Pivot of the aggregator

Some students may ask, since there is already pivot in the database, why do I still need the pivot of the aggregator?

The answer is: first of all, not all databases provide pivot; secondly, even if all databases provide pivot, but if you want to have a pivot after summarizing the data from multiple databases? That still requires the pivot of the aggregator.

Let's take a look at how to use the pivot of the aggregator

one

= connect ("orcl")

two

= A1.query ("select class, subject, max (grade) highest score from student transcript group by class, subject")

three

= A2.pivot (Class; subject, highest score; "Mathematics": "Mathematics highest score", "English": "English highest score", "Chinese": "Chinese highest score")

Code description:

A1: the first step to connect to the database

A2: the second step is to extract data for preprocessing (this step can be further expanded to do complex calculations such as aggregation or aggregation. For specific methods, please refer to the relevant article)

A3: the third step is to achieve the column conversion function of pivot and present it, and its effect is exactly the same as that of Oracle's pivot.

3. Other meanings of pivot

In addition to the data rendering requirements, the calculation method between columns can also be used after the rows have been converted into columns. Because the properties of columns are different from those of rows, some inter-column calculations are cumbersome to implement between rows. For example, when the school evaluates the class performance, the weights of math, foreign language and language are 0.6, 0.3 and 0.1 respectively, and the evaluation index is calculated by the average score of the three subjects of the two classes:

one

= connect ("orcl")

two

= A1.query ("select class, subject, avg (grade) average score from student transcript group by class, subject")

three

= A2.pivot (Class; subject, average score; "Mathematics", "English", "Chinese")

four

= A3.new (Class, (Mathematics * 0.6 + English * 0.3 + Chinese * 0.1): assessment)

Calculation results:

If the above calculation is to be realized between lines, it will be a lot of trouble. Interested students can try it themselves.

2. Column wrapping 1. Unpivot of database

If there is a row-to-column, naturally there will be a column-to-row. Or take Oracle as an example, which provides a column wrap function called unpivot.

What is the business significance of the function of listing and passing rows? Let's take a look at a personal report card like this:

If you want to know which subject everyone is best at (that is, which subject each person scores highest), it is convenient to use the max function for interrow calculation, while using intercolumn calculation is relatively cumbersome. Then the unpivot function comes in handy:

With T1 as (select * from personal score form unpivot (score for subject in (Mathematics, English, Chinese)), T2 as (name of select, name of max (Grade) Best score from T1 group by name) select T1. Name, T1. Subjects are good at subjects, T2. The best grade is from T1 join T2 on T1. Name = T2. Name and T1. Score = T2. Best score

2. Pivot@r of the aggregator

So, what if the database you are using is not Oracle? Do you still need to study the transpose syntax details of the new database? What if the database does not support transpose statements? Do you need to use case when or subqueries to achieve similar functions indirectly?

Don't be so upset! Because we have an aggregator:

one

= connect ("orcl")

two

= A1.query ("select * from personal score sheet")

three

= A2.pivot@r (name; subject, score; Mathematics: "Mathematics", English: "English", Chinese: "Chinese")

four

= A3.group (name). (~ .top @ 1 (- 1; grade))

five

= A4.new (name, subject: good subject, grade: grade in this subject)

The result of the calculation is the same (there may be a slight difference in sorting):

In addition, it should be noted that the unpivot of the database is not exactly the inverse operation of pivot, because aggregate functions are often included in pivot statements, and the aggregate calculation itself is irreversible, that is, unpivot cannot restore the aggregated results of pivot back to the original detailed data. However, because the pivot of the aggregator does not participate in the aggregate calculation (the aggregate computing has been performed separately before the pivot is executed), the pivot@r of the aggregator can be said to be the inverse of the pivot operation of the aggregator.

III. Two-way transposition

Sometimes more complex transpose operations are required, such as having a student score sheet

We want to get a student score sheet with a structure similar to the following (meaning to look at the changing trend of a student's performance in a subject):

Students

Subject

First semester

The second semester

Zhang San

Mathematics

ninety-nine

eighty-seven

Here, first of all, to merge math, Chinese and other columns into subject columns, we need the operation of column switching, while to split the semester column into semester one and semester two columns, we need the operation of row conversion.

Considering that the structure of the data table is that the number of rows is much greater than the number of columns, so we can advance rows and columns, and then row-to-column. Because the raw data of this table can correspond one to one after column conversion to the data in the table before transformation (no need to calculate aggregation), it is obviously more convenient to use the pivot@r and pivot functions of the aggregator.

one

= connect ("orcl")

two

= A1.query ("select * from Student score sheet")

three

= A2.pivot@r (student, semester; subject, grade)

four

= A3.pivot (student, subject; semester, grade)

Running result

IV. Dynamic column transposition

The examples given above are static transpositions, requiring that the forms and data processed are "well-behaved". But in the actual business, there are always unruly outliers, and I believe that there are a lot of them, so it is a bit inadequate to use SQL, whether it is pivot/unpivot, case when, or anything else. What are we going to do? At this point, the advantages of the aggregator are reflected:

For example, there is a personal income table that records income.

But we want to get a table similar to the following structure

Name income source 1 income amount 1 income source 2 income amount 2.

Zhang San's salary is 8000, stock speculation is 6000.

We are not sure about the number of columns, or even the column names, after the rows are transferred. Instead of using the pivot function that is only suitable for static transpose, you need to use the method of dynamic transposition. The SPL language of the aggregator is much more flexible than the SQL language in terms of dynamic programming:

one

= connect ("orcl")

two

= A1.query ("select * from personal income Table") .group (name)

three

= A2.max (~ .len ())

four

= create (name, ${A3. ("Source of income" + string (~) + ", amount of income" + string (~)) .concat@c ()})

five

For A2

= A5. Name | A5.conj ([source of income, amount of income])

six

A4.record (B5)

The results are as follows:

Fifth, transpose simultaneous column-to-column calculation

Suppose I have a weekly price list of vegetables

And I want to calculate the weekly price trend of all kinds of vegetables, which includes four states: up, down, steady and initial (Monday's value).

The designed table structure is roughly as follows

Vegetables Monday, Tuesday, Wednesday, Thursday...

The empty value of eggplant rises, falls and rises.

The empty value of cucumbers rises.

……

Although the transpose that needs to be used is of a static type, the calculation between columns needs to be implemented during transpose, which is very troublesome for SQL. However, it would be much easier to use the SPL language of the more flexible aggregator:

one

= connect ("orcl")

["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

two

= A1.query ("select * from vegetable Price list")

three

= create (vegetables, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday)

four

For A2.group (vegetable)

= A4.align (B1, week)

five

= B4. (if (# = = 1: "initial", price > price [- 1]: "rising", price

six

> A3.record (A4. Vegetables | B5)

The weekly price trend of vegetables is as follows

VI. Summary

Compared with pivot and unpivot provided by SQL, the transpose function provided by SPL language of aggregator is more flexible and adaptable, which can meet a variety of complex transposition requirements.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.