The implementation of Hive Row-to-column

2025-11-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to implement row-to-column conversion (pivoting) in Hive. The methods introduced here are simple and practical, so interested readers may wish to follow along.

Preface

Traditional relational databases such as Oracle (11g and later) and SQL Server (2005 and later) provide a PIVOT function for row-to-column conversion. This article describes two ways to achieve the same result in Hive.

Traditional database mode

This approach follows the way row-to-column conversion was written in Oracle or SQL Server before they supported the PIVOT function. The syntax is essentially unchanged; the query simply runs in Hive.
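The pattern in question is conditional aggregation: one MAX(CASE ...) expression per target column, grouped by id. As a minimal sketch for trying it without a Hive cluster, the snippet below runs a query of the same shape in SQLite through Python's standard sqlite3 module. SQLite is only a stand-in chosen for illustration, and the row for id 2 is a hypothetical addition (not part of the article's test data) that shows how a missing key turns into NULL.

```python
import sqlite3

# Same shape as the Hive query for this approach. SQLite is a stand-in engine
# (an assumption for illustration), not Hive itself.
QUERY = """
with testtable as (
    select 1 as id, 'k1' as key, 123 as value
    union all
    select 1, 'k2', 124
    union all
    select 2, 'k1', 200   -- hypothetical row, added only for this sketch
)
select id,
       max(case when key = 'k1' then value else null end) as k1,
       max(case when key = 'k2' then value else null end) as k2
from testtable
group by id
order by id
"""


def run_pivot():
    """Run the conditional-aggregation pivot and return rows as (id, k1, k2)."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_pivot():
        print(row)
```

Each id produces one output row. Because id 2 has no 'k2' row, its CASE expression never yields a value and MAX returns NULL, which Python shows as None.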

with testtable as (
    select 1 id, 'k1' key, 123 value
    union all
    select 1, 'k2' key, 124 value
    -- further rows (for example, for id 2) can be added with more "union all" branches
)
select id,
    max(case when key = 'k1' then value else null end) k1,
    max(case when key = 'k2' then value else null end) k2
from testtable
group by id

Map mode

The idea of this approach is to concatenate each key that needs to be pivoted with its value, assemble the resulting pairs into a Hive map of key-value data, and then read each output column from the map by key. The specific SQL is as follows:

with testtable as (
    select 1 id, 'k1' key, 123 value
    union all
    select 1, 'k2' key, 124 value
)
select id, kv['k1'] k1, kv['k2'] k2
from (
    select id,
        str_to_map(concat_ws(',', collect_set(concat(key, '-', cast(value as string)))), ',', '-') kv
    from testtable
    group by id
) t

Summary

Both methods achieve row-to-column conversion. The traditional approach is easy to understand, but it becomes tedious to write when key has many distinct values, because each value needs its own CASE expression. The map approach is more concise. However, the map approach collects all of a group's data together before building the map, so its memory requirements are higher, whereas the traditional approach reduces directly through aggregate functions and can compute its result as the rows are read.
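The memory difference can be pictured with a small emulation. The following is a minimal Python sketch of what each approach computes, not Hive's actual execution; the function names and the row for id 2 are hypothetical and serve only as illustration. The streaming version keeps a single running value per (id, key), while the map version first gathers every "key-value" string for an id and only then parses them.

```python
from collections import defaultdict

# (id, key, value) rows. The first two match the article's test data;
# the third is a hypothetical row added for illustration.
ROWS = [(1, "k1", 123), (1, "k2", 124), (2, "k1", 200)]


def pivot_streaming(rows, keys):
    """Traditional approach: one running max per (id, key), updated row by row."""
    out = defaultdict(lambda: dict.fromkeys(keys))
    for row_id, key, value in rows:
        if key in keys:
            current = out[row_id][key]
            out[row_id][key] = value if current is None else max(current, value)
    return dict(out)


def pivot_via_map(rows, keys):
    """Map approach: collect all 'key-value' strings per id, then parse them."""
    collected = defaultdict(set)              # plays the role of collect_set
    for row_id, key, value in rows:
        collected[row_id].add(f"{key}-{value}")
    result = {}
    for row_id, pairs in collected.items():
        joined = ",".join(sorted(pairs))      # plays the role of concat_ws(',', ...)
        # plays the role of str_to_map(joined, ',', '-')
        kv = dict(pair.split("-", 1) for pair in joined.split(","))
        result[row_id] = {k: kv.get(k) for k in keys}
    return result


if __name__ == "__main__":
    print(pivot_streaming(ROWS, ["k1", "k2"]))
    print(pivot_via_map(ROWS, ["k1", "k2"]))
```

Two details are worth noting. The map version returns values as strings, which mirrors str_to_map producing a map of string to string in Hive, so a cast may be needed downstream. It also depends on the delimiters: a key or value that itself contains ',' or '-' would be split incorrectly.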
