How to use hbase-rdd to expand your own module on Spark Core

2025-01-17 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article shows how to use hbase-rdd to extend Spark Core with your own module. The material is straightforward and clearly organized, and I hope it helps answer your questions as we work through it together.

hbase-rdd is a third-party open source module built on SparkContext for writing to, deleting from, and querying HBase. Currently, the latest version is 0.7.1. By default, hbase-rdd relies on an implicit method when it operates on HBase:

implicit def stringToBytes(s: String): Array[Byte] = { Bytes.toBytes(s) }

This method converts the RDD key to a byte array b, which is then used to construct the HBase Put(b) that carries the rowkey. Each row of the RDD is then stored in HBase.
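To make the implicit step concrete, here is a minimal sketch that runs without HBase on the classpath. FakePut is a hypothetical stand-in for HBase's Put, and the conversion uses UTF-8 encoding, which is also what Bytes.toBytes(String) does. The point is only that the compiler inserts stringToBytes wherever a String is passed but an Array[Byte] is expected.

```scala
import java.nio.charset.StandardCharsets
import scala.language.implicitConversions

object ImplicitKeyDemo {
  // Stand-in for hbase-rdd's stringToBytes (assumption: UTF-8, as Bytes.toBytes(String) uses).
  implicit def stringToBytes(s: String): Array[Byte] =
    s.getBytes(StandardCharsets.UTF_8)

  // Hypothetical stand-in for HBase's Put: it only accepts a byte-array rowkey.
  final case class FakePut(row: Array[Byte])

  def main(args: Array[String]): Unit = {
    // A String is passed where Array[Byte] is expected,
    // so the compiler silently rewrites this to FakePut(stringToBytes("device-001")).
    val put = FakePut("device-001")
    println(put.row.length)
  }
}
```

Because the conversion is invisible at the call site, changing how the rowkey is built means changing the code that constructs the Put, not the caller.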

In the data computation for our trajectory-drawing project, we had to design the HBase rowkey so as to minimize its storage cost. hbase-rdd ultimately stores the rowkey as a byte array, but we want to assemble the rowkey our own way, as MD5(imei) + dateTime. The methods hbase-rdd provides by default do not meet this storage requirement, so the source code has to be modified. Inside the toHbase method there is a convert method that processes each row of the RDD: it uses the RDD key to create a Put(Bytes.toBytes(key)) object, which supplies the rowkey for the later write to HBase.
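The storage argument can be checked with simple arithmetic. The sketch below compares a binary MD5(imei) + dateTime rowkey with the same information stored as text. The sample dateTime format is an assumption, since the article only implies that dateTime is a numeric string.

```scala
import java.security.MessageDigest

object RowkeySize {
  // Length in bytes of a raw MD5 digest.
  val md5Bytes: Int = MessageDigest.getInstance("MD5").getDigestLength
  // Length in bytes of a Long.
  val longBytes: Int = java.lang.Long.BYTES

  // Binary layout: raw digest followed by the timestamp as a Long.
  val binaryKeyLength: Int = md5Bytes + longBytes

  // Text layout: hex digest (two characters per byte) followed by the timestamp digits.
  def textKeyLength(dateTime: String): Int = md5Bytes * 2 + dateTime.length

  def main(args: Array[String]): Unit = {
    val sampleDateTime = "20250117093000" // assumed format, for illustration only
    println(s"binary: $binaryKeyLength bytes, text: ${textKeyLength(sampleDateTime)} bytes")
  }
}
```

Under these assumptions the binary key takes 24 bytes against 46 for the text form, and HBase stores the rowkey with every cell, so the saving is repeated for each column written.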

We modified the implementation of the convert function. By default, hbase-rdd uses the stringToBytes implicit function to convert the RDD's String key into a byte array. The modified version bypasses stringToBytes and builds the byte array directly:

protected def convert(id: String, values: Map[String, Map[String, A]], put: PutAdder[A]) = {
  val strs = id.split(",")
  val imei = strs(0)
  val dateTime = strs(1)
  val b1 = MD5Utils.computeMD5Hash(imei.getBytes()) // MD5 of the IMEI
  val b2 = Bytes.toBytes(dateTime.toLong) // dateTime as a long
  val key = b1 ++ b2
  val p = new Put(key) // modified: rowkey built directly, without stringToBytes
  var empty = true
  for { (family, content)
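The key-building part of convert can be tried on its own. The following is a sketch using only the JDK, not the author's code: buildRowkey is a hypothetical helper, MessageDigest stands in for MD5Utils.computeMD5Hash, and a big-endian ByteBuffer stands in for Bytes.toBytes(Long), which also writes big-endian. It assumes the RDD key has the form "imei,dateTime", as in the modified convert above.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object RowkeyBuilder {
  // Builds MD5(imei) ++ dateTime-as-Long from an id of the form "imei,dateTime".
  def buildRowkey(id: String): Array[Byte] = {
    val parts = id.split(",")
    require(parts.length == 2, s"expected 'imei,dateTime' but got: $id")
    val imei = parts(0)
    val dateTime = parts(1)

    // 16-byte MD5 digest of the IMEI (stand-in for MD5Utils.computeMD5Hash).
    val md5 = MessageDigest.getInstance("MD5")
      .digest(imei.getBytes(StandardCharsets.UTF_8))

    // 8-byte big-endian encoding of the timestamp (stand-in for Bytes.toBytes(Long)).
    val ts = ByteBuffer.allocate(java.lang.Long.BYTES).putLong(dateTime.toLong).array()

    md5 ++ ts
  }

  def main(args: Array[String]): Unit = {
    // Sample values are made up for illustration.
    val key = buildRowkey("860000000000001,20250117093000")
    println(key.length)
  }
}
```

The require check fails fast on malformed keys, which the original indexing into the split result would only surface as an ArrayIndexOutOfBoundsException inside a Spark task.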
