Many novices are not very clear about which Spark tips are worth knowing. To help solve this problem, the editor explains ten of them in detail below; anyone who needs this can come and learn, and hopefully you will gain something.
1. Set the maximum message size (Akka frame size)
def main(args: Array[String]) {
  System.setProperty("spark.akka.frameSize", "1024")
}
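In newer code the same setting is usually passed through SparkConf instead of a JVM system property; here is a minimal sketch, assuming the value is in MB and must be set before the SparkContext is created (the app name is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// Raise the maximum Akka frame size (in MB) for driver/executor messages.
val conf = new SparkConf()
  .setAppName("FrameSizeDemo")  // placeholder app name
  .set("spark.akka.frameSize", "1024")
val sc = new SparkContext(conf)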
2. Set the queue when running on YARN
val conf = new SparkConf().setAppName("WriteParquet")
conf.set("spark.yarn.queue", "wz111")
val sc = new SparkContext(conf)
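The queue can also be chosen at submit time rather than in code; a sketch, assuming the same YARN queue wz111 and placeholder class and jar names:

spark-submit --master yarn-client --queue wz111 --class main.scala.week2.WriteParquet writeParquet.jar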
3. Allocate resources with YARN at run time and set the --num-executors parameter
nohup /home/SASadm/spark-1.4.1-bin-hadoop2.4/bin/spark-submit --name mergePartition --class main.scala.week2.mergePartition --num-executors 30 --master yarn mergePartition.jar > server.log 2>&1 &
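--num-executors is usually tuned together with the memory and cores given to each executor; a hedged sketch of the same command (the 4g and 2 below are placeholder sizes, not values from the original):

nohup /home/SASadm/spark-1.4.1-bin-hadoop2.4/bin/spark-submit --name mergePartition --class main.scala.week2.mergePartition --master yarn --num-executors 30 --executor-memory 4g --executor-cores 2 mergePartition.jar > server.log 2>&1 &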
4. Read Parquet files written by Impala and handle String columns
sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")
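With this flag set, binary columns written by Impala come back as String values instead of byte arrays; a minimal sketch, with the HDFS path elided as in the other examples:

sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")
val impalaData = sqlContext.read.parquet("hdfs://")
impalaData.printSchema()  // string columns now appear as string rather than binary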
5. Writing a Parquet file
case class ParquetFormat(usr_id: BigInt, install_ids: String)
val appRdd = sc.textFile("hdfs://").map(_.split("\t")).map(r => ParquetFormat(r(0).toLong, r(1)))
sqlContext.createDataFrame(appRdd).repartition(1).write.parquet("hdfs://")
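The case class field names (usr_id, install_ids) become the Parquet column names. If you would rather name the columns explicitly, toDF is an alternative; a sketch under the same assumptions:

import sqlContext.implicits._
// Name the output columns explicitly instead of via case class fields.
val df = sc.textFile("hdfs://")
  .map(_.split("\t"))
  .map(r => (r(0).toLong, r(1)))
  .toDF("usr_id", "install_ids")
df.repartition(1).write.parquet("hdfs://")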
6. Reading a Parquet file
val parquetFile = sqlContext.read.parquet("hdfs://")
parquetFile.registerTempTable("install_running")
val data = sqlContext.sql("select user_id, install_ids from install_running")
data.map(t => "user_id:" + t(0) + " install_ids:" + t(1)).collect().foreach(println)
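The temporary table is only needed for SQL; the same query can go through the DataFrame API directly. A minimal sketch:

val parquetFile = sqlContext.read.parquet("hdfs://")
parquetFile.select("user_id", "install_ids")
  .collect()
  .foreach(row => println("user_id:" + row(0) + " install_ids:" + row(1)))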
7. When writing a file, gather all results into a single output file
repartition(1)
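repartition(1) shuffles everything into one partition, so the job writes a single output file; coalesce(1) achieves the same with less shuffling when only reducing the partition count. A sketch, assuming df is a DataFrame like those above:

// Full shuffle down to one partition, then one output file:
df.repartition(1).write.parquet("hdfs://")
// Cheaper when only shrinking the partition count:
df.coalesce(1).write.parquet("hdfs://")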
8. If an RDD is reused, cache it with cache()
cache()
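cache() marks the RDD to be kept in memory after the first action computes it, so later actions reuse it instead of recomputing from scratch; a minimal sketch:

val lines = sc.textFile("hdfs://").cache()       // marked for caching
val total = lines.count()                        // first action computes and caches
val nonEmpty = lines.filter(_.nonEmpty).count()  // served from the cache
lines.unpersist()                                // release the memory when done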
9. Add dependency jars to spark-shell
spark-1.4.1-bin-hadoop2.4/bin/spark-shell --master local[4] --jars code.jar
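Classes from code.jar can then be used directly inside the shell; the package and class below are hypothetical placeholders, not names from the original:

// Inside spark-shell, assuming code.jar ships a hypothetical helper:
import com.example.code.TextCleaner
val cleaned = sc.textFile("hdfs://").map(TextCleaner.clean)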
10. Use spark-shell in YARN mode with a queue
spark-1.4.1-bin-hadoop2.4/bin/spark-shell --master yarn-client --queue wz111

Was reading the above content helpful to you? If you want to know more about the relevant knowledge or read more related articles, please follow the industry information channel, and thank you for your support.