
Using SparkSQL to Operate on Hive

2025-02-28 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

What we are going to do next (the program will be packaged into a jar and run on the cluster):

(1) Write a Spark program that creates a table in Hive and imports data into it.

(2) Query the data in Hive.

(3) Save the query results to MySQL.

Code:

import java.util.Properties

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode, SparkSession}

object SparkSqlTest {
  def main(args: Array[String]): Unit = {
    // silence the noisy framework logs
    Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN)
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
    Logger.getLogger("org.project-spark").setLevel(Level.WARN)
    // build the programming entry point
    val conf: SparkConf = new SparkConf()
    conf.setAppName("SparkSqlTest")
    val spark: SparkSession = SparkSession.builder()
      .config(conf)
      .enableHiveSupport() // this line enables Hive support
      .getOrCreate()
    // create the SQLContext and SparkContext objects
    val sqlContext: SQLContext = spark.sqlContext
    val sc: SparkContext = spark.sparkContext
    // create the database
    var sql =
      """
        |create database if not exists `test`
      """.stripMargin
    spark.sql(sql)
    // use the database we just created
    sql =
      """
        |use `test`
      """.stripMargin
    spark.sql(sql)
    // create the table
    sql =
      """
        |create table if not exists `test`.`teacher_basic`(
        |name string,
        |age int,
        |married boolean,
        |children int
        |) row format delimited
        |fields terminated by '|'
      """.stripMargin
    spark.sql(sql)
    // load the data
    sql =
      """
        |load data local inpath 'file:///home/hadoop/teacher_info.txt'
        |into table `test`.`teacher_basic`
      """.stripMargin
    spark.sql(sql)
    // run the query
    sql =
      """
        |select * from `test`.`teacher_basic`
      """.stripMargin
    val hiveDF = spark.sql(sql)
    // write the query result to MySQL
    val url = "jdbc:mysql://localhost:3306/test"
    val table_name = "teacher_basic"
    val pro = new Properties()
    pro.put("password", "123456")
    pro.put("user", "root")
    hiveDF.write.mode(SaveMode.Append).jdbc(url, table_name, pro)
  }
}
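For reference, each line of teacher_info.txt has to match the table schema (name, age, married, children), separated by the '|' delimiter declared in the DDL above. A couple of hypothetical rows (the names and values are made up for illustration):

zhangsan|28|true|1
lisi|35|false|0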

For how to package the program as a jar and run it on the cluster, see: https://blog.51cto.com/14048416/2337760
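To build such a jar, the Spark SQL and Hive modules need to be on the compile classpath, and the MySQL JDBC driver must be available at runtime. A minimal build.sbt sketch; the version numbers are assumptions, so match them to your cluster:

name := "SparkSqlTest"
version := "1.0-SNAPSHOT"
scalaVersion := "2.11.12" // assumed; use the Scala version your Spark build expects

// Spark modules are "provided": the cluster supplies these jars at runtime
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % "2.3.4" % "provided",
  "org.apache.spark" %% "spark-hive" % "2.3.4" % "provided",
  // JDBC driver used by hiveDF.write.jdbc(...) to reach MySQL
  "mysql" % "mysql-connector-java" % "5.1.47"
)

If the MySQL driver is not bundled into the jar, it can alternatively be handed to spark-submit with --jars.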

Job submission shell:

spark-submit \
--class com.zy.sql.SparkSqlTest \
--master yarn \
--deploy-mode cluster \
--driver-memory 512m \
--executor-memory 512m \
--total-executor-cores 1 \
file:////home/hadoop/SparkSqlTest-1.0-SNAPSHOT.jar

Then I waited expectantly for it to succeed; unfortunately, the program terminated unexpectedly about halfway through:

I looked at the printed log.

I dug through a lot of material online, and it all said the same thing: the Hive version is too high. What? That told me nothing!

Then I thought about it. The Spark program operates on Hive tables in the cluster, so Spark needs to be integrated with Hive. I went online to find out how Spark integrates with Hive; in short, Spark shares Hive's metastore so that it can access Hive's metadata.

Specific operations:

① Add the following to Hive's hive-site.xml (this is the metastore service that step ② will start):

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hadoop01:9083</value>
</property>

② On the appropriate node, start the metastore process configured in hive-site.xml:

nohup hive --service metastore 1>/home/hadoop/logs/hive_thriftserver.log 2>&1 &

PS: note that nohup runs the command in the background with all of its output redirected, so after issuing it, be sure to check whether the command really succeeded:

Run jps and check whether the corresponding process has started (the metastore typically shows up as RunJar). If it has not, the startup failed; the most likely cause is that the parent directory /home/hadoop/logs was never created. Create that directory, start the service again, and then check that it is up!

③ Copy hive-site.xml to $SPARK_HOME/conf (note: copy it to every node).

④ Test whether the integration works: run spark-sql. If it starts correctly and can access Hive's tables, Spark has been integrated with Hive successfully!
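As an aside, instead of copying hive-site.xml to every node, the metastore URI can also be set programmatically when the SparkSession is built. A sketch, assuming the metastore started in step ② is listening at thrift://hadoop01:9083:

import org.apache.spark.sql.SparkSession

// point Spark at Hive's metastore without a hive-site.xml on the classpath
val spark = SparkSession.builder()
  .appName("SparkSqlTest")
  .config("hive.metastore.uris", "thrift://hadoop01:9083")
  .enableHiveSupport()
  .getOrCreate()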

After that, I ran the original program again; this time no error was reported, and it ran to completion successfully!

I could hardly believe it, so I checked the MySQL table again:
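Checking with the mysql client works; a quick sanity check can also be done from Spark itself by reading the table back over JDBC. A sketch reusing the url, table_name, and pro values from the program above:

// read the freshly written table back from MySQL and show a few rows
val checkDF = spark.read.jdbc(url, table_name, pro)
checkDF.show(10)
println(s"rows in MySQL: ${checkDF.count()}")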

That confirmed the program had succeeded!
