Error 1: Background
After starting spark-shell, querying a Hive table reports an error.
$SPARK_HOME/bin/spark-shell

scala> spark.sql("select * from student.student").show()

Log:
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
    at org.apache.hadoop.hive.metastore.RetryingMetaSto...
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

Reason:
When Spark accesses the MySQL database that stores Hive's metastore, the connection fails because Spark's classpath does not contain the mysql-connector jar, i.e. the MySQL JDBC driver.
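Before changing any configuration, you can confirm the diagnosis with a quick classpath check in spark-shell. A minimal sketch (the printed messages are mine, not Spark output):

// Quick check in spark-shell: is the MySQL driver visible?
// com.mysql.jdbc.Driver is the class named in the error message above.
try {
  Class.forName("com.mysql.jdbc.Driver")
  println("MySQL driver found on the classpath")
} catch {
  case _: ClassNotFoundException => println("MySQL driver NOT on the classpath")
}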
Solution:
Some people say: then just cp the jar into $SPARK_HOME/jars. Sorry, that is absolutely not acceptable in production. Not every Spark program uses the MySQL driver, so we should specify --jars when submitting the job; multiple jars are separated by commas (my MySQL version is 5.1.73).
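For example, the syntax with two jars would look like this (the paths here are hypothetical, only to show the comma-separated form):

$SPARK_HOME/bin/spark-shell --jars /path/to/mysql-connector-java-5.1.47.jar,/path/to/another-dependency.jar

And here is the actual run: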
[hadoop@hadoop003 spark]$ spark-shell --jars ~/softwares/mysql-connector-java-5.1.47.jar
19/05/21 08:02:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://hadoop003:4040
Spark context available as 'sc' (master = local[*], app id = local-1558440185051).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.2
      /_/

Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.sql("select * from student.student").show()
19/05/21 08:04:42 WARN DataNucleus.General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/home/hadoop/app/spark-2.4.2-bin-hadoop-2.6.0-cdh6.7.0/jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/home/hadoop/app/spark/jars/datanucleus-core-3.2.10.jar."
19/05/21 08:04:42 WARN DataNucleus.General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. (same warning for datanucleus-api-jdo-3.2.6.jar)
19/05/21 08:04:42 WARN DataNucleus.General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. (same warning for datanucleus-rdbms-3.2.9.jar)
19/05/21 08:04:45 ERROR metastore.ObjectStore: Version information found in metastore differs 1.1.0 from expected schema version 1.2.0. Schema verififcation is disabled hive.metastore.schema.verification so setting version.
19/05/21 08:04:46 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
+------+--------+--------------+--------------------+
|stu_id|stu_name| stu_phone_num|           stu_email|
+------+--------+--------------+--------------------+
|     1|   Burke|1-300-746-8446|ullamcorper.velit...|
|     2|   Kamal|1-668-571-5046|pede.Suspendisse@...|
|     3|    Olga|    1-956-1686|Aenean.eget.metus...|
|     4|   Belle|1-246-894-6340|vitae.aliquet.nec...|
|     5|  Trevor|1-300-527-4967|dapibus.id@acturp...|
|     6|  Laurel|1-691-379-9921|adipiscing@consec...|
|     7|    Sara|1-608-140-1995|Donec.nibh@enimEt...|
|     8|  Kaseem|1-881-586-2689|cursus.et.magna@e...|
|     9|     Lev|1-916-367-5608|Vivamus.nisi@ipsu...|
|    10|    Maya|1-271-683-2698|accumsan.convalli...|
|    11|     Emi|    1-467-1337|        est@nunc.com|
|    12|   Caleb|1-683-212-0896|Suspendisse@Quisq...|
|    13|Florence|1-603-575-2444|sit.amet.dapibus@...|
|    14|   Anika|              |euismod@ligulaeli...|
|    15|   Tarik|1-398-171-2268|turpis@felisorci.com|
|    16|   Amena|1-878-250-3129|lorem.luctus.ut@s...|
|    17| Blossom|1-154-406-9596|Nunc.commodo.auct...|
|    18|     Guy|1-869-521-3230|senectus.et.netus...|
|    19| Malachi|1-608-637-2772|Proin.mi.Aliquam@...|
|    20|  Edward|1-711-710-6552|lectus@aliquetlib...|
+------+--------+--------------+--------------------+
only showing top 20 rows
Solved
At this point, we can also check the Spark UI to confirm that the MySQL driver jar was added to the job. It shows the jar was added successfully.
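Besides the UI, you can also verify from inside the shell: sc.listJars() returns the jars that were shipped with the application (this call exists since Spark 2.0; the exact output depends on your environment):

scala> sc.listJars().foreach(println)

If the mysql-connector jar appears in the output, the --jars option took effect.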
Error 2: Background
An error occurred while starting spark-sql
spark-sql log:
19/05/21 08:54:14 ERROR Datastore.Schema: Failed initialising database.
Unable to open a test connection to the given database. JDBC url = jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true, username = root. Terminating connection pool (set lazyInit to true if you expect to start your database after your app).
Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true
...
Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true

Reason:
There is no MySQL driver on the driver side's classpath, so the Hive metastore database cannot be connected.
Solution:
Looking at my startup command, the path of the MySQL driver jar was already specified with --jars. And according to spark-sql --help, --jars is supposed to add the specified jars to both the driver and executor classpaths:

--jars JARS    Comma-separated list of jars to include on the driver and executor classpaths.

But the driver's classpath still does not have the MySQL driver. Why? I am not sure yet. (One plausible explanation: in client mode, jars passed via --jars are added to the driver through a separate classloader after the JVM has already started, while JDBC's DriverManager only accepts drivers visible to the primordial classloader, which is exactly what --driver-class-path sets.) So I tried additionally specifying the driver's classpath at startup:
spark-sql --jars ~/softwares/mysql-connector-java-5.1.47.jar --driver-class-path ~/softwares/mysql-connector-java-5.1.47.jar
19/05/21 09:19:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/05/21 09:19:31 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class: org.apache.hadoop.hive.metastore.ObjectStore
19/05/21 09:19:36 INFO spark.SparkEnv: Registering OutputCommitCoordinator
19/05/21 09:19:37 INFO util.log: Logging initialized @8235ms
19/05/21 09:19:37 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
19/05/21 09:19:37 INFO server.Server: Started @8471ms
19/05/21 09:19:37 INFO server.AbstractConnector: Started ServerConnector@55f0e536{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
19/05/21 09:19:37 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
19/05/21 09:19:38 INFO internal.SharedState: Warehouse path is 'file:/home/hadoop/spark-warehouse'.
19/05/21 09:19:38 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16e1219f{/SQL,null,AVAILABLE,@Spark}
19/05/21 09:19:38 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@13f40d71{/SQL/json,null,AVAILABLE,@Spark}
19/05/21 09:19:38 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@34f7b44f{/SQL/execution,null,AVAILABLE,@Spark}
19/05/21 09:19:38 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5403907{/SQL/execution/json,null,AVAILABLE,@Spark}
19/05/21 09:19:38 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7911cc15{/static/sql,null,AVAILABLE,@Spark}
19/05/21 09:19:38 INFO hive.HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
19/05/21 09:19:38 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is file:/home/hadoop/spark-warehouse
19/05/21 09:19:38 INFO hive.metastore: Mestastore configuration hive.metastore.warehouse.dir changed from /user/hive/warehouse to file:/home/hadoop/spark-warehouse
19/05/21 09:19:38 INFO metastore.HiveMetaStore: 0: Shutting down the object store...
19/05/21 09:19:38 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Shutting down the object store...
19/05/21 09:19:38 INFO metastore.HiveMetaStore: 0: Metastore shutdown complete.
19/05/21 09:19:38 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Metastore shutdown complete.
19/05/21 09:19:38 INFO metastore.HiveMetaStore: 0: get_database: default
19/05/21 09:19:38 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_database: default
19/05/21 09:19:38 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class: org.apache.hadoop.hive.metastore.ObjectStore
19/05/21 09:19:38 INFO metastore.ObjectStore: ObjectStore, initialize called
19/05/21 09:19:38 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
19/05/21 09:19:38 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
19/05/21 09:19:38 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:19:39 INFO state.StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
Spark master: local[*], Application Id: local-1558444777553
19/05/21 09:19:40 INFO thriftserver.SparkSQLCLIDriver: Spark master: local[*], Application Id: local-1558444777553
spark-sql (default)>
It starts successfully. It seems the official description is not necessarily accurate; things still have to be tested yourself. A small pitfall.
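If you do not want to type --driver-class-path at every startup, a possible alternative is to set the equivalent properties once in conf/spark-defaults.conf. A sketch, assuming the jar sits under /home/hadoop/softwares (adjust the path for your machine):

spark.driver.extraClassPath    /home/hadoop/softwares/mysql-connector-java-5.1.47.jar
spark.executor.extraClassPath  /home/hadoop/softwares/mysql-connector-java-5.1.47.jar

--driver-class-path is just the command-line form of spark.driver.extraClassPath, so the effect is the same.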
Error 3: Background
When converting an RDD to a DataFrame in IDEA, an error occurred even though there was nothing wrong with the code (the same code succeeded when tested in spark-shell).
The code is as follows:
import org.apache.spark.sql.SparkSession

object SparkSessionDemo {
  /**
   * Create SparkSession test
   *
   * @param args
   */
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("SparkSessionDemo")
      .getOrCreate()

    import spark.implicits._

    // Read the cleaned log and split each line on tabs
    val logs = spark.sparkContext
      .textFile("file:///D:/cleaned.log")
      .map(_.split("\t"))

    // Map each split line onto the case class, then convert to a DataFrame
    val logsDF = logs.map(x =>
      CleanedLog(x(0), x(1), x(2), x(3), x(4), x(5), 1L, x(6), x(7))
    ).toDF()

    logsDF.show()
  }

  case class CleanedLog(cdn: String, region: String, level: String, date: String,
                        ip: String, domain: String, pv: Long, url: String, traffic: String)
}

Console error log:
There were two errors, but for the same reason; to save space, I put them together.
I.
Exception in thread "main" java.lang.NoClassDefFoundError: org/codehaus/janino/InternalCompilerException
...
Caused by: java.lang.ClassNotFoundException: org.codehaus.janino.InternalCompilerException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 33 more

II.
Exception in thread "main" java.lang.NoSuchMethodError: org.codehaus.commons.compiler.Location.<init>(Ljava/lang/String;II)V

Reason:
Jar package conflict
The janino and commons-compiler packages required by spark-sql 2.4.2 are different versions from those required by hive 1.1.0. Because the hive version is lower and was added to pom.xml before spark-sql, Maven resolves the conflict to the jars pulled in by hive by default; you can inspect this as shown below.
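To see which versions actually win on the classpath, ask Maven directly; both janino and commons-compiler live under the org.codehaus.janino groupId:

mvn dependency:tree -Dincludes=org.codehaus.janino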
Solution:
Use the Maven Helper plugin to add exclusion tags for the lower versions of the conflicting janino and commons-compiler jars. For details, see my other blog post on Maven Helper, the must-have IDEA plugin for resolving jar conflicts in Maven projects.
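For reference, what Maven Helper generates is roughly the following exclusion in pom.xml. A sketch: I am assuming here that the old janino comes in through the hive-exec dependency; adjust it to whatever your own dependency tree shows:

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>1.1.0</version>
    <exclusions>
        <!-- exclude the old janino so the versions required by spark-sql 2.4.2 win -->
        <exclusion>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
        </exclusion>
    </exclusions>
</dependency>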
After it runs successfully, the output looks like this:
+-----+------+-----+----------+---------------+--------------------+---+--------------------+-------+
|  cdn|region|level|      date|             ip|              domain| pv|                 url|traffic|
+-----+------+-----+----------+---------------+--------------------+---+--------------------+-------+
|baidu|    CN|    E|2018050102|    121.77.52.1|          github.com|  1|http://github.com...|   8732|
|baidu|    CN|    E|2018050102|  182.90.237.16|           yooku.com|  1|http://yooku.com/...|  39912|
|baidu|    CN|    E|2018050101|  171.14.31.240|movieshow2000.edu...|  1|http://movieshow2...|  37164|
|baidu|    CN|    E|2018050103|  222.20.40.164|movieshow2000.edu...|  1|http://movieshow2...|  24963|
|baidu|    CN|    E|2018050102| 171.14.203.164|          github.com|  1|http://github.com...|  24881|
|baidu|    CN|    E|2018050103| 61.237.150.187|movieshow2000.edu...|  1|http://movieshow2...|  90965|
|baidu|    CN|    E|2018050103| 123.233.221.56|movieshow2000.edu...|  1|http://movieshow2...|  85190|
|baidu|    CN|    E|2018050103|  171.9.197.248|          github.com|  1|http://github.com...|  36031|
|baidu|    CN|    E|2018050101|   2288.229.163|           yooku.com|  1|http://yooku.com/...|  35809|
|baidu|    CN|    E|2018050101| 139.208.68.178|          github.com|  1|http://github.com...|   9787|
|baidu|    CN|    E|2018050103| 171.12.206.176|movieshow2000.edu...|  1|http://movieshow2...|  83860|
|baidu|    CN|    E|2018050101|  121.77.141.58|          github.com|  1|http://github.com...|  89197|
|baidu|    CN|    E|2018050102|  171.14.63.223|     rw.uestc.edu.cn|  1|http://rw.uestc.e...|  74642|
|baidu|    CN|    E|2018050101|    36.57.94.77|           yooku.com|  1|http://yooku.com/...|  25020|
|baidu|    CN|    E|2018050101|  171.15.34.129|movieshow2000.edu...|  1|http://movieshow2...|  51978|
|baidu|    CN|    E|2018050101| 121.76.172.122|           yooku.com|  1|http://yooku.com/...|  48488|
|baidu|    CN|    E|2018050102|    36.61.89.99|           yooku.com|  1|http://yooku.com/...|  86480|
|baidu|    CN|    E|2018050102| 182.89.232.143|movieshow2000.edu...|  1|http://movieshow2...|  24312|
|baidu|    CN|    E|2018050102|139.205.123.192|movieshow2000.edu...|  1|http://movieshow2...|  51601|
|baidu|    CN|    E|2018050102| 123.233.171.52|           yooku.com|  1|http://yooku.com/...|  14132|
+-----+------+-----+----------+---------------+--------------------+---+--------------------+-------+
only showing top 20 rows