Spark 2.2.0 in practice: how to automatically obtain JSON file metadata, register two temporary tables, and merge matching records after a conditional query

2025-04-09 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how, in Spark 2.2.0, to let Spark infer the metadata (schema) of JSON data automatically, register two temporary tables, and merge the records that share the same name after a conditional query.

Spark supports two ways to convert an RDD to a DataFrame:

1. Reflection: define the schema in a separate class and let Spark derive the corresponding DataFrame from it. This is simple, but the author does not recommend it, because a Scala case class supports at most 22 fields (a limit in Scala 2.10 and earlier); for wider records you must write a class that implements the Product interface.

2. The programmatic interface: build a StructType and use it to convert the RDD into the corresponding DataFrame. This takes a little more work. The official manual lists roughly three steps:

1. Create an RDD and convert it to a JavaRDD of Row objects

2. Define a StructType that matches the structure of the Rows

3. Call createDataFrame to create the DataFrame from the StructType

Data preparation:

The first JSON file, student.json, holds one JSON object per line:

{"name": "ljs1", "score": 85}
{"name": "ljs2", "score": 99}
{"name": "ljs3", "score": 74}

The second JSON data set is defined inline in the code (the studentJsons list), so no separate file is needed.
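To make the goal concrete before the Spark code, here is a minimal plain-Java sketch (no Spark involved) of the result the two data sets should produce: keep the students whose score is above 82, then attach each one's age by matching on name. The class name ExpectedMergeSketch, the merge method, and the output format are hypothetical and serve only as illustration; this is not the author's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only: mimics the filter-then-merge
// outcome on in-memory maps instead of Spark temporary tables.
class ExpectedMergeSketch {

    // Keeps students whose score is strictly greater than minScore and
    // attaches their age, matching records by name.
    // Names that have no age entry are skipped.
    public static List<String> merge(Map<String, Integer> scores,
                                     Map<String, Integer> ages,
                                     int minScore) {
        List<String> merged = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : scores.entrySet()) {
            Integer age = ages.get(entry.getKey());
            if (entry.getValue() > minScore && age != null) {
                merged.add(entry.getKey() + ": score=" + entry.getValue() + ", age=" + age);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Same values as student.json.
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("ljs1", 85);
        scores.put("ljs2", 99);
        scores.put("ljs3", 74);

        // Same values as the inline JSON strings in the code.
        Map<String, Integer> ages = new LinkedHashMap<>();
        ages.put("ljs1", 18);
        ages.put("ljs2", 17);
        ages.put("ljs3", 19);

        for (String line : merge(scores, ages, 82)) {
            System.out.println(line);
        }
    }
}
```

For the sample data this keeps ljs1 (score 85, age 18) and ljs2 (score 99, age 17); ljs3 is dropped because 74 is not above 82.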

Code example:

package com.unicom.ljs.spark220.study;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.sql.*;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import scala.Tuple2;

import java.util.ArrayList;
import java.util.List;

/**
 * @author: Created By lujisen
 * @company ChinaUnicom Software JiNan
 * @date: 2020-01-28 21:08
 * @version: v1.0
 * @description: com.unicom.ljs.spark220.study
 */
public class JoinJsonData {
    public static void main(String[] args) {

        SparkConf sparkConf = new SparkConf().setMaster("local[*]").setAppName("JoinJsonData");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(sc);

        // Read the JSON file; Spark infers the schema (name, score) automatically.
        Dataset<Row> studentDS = sqlContext.read().json("D:\\dataML\\spark1\\student.json");
        // Register the first temporary table and keep only scores above 82.
        studentDS.registerTempTable("student_score");
        Dataset<Row> studentNameScoreDS = sqlContext.sql("select name,score from student_score where score > 82");

        // Collect the names of the students that passed the filter.
        List<String> studentNameList = studentNameScoreDS.javaRDD().map(new Function<Row, String>() {
            @Override
            public String call(Row row) {
                return row.getString(0);
            }
        }).collect();

        System.out.println(studentNameList.toString());

        // Second data set: JSON strings held in memory instead of a file.
        List<String> studentJsons = new ArrayList<String>();
        studentJsons.add("{\"name\":\"ljs1\",\"age\":18}");
        studentJsons.add("{\"name\":\"ljs2\",\"age\":17}");
        studentJsons.add("{\"name\":\"ljs3\",\"age\":19}");

        JavaRDD<String> studentInfos = sc.parallelize(studentJsons);
        Dataset<Row> studentNameScoreRDD = sqlContext.read().json(studentInfos);

        // schema() returns the inferred StructType (the result is not used here).
        studentNameScoreRDD.schema();
        studentNameScoreRDD.show();
        // Register the second temporary table.
        studentNameScoreRDD.registerTempTable("student_age");

        String sql2 = "select name,age from student_age where name in (";
        for (int item0)
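The listing breaks off while the IN (...) clause is being assembled from studentNameList. As a hedged sketch of that step only, and not a reconstruction of the author's missing loop, the query string could be built as follows. The class InClauseSketch and the method buildAgeQuery are hypothetical names. The sketch assumes names contain no quote or backslash characters and rejects any that do, since escaping rules differ between SQL dialects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class InClauseSketch {

    // Builds: select name,age from student_age where name in ('a','b',...)
    public static String buildAgeQuery(List<String> names) {
        if (names.isEmpty()) {
            // "in ()" is not valid SQL, so return a query that matches no rows.
            return "select name,age from student_age where 1 = 0";
        }
        for (String name : names) {
            // Assumption: plain names only. Reject rather than try to escape.
            if (name.contains("'") || name.contains("\\")) {
                throw new IllegalArgumentException("unsupported character in name: " + name);
            }
        }
        String quoted = names.stream()
                .map(name -> "'" + name + "'")
                .collect(Collectors.joining(","));
        return "select name,age from student_age where name in (" + quoted + ")";
    }

    public static void main(String[] args) {
        // With the sample data, the score > 82 filter leaves ljs1 and ljs2.
        System.out.println(buildAgeQuery(Arrays.asList("ljs1", "ljs2")));
        // prints: select name,age from student_age where name in ('ljs1','ljs2')
    }
}
```

The resulting string can be passed to sqlContext.sql(...) in the same way as the first query. Joining with a delimiter avoids the trailing-comma handling that an index-based loop needs.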
