2025-02-21 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report--
12 Data format
After splitting, the original data looks like [[u'3', u'5'], [u'4', u'6'], [u'4', u'5'], [u'4', u'2']]; the corresponding column values can be accessed as x[0] and x[1] inside a map.
It can be converted to key-value format through map, for example: df3 = df2.map(lambda x: (x[0], x[1]))
Key-value data format
Each () represents one record: the first element is the key and the second is the value.
3) The PipelinedRDD type represents data in key-value form
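The key-value conversion above can be sketched with plain Python, no Spark required; here df2 is a hypothetical list of already-split rows standing in for the RDD, and list(map(...)) plays the role of RDD.map followed by collect:

```python
# df2 stands in for the RDD of split rows from the example above.
df2 = [[u'3', u'5'], [u'4', u'6'], [u'4', u'5'], [u'4', u'2']]

# Equivalent of df3 = df2.map(lambda x: (x[0], x[1])):
# each split row becomes a (key, value) tuple.
df3 = list(map(lambda x: (x[0], x[1]), df2))
print(df3)  # [('3', '5'), ('4', '6'), ('4', '5'), ('4', '2')]
```

Note that with an actual RDD the map is lazy: nothing is computed until an action such as collect() or take() is called.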
13 RDD type conversion
userRdd = sc.textFile("D:\data\people.json")
userRdd = userRdd.map(lambda x: x.split(" "))
userRows = userRdd.map(lambda p: Row(userName=p[0], userAge=int(p[1]), userAdd=p[2], userSalary=int(p[3])))
print(userRows.take(4))
Result: [Row(userAdd='shanghai', userAge=20, userName='zhangsan', userSalary=13), Row(userAdd='beijin', userAge=30, userName='lisi', userSalary=15)]
2) Create a DataFrame
userDF = sqlContext.createDataFrame(userRows)
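The Row construction step can be checked without Spark by using collections.namedtuple as a stand-in for pyspark.sql.Row (field access works the same way); the two input lines below are hypothetical data in the same space-separated layout as the example:

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row, with the same field names as the example.
UserRow = namedtuple("UserRow", ["userName", "userAge", "userAdd", "userSalary"])

# Hypothetical input lines in the same "name age city salary" layout.
lines = ["zhangsan 20 shanghai 13", "lisi 30 beijin 15"]
parts = [x.split(" ") for x in lines]                       # mirrors userRdd.map(lambda x: x.split(" "))
userRows = [UserRow(userName=p[0], userAge=int(p[1]),
                    userAdd=p[2], userSalary=int(p[3]))
            for p in parts]                                 # mirrors the Row(...) map
print(userRows[0].userName, userRows[0].userAge)            # zhangsan 20
```

Because each Row carries named fields, createDataFrame can infer the schema (column names and types) directly from the Rows.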
Query fields through an SQL statement:
from pyspark.conf import SparkConf
from pyspark.sql.session import SparkSession
from pyspark.sql.types import Row

if __name__ == '__main__':
    spark = SparkSession.builder.config(conf=SparkConf()).getOrCreate()
    sc = spark.sparkContext
    rd = sc.textFile("D:\data\people.txt")
    rd2 = rd.map(lambda x: x.split(","))
    people = rd2.map(lambda p: Row(name=p[0], age=int(p[1])))
    peopleDF = spark.createDataFrame(people)
    peopleDF.createOrReplaceTempView("people")
    teenagers = spark.sql("SELECT name, age FROM people WHERE name='Andy'")
    teenagers.show(5)
    print(teenagers.rdd.collect())
    teenNames = teenagers.rdd.map(lambda p: 100 + p.age).collect()
    for name in teenNames:
        print(name)
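The final map step (100 + p.age) runs against the Rows returned by the SQL query; it can be checked with plain Python, using a minimal stand-in class with an .age attribute in place of the query's Row objects:

```python
# Stand-in for the Rows returned by the SQL query above.
class PersonRow:
    def __init__(self, age):
        self.age = age

# The query filters name='Andy'; in the sample data Andy's age is 30.
rows = [PersonRow(30)]

# Mirrors teenagers.rdd.map(lambda p: 100 + p.age).collect()
teenNames = list(map(lambda p: 100 + p.age, rows))
print(teenNames)  # [130]
```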
15 Detailed examples of DataFrame, SQL, and JSON usage
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

"""
A simple example demonstrating basic Spark SQL features.
Run with:
  ./bin/spark-submit examples/src/main/python/sql/basic.py
"""
from __future__ import print_function
# $example on:init_session$
from pyspark.sql import SparkSession
# $example off:init_session$
# $example on:schema_inferring$
from pyspark.sql import Row
# $example off:schema_inferring$
# $example on:programmatic_schema$
# Import data types
from pyspark.sql.types import *
# $example off:programmatic_schema$
def basic_df_example(spark):
    # $example on:create_df$
    # spark is an existing SparkSession
    df = spark.read.json("/data/people.json")
    # Displays the content of the DataFrame to stdout
    df.show()
    # +----+-------+
    # | age|   name|
    # +----+-------+
    # |null|Michael|
    # |  30|   Andy|
    # |  19| Justin|
    # +----+-------+
    # $example off:create_df$

    # $example on:untyped_ops$
    # spark, df are from the previous example
    # Print the schema in a tree format
    df.printSchema()
    # root
    # |-- age: long (nullable = true)
    # |-- name: string (nullable = true)

    # Select only the "name" column
    df.select("name").show()
    # +-------+
    # |   name|
    # +-------+
    # |Michael|
    # |   Andy|
    # | Justin|
    # +-------+

    # Select everybody, but increment the age by 1
    df.select(df['name'], df['age'] + 1).show()
    # +-------+---------+
    # |   name|(age + 1)|
    # +-------+---------+
    # |Michael|     null|
    # |   Andy|       31|
    # | Justin|       20|
    # +-------+---------+

    # Select people older than 21
    df.filter(df['age'] > 21).show()
    # +---+----+
    # |age|name|
    # +---+----+
    # | 30|Andy|
    # +---+----+

    # Count people by age
    df.groupBy("age").count().show()
    # +----+-----+
    # | age|count|
    # +----+-----+
    # |  19|    1|
    # |null|    1|
    # |  30|    1|
    # +----+-----+
    # $example off:untyped_ops$

    # $example on:run_sql$
    # Register the DataFrame as a SQL temporary view
    df.createOrReplaceTempView("people")

    sqlDF = spark.sql("SELECT * FROM people")
    sqlDF.show()
    # +----+-------+
    # | age|   name|
    # +----+-------+
    # |null|Michael|
    # |  30|   Andy|
    # |  19| Justin|
    # +----+-------+
    # $example off:run_sql$

    # $example on:global_temp_view$
    # Register the DataFrame as a global temporary view
    df.createGlobalTempView("people")

    # Global temporary view is tied to a system preserved database `global_temp`
    spark.sql("SELECT * FROM global_temp.people").show()
    # +----+-------+
    # | age|   name|
    # +----+-------+
    # |null|Michael|
    # |  30|   Andy|
    # |  19| Justin|
    # +----+-------+

    # Global temporary view is cross-session
    spark.newSession().sql("SELECT * FROM global_temp.people").show()
    # +----+-------+
    # | age|   name|
    # +----+-------+
    # |null|Michael|
    # |  30|   Andy|
    # |  19| Justin|
    # +----+-------+
    # $example off:global_temp_view$
def schema_inference_example(spark):
    # $example on:schema_inferring$
    sc = spark.sparkContext

    # Load a text file and convert each line to a Row.
    lines = sc.textFile("examples/src/main/resources/people.txt")
    parts = lines.map(lambda l: l.split(","))
    people = parts.map(lambda p: Row(name=p[0], age=int(p[1])))

    # Infer the schema, and register the DataFrame as a table.
    schemaPeople = spark.createDataFrame(people)
    schemaPeople.createOrReplaceTempView("people")

    # SQL can be run over DataFrames that have been registered as a table.
    teenagers = spark.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
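The teenager query filters ages between 13 and 19. Using the sample people.txt data shipped with Spark's examples (Michael,29; Andy,30; Justin,19), a plain-Python equivalent of that SELECT ... WHERE is:

```python
# Sample rows matching examples/src/main/resources/people.txt in the Spark distribution.
people = [("Michael", 29), ("Andy", 30), ("Justin", 19)]

# Plain-Python equivalent of:
#   SELECT name FROM people WHERE age >= 13 AND age <= 19
teenagers = [name for (name, age) in people if 13 <= age <= 19]
print(teenagers)  # ['Justin']
```

In Spark the same predicate is pushed into the SQL engine, so only the matching rows are materialized when an action such as show() or collect() runs.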