
[SQL] Non-equi joins in Spark SQL

Updated 2025-04-04. From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

products is a table of commodity price changes. orders is a table of orders, recording each purchased product and the purchase date.

A non-equi join in Spark SQL matches each order to the products row whose validity period contains the order date, which gives the price of the goods at the time each order was placed.
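The matching rule can be sketched without Spark, using plain Scala collections. This is only an illustration of the join condition, not the article's code: the case classes `PricePeriod` and `Order` and the function `priceAtPurchase` are hypothetical names, and the inclusive upper bound on the date range is an assumption.

```scala
// Hypothetical case classes mirroring the two tables described above.
final case class PricePeriod(name: String, startDate: String, endDate: String, price: Int)
final case class Order(date: String, product: String)

object NonEquiMatchSketch {
  // Pair each order with every price period that satisfies the join condition:
  // same product name AND startDate <= order date <= endDate (inclusive bounds assumed).
  // ISO yyyy-MM-dd strings sort chronologically, so plain string comparison is enough.
  def priceAtPurchase(orders: Seq[Order], periods: Seq[PricePeriod]): Seq[(Order, Int)] =
    for {
      o <- orders
      p <- periods
      if o.product == p.name && o.date >= p.startDate && o.date <= p.endDate
    } yield (o, p.price)

  def main(args: Array[String]): Unit = {
    val periods = Seq(
      PricePeriod("Wang Zai Milk", "2017-01-01", "2018-01-01", 4),
      PricePeriod("Wang Zai Milk", "2018-01-02", "2020-01-01", 5)
    )
    val orders = Seq(
      Order("2017-06-01", "Wang Zai Milk"),
      Order("2018-03-01", "Wang Zai Milk")
    )
    priceAtPurchase(orders, periods).foreach(println)
  }
}
```

With this sample data the 2017 order is matched to price 4 and the 2018 order to price 5, because each order date falls inside exactly one price period.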

A slowly changing price list

Wang Zai Milk has had one price change.

scala> val products = sc.parallelize(Array(
     |   ("Wang Zai Milk", "2017-01-01", "2018-01-01", 4),
     |   ("Wang Zai Milk", "2018-01-02", "2020-01-01", 5),
     |   ("Wang Laoji", "2017-01-02", "2019-01-01", 5),
     |   ("Weilong Spicy Gluten", "2010-01-01", "2020-01-01", 2)
     | )).toDF("name", "startDate", "endDate", "price")
products: org.apache.spark.sql.DataFrame = [name: string, startDate: string ... 2 more fields]

scala> products.show()
+--------------------+----------+----------+-----+
|                name| startDate|   endDate|price|
+--------------------+----------+----------+-----+
|       Wang Zai Milk|2017-01-01|2018-01-01|    4|
|       Wang Zai Milk|2018-01-02|2020-01-01|    5|
|          Wang Laoji|2017-01-02|2019-01-01|    5|
|Weilong Spicy Gluten|2010-01-01|2020-01-01|    2|
+--------------------+----------+----------+-----+

Orders table (order date, product name)

Wang Zai Milk has one order in each of its two price periods.

scala> val orders = sc.parallelize(Array(
     |   ("2017-06-01", "Wang Zai Milk"),
     |   ("2017-07-01", "Wang Laoji"),
     |   ("2018-03-01", "Wang Zai Milk")
     | )).toDF("date", "product")
orders: org.apache.spark.sql.DataFrame = [date: string, product: string]

scala> orders.show
+----------+-------------+
|      date|      product|
+----------+-------------+
|2017-06-01|Wang Zai Milk|
|2017-07-01|   Wang Laoji|
|2018-03-01|Wang Zai Milk|
+----------+-------------+

Compute the price of each order at the time of purchase with a non-equi join

Note the prices of the two Wang Zai Milk orders, which fall in different price periods.

scala> orders.join(products, $"product" === $"name" && $"date" >= $"startDate" && $"date"
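The join expression above is cut off in the source, so the following is a hedged sketch of how the complete statement might look, not the author's exact code. The upper bound `$"date" <= $"endDate"` is an assumption inferred from the `endDate` column; the object name `NonEquiJoinSketch`, the helper `priceAtPurchase`, and the local `SparkSession` setup are hypothetical and stand in for the spark-shell session.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NonEquiJoinSketch {
  // Join each order to the price period that was in force on the order date.
  def priceAtPurchase(spark: SparkSession, orders: DataFrame, products: DataFrame): DataFrame = {
    import spark.implicits._ // enables the $"column" syntax
    orders.join(
      products,
      $"product" === $"name" &&
        $"date" >= $"startDate" &&
        $"date" <= $"endDate" // assumption: inclusive upper bound, not shown in the source
    ).select($"date", $"product", $"price")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("non-equi-join-sketch")
      .getOrCreate()
    import spark.implicits._

    val products = Seq(
      ("Wang Zai Milk", "2017-01-01", "2018-01-01", 4),
      ("Wang Zai Milk", "2018-01-02", "2020-01-01", 5),
      ("Wang Laoji", "2017-01-02", "2019-01-01", 5),
      ("Weilong Spicy Gluten", "2010-01-01", "2020-01-01", 2)
    ).toDF("name", "startDate", "endDate", "price")

    val orders = Seq(
      ("2017-06-01", "Wang Zai Milk"),
      ("2017-07-01", "Wang Laoji"),
      ("2018-03-01", "Wang Zai Milk")
    ).toDF("date", "product")

    priceAtPurchase(spark, orders, products).orderBy($"date").show()
    spark.stop()
  }
}
```

Under that assumed condition, the 2017-06-01 Wang Zai Milk order is priced at 4, and the Wang Laoji order and the 2018-03-01 Wang Zai Milk order are both priced at 5. The dates are compared as strings, which is safe here only because they are all in ISO yyyy-MM-dd form; casting them to a date type would be the more robust choice for real data.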
