Shulou (Shulou.com), 06/03 report. Updated 2025-03-28. From: SLTechnology News&Howtos > Internet Technology
Problem introduction
For a Java programmer, implementing SQL-style set operations such as intersection, union and difference directly in code always takes a lot of code. It would be better to have a dedicated external data tool: write a short SQL-like script, call it from Java, and get a result set back. The Java edition of the aggregator (esProc) is built to solve this problem. You write the operation intuitively and naturally in an SPL script and then call that script from Java, which is simple, fast and efficient. In addition, although SQL has the concept of a set, its support for ordered-set operations is very limited, and the resulting queries are often hard to understand. SPL is based on a discrete data set model and handles ordered-set operations easily. The examples below go from simple to more involved.
SPL implementation
Union
example 1: find the total number of days of overlapping time periods
MySQL8:
with recursive t(start, end) as (
    select date'2010-01-07', ...
    union all select date'2010-01-15', date'2010-01-16'
    union all select date'2010-01-07', ...
),
t1(d, end) as (
    select start, end from t
    union all
    select d + interval 1 day, end from t1 where d < end
)
select count(distinct d) from t1;
description: this query first expands each time period into all the dates it contains, then counts the distinct dates.
Aggregator SPL:
A1  =connect("mysql")
A2  =A1.query@x("...")
A3  =A2.(periods(start,end))
A4  =A3.conj()
A5  =A4.icount()
A3: for each time period in A2, build the sequence of dates from start to end
A4: concatenate all the date sequences in A3
A5: count the distinct dates in A4
Save the script as SumSet.dfx (it is embedded in Java below)
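For comparison, the three steps described above (expand each period into its dates, concatenate the sequences, count the distinct dates) can be sketched in plain Java. This is only an illustration: the class and method names are made up here, and the sample periods are hypothetical rather than the article's data.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Stream;

final class OverlapDays {
    /** A closed date range [start, end]; this class is illustrative only. */
    static final class Period {
        final LocalDate start;
        final LocalDate end;

        Period(LocalDate start, LocalDate end) {
            this.start = start;
            this.end = end;
        }

        /** Every date from start to end inclusive (the role of periods(start,end)). */
        Stream<LocalDate> days() {
            return start.datesUntil(end.plusDays(1));
        }
    }

    /** Concatenates all expanded periods (conj) and counts the distinct dates (icount). */
    static long coveredDays(List<Period> periods) {
        return periods.stream()
                .flatMap(Period::days)
                .distinct()
                .count();
    }

    public static void main(String[] args) {
        // Hypothetical sample periods, not the data used in the article.
        List<Period> periods = List.of(
                new Period(LocalDate.of(2010, 1, 7), LocalDate.of(2010, 1, 9)),
                new Period(LocalDate.of(2010, 1, 8), LocalDate.of(2010, 1, 11)),
                new Period(LocalDate.of(2010, 1, 15), LocalDate.of(2010, 1, 16)));
        System.out.println(coveredDays(periods)); // prints 7
    }
}
```

Even with streams, the Java version needs a helper type and a small pipeline, which is the overhead the SPL script avoids.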
Difference set
example 1: list countries with English-speaking and French-speaking populations both exceeding 5%
MySQL8:
with t1(lang) as (select 'English' union all select 'French')
select name from world.country c
where not exists (
    select * from t1
    where lang not in (
        select language from world.countrylanguage
        where percentage >= 5 and countrycode = c.code));
description: this SQL only demonstrates how to test, through double negation, that a difference set is empty.
Aggregator SPL:
A1  =connect("mysql")
A2  =A1.query("select CountryCode,Name,Language,Percentage from world.countrylanguage cl join world.country c on cl.countrycode=c.code where percentage>5")
A3  =A2.group(CountryCode)
A4  =A3.select(["English","French"]\~.(Language)==[])
A5  =A4.new(~.name:name)
A4: select the groups for which the difference between ["English","French"] and the group's language set is empty, which means the group's languages include both English and French.
Save the script as DifferenceSet.dfx (it is embedded in Java below)
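The test used in A4 (the required languages minus a country's languages is empty) can be sketched in plain Java as follows. The class name, method name and sample data are hypothetical and only illustrate the set-difference idea; they are not part of the article's script.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DifferenceSetSketch {
    /** Returns the countries whose language set contains every required language. */
    static List<String> countriesSpeakingAll(Map<String, Set<String>> languagesByCountry,
                                             Set<String> required) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : languagesByCountry.entrySet()) {
            Set<String> missing = new HashSet<>(required);
            missing.removeAll(entry.getValue()); // required \ spoken
            if (missing.isEmpty()) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical, already filtered data: country -> languages above the threshold.
        Map<String, Set<String>> data = new LinkedHashMap<>();
        data.put("CountryA", Set.of("English", "French"));
        data.put("CountryB", Set.of("English"));
        data.put("CountryC", Set.of("English", "French", "Spanish"));
        System.out.println(countriesSpeakingAll(data, Set.of("English", "French")));
    }
}
```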
Intersection
example 1: list country codes with English-speaking, French-speaking and Spanish-speaking populations exceeding 0.3%, 0.2% and 0.1%, respectively
MySQL8:
with t1 as (select countrycode from world.countrylanguage where language='English' and percentage>0.3),
t2 as (select countrycode from world.countrylanguage where language='French' and percentage>0.2),
t3 as (select countrycode from world.countrylanguage where language='Spanish' and percentage>0.1)
select countrycode
from t1 join t2 using (countrycode) join t3 using (countrycode);
description: this example only demonstrates how to compute the intersection of multiple sets.
Aggregator SPL:
A1  =connect("mysql")
A2  [English,French,Spanish]
A3  [0.3,0.2,0.1]
A4  =A2.(A1.query@i("select countrycode from world.countrylanguage where language=? and percentage>?",~,A3(#)))
A5  >A1.close()
A6  =A4.isect()
A4: in order, query the codes of the countries whose English-speaking population exceeds 0.3%, French-speaking population exceeds 0.2% and Spanish-speaking population exceeds 0.1%, and convert each result into a sequence
A6: the intersection of all sequences in A4
Save the script as IntersectionSet.dfx (it is embedded in Java below)
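The final step, intersecting all the per-language sequences, corresponds roughly to repeated `retainAll` calls in Java. The sketch below is a hypothetical helper with made-up sample codes; it assumes the per-language query results have already been loaded into collections.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SetIntersection {
    /** Intersects any number of collections, keeping the order of the first one. */
    static <T> Set<T> intersectAll(List<? extends Collection<T>> sets) {
        if (sets.isEmpty()) {
            return new LinkedHashSet<>();
        }
        Set<T> result = new LinkedHashSet<>(sets.get(0));
        for (int i = 1; i < sets.size(); i++) {
            result.retainAll(sets.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical country codes per language, standing in for the three query results.
        List<List<String>> perLanguage = List.of(
                List.of("CAN", "USA", "FRA"),
                List.of("CAN", "FRA"),
                List.of("FRA", "CAN", "ESP"));
        System.out.println(intersectAll(perLanguage)); // prints [CAN, FRA]
    }
}
```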
Java call
Embedding SPL in a Java application is very convenient: the script is loaded by calling it as a stored procedure through JDBC. Taking the union script SumSet.dfx saved earlier as the example, the call looks like this:
...
Connection con = null;
Class.forName("com.esproc.jdbc.InternalDriver");
con = DriverManager.getConnection("jdbc:esproc:local://");
// call the stored procedure, where SumSet is the file name of the dfx
st = (com.esproc.jdbc.InternalCStatement) con.prepareCall("call SumSet()");
// execute the stored procedure
st.execute();
// get the result set
ResultSet rs = st.getResultSet();
...
Calling DifferenceSet.dfx or IntersectionSet.dfx works the same way: just use call DifferenceSet() or call IntersectionSet(). The Java fragment here only outlines how SPL is embedded. For detailed steps, see "How Java invokes an SPL script", which is also simple and is not repeated here. SPL also provides an ODBC driver, so it can be integrated into languages that support ODBC, and the embedding process is similar.
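Since the three scripts differ only in the name passed to `call`, the fragment can be wrapped in a small helper. The following is a minimal sketch, not the vendor's recommended pattern: the class `SplScripts` and its methods are hypothetical, it uses the standard `java.sql.CallableStatement` interface instead of the cast to `InternalCStatement`, and it assumes the esProc JDBC jar and the .dfx files are available at run time.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class SplScripts {
    /** Builds the call text: "SumSet.dfx" or "SumSet" both give "call SumSet()". */
    static String callText(String scriptName) {
        String base = scriptName.endsWith(".dfx")
                ? scriptName.substring(0, scriptName.length() - ".dfx".length())
                : scriptName;
        return "call " + base + "()";
    }

    /** Runs the script and copies its result set into memory, one list per row. */
    static List<List<Object>> run(String scriptName)
            throws SQLException, ClassNotFoundException {
        // Driver class and URL are taken from the article's fragment.
        Class.forName("com.esproc.jdbc.InternalDriver");
        try (Connection con = DriverManager.getConnection("jdbc:esproc:local://");
             CallableStatement st = con.prepareCall(callText(scriptName))) {
            st.execute();
            List<List<Object>> rows = new ArrayList<>();
            try (ResultSet rs = st.getResultSet()) {
                if (rs == null) {
                    return rows; // the script returned no result set
                }
                int columns = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    List<Object> row = new ArrayList<>(columns);
                    for (int i = 1; i <= columns; i++) {
                        row.add(rs.getObject(i));
                    }
                    rows.add(row);
                }
            }
            return rows;
        }
    }
}
```

With such a helper, `SplScripts.run("DifferenceSet")` or `SplScripts.run("IntersectionSet")` would be the only change needed to switch scripts. The try-with-resources blocks also close the statement and connection, which the original fragment leaves out.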
Extended excerpts
Besides the union, difference and intersection operations above, set operations in SPL also cover calculations based on row numbers and alignment operations between ordered sets.
Fetching data by row number
example 1: get the trading information for the third trading day and the third-from-last trading day of China Merchants Bank (600036) in 2017
MySQL8:
with t as (select *, row_number() over (order by tdate) rn
    from stktrade
    where sid='600036' and tdate between '2017-01-01' and '2017-12-31')
select tdate, open, close, volume from t where rn=3
union all
select tdate, open, close, volume from t where rn=(select max(rn)-2 from t);
Aggregator SPL:
A1  =connect("mysql")
A2  =A1.query@x("select * from stktrade where sid='600036' and tdate between '2017-01-01' and '2017-12-31' order by tdate")
A3  =A2(3)|A2.m(-3)
A3: the third record combined with the third-from-last record
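Java lists have no built-in way to count from the end, so an equivalent of this positional access needs a small helper. The sketch below is a hypothetical illustration of 1-based access where a negative position counts backwards, as the example uses for the third-from-last record; the class and method names are made up.

```java
import java.util.List;

final class RowAccess {
    /**
     * 1-based positional access. A positive position counts from the front,
     * a negative one from the end (-1 is the last element). Position 0 is
     * invalid and results in an IndexOutOfBoundsException.
     */
    static <T> T m(List<T> rows, int position) {
        int index = position > 0 ? position - 1 : rows.size() + position;
        return rows.get(index);
    }

    /** The third record followed by the third-from-last record. */
    static <T> List<T> thirdAndThirdFromLast(List<T> rows) {
        return List.of(m(rows, 3), m(rows, -3));
    }

    public static void main(String[] args) {
        // Hypothetical closing prices ordered by trading date.
        List<Integer> closes = List.of(10, 20, 30, 40, 50, 60);
        System.out.println(thirdAndThirdFromLast(closes)); // prints [30, 40]
    }
}
```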
example 2: calculate the average closing price of China Merchants Bank (600036) over the last 20 trading days
MySQL8:
with t as (select *, row_number() over (order by tdate desc) rn from stktrade where sid='600036')
select avg(close) avg20 from t where rn<=20;
example 3: find the position of the first trading day in 2017 on which China Merchants Bank (600036) closed at 25 yuan or higher
Aggregator SPL:
A1  =connect("mysql")
A2  =A1.query@x("select * from stktrade where sid='600036' and tdate between '2017-01-01' and '2017-12-31' order by tdate")
A3  =A2.pselect(close>=25)
A3: searching from front to back, find the position of the first record whose closing price is at least 25 yuan
example 4: calculate the 2017 increase of Gree Electric Appliances (000651), taking trading suspensions into account
MySQL8:
with t as (select * from stktrade where sid='000651'),
t1(d) as (select max(tdate) from t where tdate ...
Aggregator SPL:
...
A3  ... 2500000)
A4  =A3.new(A2(~).tdate:tdate, A2(~).close:close, A2(~).volume:volume, A2(~).close/A2(~-1).close-1:rise)
A3: find the row numbers of all 2017 records with a trading volume of more than 2.5 million shares
A4: using those row numbers, calculate the corresponding date, closing price, trading volume and increase
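The two steps described for A3 and A4 (collect the row numbers that satisfy a condition, then use each row number to reach both the record and its predecessor) can be sketched in Java over plain arrays. All names and figures below are hypothetical; the sketch only illustrates position-based selection and is not a translation of the script.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionSelect {
    /** 1-based positions of all values strictly above the threshold. */
    static List<Integer> positionsAbove(double[] values, double threshold) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i] > threshold) {
                positions.add(i + 1);
            }
        }
        return positions;
    }

    /**
     * Increase at a 1-based position: close / previous close - 1.
     * Returns NaN for the first record, which has no predecessor.
     */
    static double riseAt(double[] closes, int position) {
        if (position < 2) {
            return Double.NaN;
        }
        return closes[position - 1] / closes[position - 2] - 1;
    }

    public static void main(String[] args) {
        // Hypothetical volumes and closing prices, ordered by trading date.
        double[] volumes = {1_800_000, 2_600_000, 2_100_000, 3_000_000};
        double[] closes = {40.0, 42.0, 41.0, 43.05};
        for (int position : positionsAbove(volumes, 2_500_000)) {
            System.out.println(position + ": " + riseAt(closes, position));
        }
    }
}
```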
Finding the row number of the record holding the maximum or minimum value
example 1: calculate the number of trading days between the earliest lowest closing price and the earliest highest closing price of China Merchants Bank (600036) in 2017
MySQL8:
with t as (select *, row_number() over (order by tdate) rn
    from stktrade
    where sid='600036' and tdate between '2017-01-01' and '2017-12-31'),
t1 as (select * from t where close=(select min(close) from t)),
t2 as (select * from t where close=(select max(close) from t))
select abs(cast(min(t1.rn) as signed)-cast(min(t2.rn) as signed)) inteval
from t1, t2;
Aggregator SPL:
A1  =connect("mysql")
A2  =A1.query@x("select * from stktrade where sid='600036' and tdate between '2017-01-01' and '2017-12-31' order by tdate")
A3  =A2.pmax(close)
A4  =A2.pmin(close)
A5  =abs(A3-A4)
A3: searching from front to back, find the row number of the maximum closing price in the sequence
A4: searching from front to back, find the row number of the minimum closing price in the sequence
example 2: calculate the number of trading days between the last lowest closing price and the last highest closing price of China Merchants Bank (600036) in 2017
MySQL8:
with t as (select *, row_number() over (order by tdate) rn
    from stktrade
    where sid='600036' and tdate between '2017-01-01' and '2017-12-31'),
t1 as (select * from t where close=(select min(close) from t)),
t2 as (select * from t where close=(select max(close) from t))
select abs(cast(max(t1.rn) as signed)-cast(max(t2.rn) as signed)) inteval
from t1, t2;
Aggregator SPL:
A1  =connect("mysql")
A2  =A1.query@x("select * from stktrade where sid='600036' and tdate between '2017-01-01' and '2017-12-31' order by tdate")
A3  =A2.pmax@z(close)
A4  =A2.pmin@z(close)
A5  =abs(A3-A4)
A3: searching from back to front, find the row number of the maximum closing price in the sequence
A4: searching from back to front, find the row number of the minimum closing price in the sequence
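The only difference between the two examples is which occurrence wins when the maximum or minimum appears more than once. A plain-Java sketch of that choice is shown below; the class, the boolean `last` parameter and the sample prices are hypothetical and simply mirror the front-to-back and back-to-front searches described above.

```java
final class ExtremePositions {
    /** 1-based position of the maximum; the last occurrence if {@code last} is true. */
    static int pmax(double[] values, boolean last) {
        requireNonEmpty(values);
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best] || (last && values[i] == values[best])) {
                best = i;
            }
        }
        return best + 1;
    }

    /** 1-based position of the minimum; the last occurrence if {@code last} is true. */
    static int pmin(double[] values, boolean last) {
        requireNonEmpty(values);
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[best] || (last && values[i] == values[best])) {
                best = i;
            }
        }
        return best + 1;
    }

    /** Number of rows between the chosen maximum and minimum positions. */
    static int interval(double[] closes, boolean last) {
        return Math.abs(pmax(closes, last) - pmin(closes, last));
    }

    private static void requireNonEmpty(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
    }

    public static void main(String[] args) {
        // Hypothetical closing prices with repeated extremes.
        double[] closes = {3, 9, 5, 3, 9, 9};
        System.out.println(interval(closes, false)); // prints 1 (positions 2 and 1)
        System.out.println(interval(closes, true));  // prints 2 (positions 6 and 4)
    }
}
```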
Alignment calculation between ordered sets
example 1: calculate the daily return of the ChiNext Index (399006) relative to the Shenzhen Component Index (399001) from March 6 to 8, 2018
MySQL8:
with t1 as (select *, close/lag(close) over (order by tdate) rise
    from stktrade
    where sid='399006' and tdate between '2018-03-05' and '2018-03-08'),
t2 as (select *, close/lag(close) over (order by tdate) rise
    from stktrade
    where sid='399001' and tdate between '2018-03-05' and '2018-03-08')
select t1.rise-t2.rise
from t1 join t2 using (tdate)
where t1.rise is not null;
Aggregator SPL:
A1  =connect("mysql")
A2  =["399006","399001"].(A1.query("select * from stktrade where sid=? and tdate between '2018-03-05' and '2018-03-08'",~))
A3  >A1.close()
A4  =A2.(~.calc(to(2,4),close/close[-1]))
A5  =A4(1)--A4(2)
A2: query the trading data of 399006 and 399001 from March 5 to 8, 2018, in turn
A4: for each of the two table sequences in A2, calculate the rise for records 2 through 4, that is, the daily rise of 399006 and 399001 from March 6 to 8, 2018
A5: subtract the two sequences position by position (aligned subtraction) to get the daily relative return
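The same calculation can be sketched in Java as two steps: compute each index's day-over-day ratio, then subtract the two ratio sequences element by element. The class name, method names and sample prices are hypothetical, and the sketch assumes both series cover exactly the same trading days in the same order.

```java
import java.util.Arrays;

final class RelativeReturn {
    /** Ratio close[i] / close[i-1] for every record except the first. */
    static double[] rises(double[] closes) {
        double[] result = new double[Math.max(closes.length - 1, 0)];
        for (int i = 1; i < closes.length; i++) {
            result[i - 1] = closes[i] / closes[i - 1];
        }
        return result;
    }

    /** Element-by-element difference of two sequences of equal length. */
    static double[] alignedDiff(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("sequences must have the same length");
        }
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] - b[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical closing prices for four consecutive trading days.
        double[] indexA = {1800.0, 1818.0, 1836.18, 1817.8182};
        double[] indexB = {10800.0, 10854.0, 10908.27, 10962.81};
        System.out.println(Arrays.toString(alignedDiff(rises(indexA), rises(indexB))));
    }
}
```

Because the subtraction relies purely on position, any missing trading day in one series would misalign the result; the SQL version guards against this by joining on tdate.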
Advantages of SPL
With a database, write SQL; without one, write SPL
Summarizing and calculating data directly in a Java program is tiring: the code is long and cannot be reused, and in many cases the data is not in a database at all. With SPL, such work becomes as convenient as using SQL from Java.
Worry-free everyday use: the entry edition comes with a free lifetime license
If the data analysis is one-off or temporary, the Raqsoft aggregator offers a free trial license every month, which can be renewed and used for free indefinitely. However, when the aggregator is integrated with a Java application and deployed on a server for long-term use, replacing the trial license regularly is still a nuisance. Raqsoft therefore provides an entry edition with a lifetime license, which removes this worry. See "How to use the Raqsoft aggregator for free?"
Technical documentation and community support
The official aggregator documentation contains many ready-made examples, and solutions to common problems can be found there. With the entry edition you can use the general functions of SPL and, if you run into problems, ask for help at Raqsoft's Qian Academy, the official community, which provides free technical support to entry-edition users.