This article introduces how to perform serverless SQL big data analysis based on Data Lake Analytics. The content is quite detailed; interested readers can refer to it, and we hope it will be helpful to you.
Background introduction
TableStore (OTS for short) is Alibaba Cloud's distributed table storage system, which provides users with schema-free distributed table services. As more and more users show a strong demand for OLAP, we provide access to the Data Lake Analytics (DLA) service on top of Tablestore to offer a fast OLAP solution. DLA is a general-purpose SQL query engine on Alibaba Cloud. By connecting the DLA service to OTS, you can use general SQL (compatible with most MySQL 5.7 query syntax) to run flexible data analysis tasks on Tablestore data.
Architecture View
As shown in the figure above, the overall OLAP query architecture involves three Alibaba Cloud products: DLA, OTS, and OSS. DLA is responsible for distributed SQL query computation. At run time, a user's SQL query request is decomposed into a number of parallel sub-tasks to improve data computation and query capability. OTS is the data storage layer, which serves the various sub-query tasks issued by DLA. If users already have existing data in OTS, they can directly establish a mapping view in DLA to quickly experience the convenience of SQL computation. OSS is a distributed object storage system, which is mainly used here to store users' query result data.
Therefore, to quickly experience SQL on OTS, users must activate the DLA and OSS services in addition to OTS. Through the cooperation of these three cloud products, users can quickly perform SQL computation on OTS. At present, the main reason for activating the OSS service is that DLA writes the query result set back to OSS storage by default, which introduces an additional storage dependency; however, this only requires users to activate the OSS service and does not require them to create an OSS storage instance in advance.
At present, the public beta region is Shanghai, and the corresponding instances are all capacity-type instances in that region. When activating the DLA service, you need to submit the public beta application first, and then follow the steps in the "Access method" section to quickly complete the access experience.
Access method
The overall access flow involves OTS, OSS, and DLA. Note that after access is completed, fees are incurred according to the actual queries performed. If, during this process, the user's account falls into arrears, queries will fail.
OTS service activation
If you have already activated the OTS service and it contains existing instances and table data, you can skip this step.
For users who use OTS for the first time, OTS can be activated in the following ways:
Log in to https://www.aliyun.com
Go to "products"-> "Cloud Computing Foundation"-> "Database"-> "Table Storage TableStore"
Quickly create an instance and a table, as described in the documentation above, to start the experience.
1) Use the console to quickly create a test table:
2) Use the console to quickly insert test data:
OSS service activation
Log in to https://www.aliyun.com
Enter "products"-> "Cloud Computing fundamentals"-> "Storage Services"-> "object Storage OSS"
Just click on the service to activate.
After the OSS service is activated, you do not need to create any object storage instance yourself. When DLA is connected, an object storage instance is automatically created under your OSS service to store the query result data; users do not need to manage it.
DLA service activation
Log in to https://www.aliyun.com
Enter "products"-> "big data"-> "big data calculation"-> "Data Lake Analytics"
Click directly to activate the service.
Note: during the public beta stage, you need to submit a public beta application and fill in the relevant information.
DLA on OTS access
Follow these steps to establish a mapping of OTS on DLA:
After activating the DLA service, you can select a region and activate the DLA service instance for that region (for example, the current China East 2 (Shanghai) region). Different regions correspond to different accounts, and DLA accounts from different regions cannot be mixed.
Note: after the account is created, you will receive an email (sent to your Alibaba Cloud registered email address) containing the DLA account and password for that region; please check it.
Select the region and authorize DLA to access your instance data on OTS.
After the service is activated, there are three SQL access methods: the console, the MySQL client, and JDBC.
Console access
Click the database connection and use the region's user name and password from the email to connect to the console.
After entering the console, you need to establish mapping information for the instance and table data on OTS. Scenario example: suppose the user already has an instance named sh-tpch in the Shanghai region, which contains a table test001 with 2 rows of test data. The steps to establish a mapping for this instance are as follows:
1) Map an OTS instance to a DLA Database:
Before establishing the Database mapping in DLA, you first need to create a Tablestore instance on OTS, for example:
Create an instance named sh-tpch, whose corresponding endpoint is https://sh-tpch.cn-shanghai.ots.aliyuncs.com.
After completing the creation of the test instance, execute the following statement to establish the Database mapping:
CREATE SCHEMA sh_tpch001 WITH DBPROPERTIES (LOCATION = 'https://sh-tpch.cn-shanghai.ots.aliyuncs.com', catalog = 'ots', instance = 'sh-tpch');
Note: when using the MySQL client, you can use either a create database or a create schema statement to create the DB mapping; however, on the console, only the create schema statement is currently supported for creating DB mappings.
The above statement creates a database named sh_tpch001 on DLA, which maps to the instance named sh-tpch under the OTS cluster sh-tpch.cn-shanghai.ots.aliyuncs.com. Through this statement, you establish the OTS instance mapping.
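To confirm that the mapping exists, here is a minimal sketch, assuming DLA accepts the MySQL-compatible show databases and use statements through the console or client (these statements are illustrations, not part of the original walkthrough):
show databases;   -- the list should now include sh_tpch001
use sh_tpch001;   -- switch to the mapped database before creating table mappings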
2) Under the DB sh_tpch001, establish the table mapping:
Before establishing the table mapping in DLA, you first need to create a test table in OTS. The process can be found in the section "OTS service activation".
After the test table is created, execute the following statement to establish the table mapping:
CREATE TABLE test001 (pk0 int, primary key (pk0));
Note: when creating a DLA mapping table, the specified primary key must be consistent with the primary key list defined in the OTS table. Because the primary key uniquely locates a row, a mapping table whose primary key list does not match the OTS table's definition may produce unexpected errors in SQL query results.
For example, the user's OTS instance sh-tpch contains a table test001 with a single primary key column pk0. The above command creates the mapping table test001 in the DLA database sh_tpch001. Use the show command to confirm that the table was created successfully.
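To make the primary key rule concrete, here is an illustrative sketch only: the orders001 table and its columns below are hypothetical and not part of the sh-tpch walkthrough. For an OTS table whose primary key consists of two columns, the DLA mapping table must declare both columns in the same order:
CREATE TABLE orders001 (
    user_id int,                     -- first primary key column of the hypothetical OTS table
    order_id int,                    -- second primary key column
    amount double,                   -- an ordinary attribute column
    primary key (user_id, order_id)  -- must match the OTS table's primary key list exactly
);
show tables;                         -- verify that the mapping tables appear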
3) Use select statements to execute SQL queries:
1. Query all the data: select * from test001;
2. Perform count statistics: select count(*) from test001;
3. Perform sum statistics: select sum(pk0) from test001;
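Because DLA is compatible with most MySQL 5.7 query syntax, ordinary filtering, ordering, and grouping clauses can also be run against the mapping table. The statements below are generic illustrations on the two-row test001 table rather than part of the original walkthrough:
select * from test001 where pk0 >= 1;
select pk0 from test001 order by pk0 desc limit 10;
select pk0, count(*) from test001 group by pk0;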
4) For richer execution statements, please see the following help documentation:
Create schema statement: https://help.aliyun.com/document_detail/72005.html
Create table statement: https://help.aliyun.com/document_detail/72006.html
Select statement: https://help.aliyun.com/document_detail/71044.html
Show statement: https://help.aliyun.com/document_detail/72011.html
Drop table statement: https://help.aliyun.com/document_detail/72008.html
Drop schema statement: https://help.aliyun.com/document_detail/72007.html
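As an illustration of the drop statements listed above (a minimal sketch using the names from this walkthrough; run these only if you want to remove the mapping objects created earlier):
drop table test001;      -- drops the test001 mapping table created in step 2)
drop schema sh_tpch001;  -- drops the sh_tpch001 database mapping created in step 1)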
5) When executing SQL, you can choose synchronous execution, which returns the first 10,000 records that meet the conditions. If you want to obtain a large result set, you need to choose asynchronous execution and use the show query_task statement to obtain the result asynchronously:
show query_task where id = '59a05af7_1531893489231';
MySQL access
Using a standard MySQL client, you can also quickly connect to the DLA data instance. The connection command is:
mysql -h service.cn-shanghai.datalakeanalytics.aliyuncs.com -P 10000 -u <username> -p -c -A
(Replace <username> with the DLA user name for the region from the activation email; you will be prompted for the password.)
Other operation statements are consistent with the "console access" section.
JDBC access
You can also access it using the standard Java JDBC API. The connection string is as follows:
jdbc:mysql://service.cn-shanghai.datalakeanalytics.aliyuncs.com:10000/
That is how to perform Data Lake Analytics-based serverless SQL big data analysis. We hope the above content is helpful to you and helps you learn more. If you think the article is good, feel free to share it for more people to see.