How to use Tunnel SDK to upload and download MaxCompute complex type data

Many people who are new to MaxCompute are unsure how to upload and download complex type data with the Tunnel SDK. This article summarizes the approach step by step; by the end, you should be able to do it yourself.
So how do you upload complex type data to MaxCompute with the Tunnel SDK? Let's first introduce the MaxCompute complex data types.
Complex data types

MaxCompute's SQL engine is based on ODPS 2.0, which significantly enriches support for complex data types. MaxCompute supports the ARRAY, MAP, and STRUCT types, allows them to be nested arbitrarily, and provides matching built-in functions.
Type definitions and constructors:

ARRAY
- Definition: array<int>; array<array<int>>
- Constructors: array(1, 2, 3); array(array(1, 2), array(3, 4))

MAP
- Definition: map<string, string>; map<smallint, array<string>>
- Constructors: map("k1", "v1", "k2", "v2"); map(1S, array('a', 'b'), 2S, array('x', 'y'))

STRUCT
- Definition: struct<x:int, y:int>; struct<field1:bigint, field2:array<int>, field3:map<int, int>>
- Constructors: named_struct('x', 1, 'y', 2); named_struct('field1', 100L, 'field2', array(1, 2), 'field3', map(1, 100, 2, 200))

Built-in functions for constructing and operating on complex types (return type | signature | description):

MAP | map(K key1, V value1, K key2, V value2, ...) | Builds a map from the given key/value pairs; all keys must have the same type and all values must have the same type.
ARRAY | map_keys(MAP m) | Returns all keys of the map as an array; NULL input returns NULL.
ARRAY | map_values(MAP m) | Returns all values of the map as an array; NULL input returns NULL.
int | size(MAP) | Returns the number of elements in the given MAP.
TABLE | explode(MAP) | Table-generating function; expands the given MAP into one row per key/value pair, with two columns corresponding to key and value.
ARRAY | array(T value1, T value2, ...) | Builds an array from the given values; all values must have the same type.
int | size(ARRAY) | Returns the number of elements in the given ARRAY.
boolean | array_contains(ARRAY<T> a, value v) | Checks whether the given ARRAY a contains the value v.
ARRAY | sort_array(ARRAY) | Sorts the given array.
ARRAY | collect_list(T col) | Aggregate function; within a group, aggregates the values of the expression col into an array.
ARRAY | collect_set(T col) | Aggregate function; within a group, aggregates the values of the expression col into an array without duplicate elements.
TABLE | explode(ARRAY) | Table-generating function; expands the given ARRAY into one row per element, with one column for the element.
TABLE (int, T) | posexplode(ARRAY) | Table-generating function; expands the given ARRAY into one row per element, with two columns: the 0-based index and the element.
STRUCT | struct(T1 value1, T2 value2, ...) | Builds a struct from the given values; each value may be of any type; the generated field names are col1, col2, ...
STRUCT | named_struct(name1, value1, name2, value2, ...) | Builds a struct from the given name/value pairs; each value may be of any type; the generated field names are name1, name2, ...
TABLE (f1 T1, f2 T2, ...) | inline(ARRAY<STRUCT<f1:T1, f2:T2, ...>>) | Table-generating function; expands the given array of structs into one row per struct element, with one column per struct field.
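On the Java side of the SDK, each of these SQL types is described by a TypeInfo object (the examples later in this article cast a column's type info to StructTypeInfo, for instance). As a minimal sketch of how the three column types used in this article's sample table could be built programmatically, assuming the TypeInfoFactory API from odps-sdk-core (verify the method names against your SDK version):

import java.util.Arrays;

import com.aliyun.odps.type.ArrayTypeInfo;
import com.aliyun.odps.type.MapTypeInfo;
import com.aliyun.odps.type.StructTypeInfo;
import com.aliyun.odps.type.TypeInfo;
import com.aliyun.odps.type.TypeInfoFactory;

public class ComplexTypeInfoSketch {
    public static void main(String[] args) {
        // ARRAY<BIGINT>
        ArrayTypeInfo arrayType = TypeInfoFactory.getArrayTypeInfo(TypeInfoFactory.BIGINT);
        // MAP<STRING, BIGINT>
        MapTypeInfo mapType = TypeInfoFactory.getMapTypeInfo(TypeInfoFactory.STRING, TypeInfoFactory.BIGINT);
        // STRUCT<name:STRING, age:BIGINT>
        StructTypeInfo structType = TypeInfoFactory.getStructTypeInfo(
                Arrays.asList("name", "age"),
                Arrays.<TypeInfo>asList(TypeInfoFactory.STRING, TypeInfoFactory.BIGINT));

        System.out.println(arrayType.getTypeName());   // prints something like ARRAY<BIGINT>
        System.out.println(mapType.getTypeName());
        System.out.println(structType.getTypeName());
    }
}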
Tunnel SDK introduction

Tunnel is the data channel of ODPS. Users can upload data to, or download data from, ODPS through Tunnel.
TableTunnel is the entry class for accessing the ODPS Tunnel service; it supports only uploading and downloading table data (not views).
The process of uploading or downloading a table or partition is called a session. A session consists of one or more HTTP requests to the Tunnel RESTful API.
A session is identified by its session ID and times out after 24 hours. If a bulk data transfer would take longer than 24 hours, it must be split across multiple sessions.
TableTunnel.UploadSession and TableTunnel.DownloadSession are responsible for uploading and downloading data, respectively.
TableTunnel provides methods for creating UploadSession and DownloadSession objects, as sketched below.
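Here is a minimal sketch of creating a session and, within the 24-hour window, reattaching to it by its ID. createUploadSession appears in the full example later in this article; getUploadSession(project, table, partitionSpec, id) is assumed from odps-sdk-core and should be verified against your SDK version:

import com.aliyun.odps.Odps;
import com.aliyun.odps.PartitionSpec;
import com.aliyun.odps.tunnel.TableTunnel;
import com.aliyun.odps.tunnel.TableTunnel.UploadSession;
import com.aliyun.odps.tunnel.TunnelException;

public class TunnelSessionSketch {
    // Create a fresh upload session and remember its ID.
    static String startUpload(Odps odps, String project, String table,
                              PartitionSpec spec) throws TunnelException {
        TableTunnel tunnel = new TableTunnel(odps);
        UploadSession session = tunnel.createUploadSession(project, table, spec);
        return session.getId();  // persist the ID if the transfer may be resumed later
    }

    // Reattach to an existing session within its 24-hour lifetime.
    // getUploadSession(...) is assumed here; check your odps-sdk-core version.
    static UploadSession resumeUpload(Odps odps, String project, String table,
                                      PartitionSpec spec, String sessionId) throws TunnelException {
        TableTunnel tunnel = new TableTunnel(odps);
        return tunnel.getUploadSession(project, table, spec, sessionId);
    }
}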
Typical table data upload process:
1) create a TableTunnel
2) create an UploadSession
3) create a RecordWriter and write Records
4) commit the upload
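Mapped onto SDK calls (all of which appear in the complete example at the end of this article), the four steps look roughly like this sketch for a non-partitioned table; error handling is omitted:

import java.io.IOException;

import com.aliyun.odps.Odps;
import com.aliyun.odps.data.Record;
import com.aliyun.odps.data.RecordWriter;
import com.aliyun.odps.tunnel.TableTunnel;
import com.aliyun.odps.tunnel.TableTunnel.UploadSession;
import com.aliyun.odps.tunnel.TunnelException;

public class UploadSkeleton {
    static void upload(Odps odps, String project, String table)
            throws TunnelException, IOException {
        TableTunnel tunnel = new TableTunnel(odps);                  // 1) create a TableTunnel
        UploadSession session =
                tunnel.createUploadSession(project, table);          // 2) create an UploadSession
        RecordWriter writer = session.openRecordWriter(0);           // 3) create a RecordWriter...
        Record record = session.newRecord();
        // ... set field values on the record here ...
        writer.write(record);                                        //    ...and write Records
        writer.close();
        session.commit(new Long[]{0L});                              // 4) commit the upload
    }
}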
Typical table data download process:
1) create a TableTunnel
2) create a DownloadSession
3) create a RecordReader and read Records
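The download counterpart, again as a rough sketch for a non-partitioned table, with getRecordCount reporting how many records the session exposes:

import java.io.IOException;

import com.aliyun.odps.Odps;
import com.aliyun.odps.data.Record;
import com.aliyun.odps.data.RecordReader;
import com.aliyun.odps.tunnel.TableTunnel;
import com.aliyun.odps.tunnel.TableTunnel.DownloadSession;
import com.aliyun.odps.tunnel.TunnelException;

public class DownloadSkeleton {
    static void download(Odps odps, String project, String table)
            throws TunnelException, IOException {
        TableTunnel tunnel = new TableTunnel(odps);                   // 1) create a TableTunnel
        DownloadSession session =
                tunnel.createDownloadSession(project, table);         // 2) create a DownloadSession
        RecordReader reader =
                session.openRecordReader(0, session.getRecordCount()); // 3) create a RecordReader...
        Record record;
        while ((record = reader.read()) != null) {                    //    ...and read Records until exhausted
            // ... consume the record's fields here ...
        }
    }
}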
Constructing complex type data with the Tunnel SDK
Code example:
RecordWriter recordWriter = uploadSession.openRecordWriter(0);
ArrayRecord record = (ArrayRecord) uploadSession.newRecord();

// prepare data
List arrayData = Arrays.asList(1, 2, 3);
Map mapData = new HashMap();
mapData.put("a", 1L);
mapData.put("c", 2L);
List structData = new ArrayList();
structData.add("Lily");
structData.add(18);

// set data to record
record.setArray(0, arrayData);
record.setMap(1, mapData);
record.setStruct(2, new SimpleStruct((StructTypeInfo) schema.getColumn(2).getTypeInfo(), structData));

// write the record
recordWriter.write(record);
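Note that this snippet only writes the record. As the complete example below shows, the upload does not take effect until the writer is closed and the session is committed with the IDs of the blocks that were written:

// close the writer, then commit; block 0 matches the openRecordWriter(0) call above
recordWriter.close();
uploadSession.commit(new Long[]{0L});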
Download complex type data from MaxCompute

Code example:
RecordReader recordReader = downloadSession.openRecordReader(0, 1);

// read the record
ArrayRecord record1 = (ArrayRecord) recordReader.read();

// get array field data
List field0 = record1.getArray(0);
List<Long> longField0 = record1.getArray(Long.class, 0);

// get map field data
Map field1 = record1.getMap(1);
Map<String, Long> typedField1 = record1.getMap(String.class, Long.class, 1);

// get struct field data
Struct field2 = record1.getStruct(2);
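The returned Struct can then be taken apart field by field. A small sketch, assuming the getFieldCount/getFieldName/getFieldValue accessors of com.aliyun.odps.data.Struct (verify against your SDK version):

// iterate over the struct's fields generically
for (int i = 0; i < field2.getFieldCount(); i++) {
    System.out.println(field2.getFieldName(i) + " = " + field2.getFieldValue(i));
}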
Run the example

The complete code is as follows:
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.aliyun.odps.Odps;
import com.aliyun.odps.PartitionSpec;
import com.aliyun.odps.TableSchema;
import com.aliyun.odps.account.Account;
import com.aliyun.odps.account.AliyunAccount;
import com.aliyun.odps.data.ArrayRecord;
import com.aliyun.odps.data.RecordReader;
import com.aliyun.odps.data.RecordWriter;
import com.aliyun.odps.data.SimpleStruct;
import com.aliyun.odps.data.Struct;
import com.aliyun.odps.tunnel.TableTunnel;
import com.aliyun.odps.tunnel.TableTunnel.UploadSession;
import com.aliyun.odps.tunnel.TableTunnel.DownloadSession;
import com.aliyun.odps.tunnel.TunnelException;
import com.aliyun.odps.type.StructTypeInfo;

public class TunnelComplexTypeSample {

  private static String accessId = "<your access id>";
  private static String accessKey = "<your access key>";
  private static String odpsUrl = "<your odps endpoint>";
  private static String project = "<your project>";
  private static String table = "<your table name>";

  // partitions of a partitioned table, eg: "pt='1',ds='2'"
  // if the table is not a partitioned table, it is not needed
  private static String partition = "<your partition spec>";

  public static void main(String[] args) {
    Account account = new AliyunAccount(accessId, accessKey);
    Odps odps = new Odps(account);
    odps.setEndpoint(odpsUrl);
    odps.setDefaultProject(project);

    try {
      TableTunnel tunnel = new TableTunnel(odps);
      PartitionSpec partitionSpec = new PartitionSpec(partition);

      // ---------- Upload Data ----------
      // create upload session for table
      // the table schema is {"col0": ARRAY<BIGINT>, "col1": MAP<STRING, BIGINT>,
      // "col2": STRUCT<name:STRING, age:BIGINT>}
      UploadSession uploadSession = tunnel.createUploadSession(project, table, partitionSpec);
      // get table schema
      TableSchema schema = uploadSession.getSchema();

      // open record writer
      RecordWriter recordWriter = uploadSession.openRecordWriter(0);
      ArrayRecord record = (ArrayRecord) uploadSession.newRecord();

      // prepare data
      List arrayData = Arrays.asList(1, 2, 3);
      Map mapData = new HashMap();
      mapData.put("a", 1L);
      mapData.put("c", 2L);
      List structData = new ArrayList();
      structData.add("Lily");
      structData.add(18);

      // set data to record
      record.setArray(0, arrayData);
      record.setMap(1, mapData);
      record.setStruct(2, new SimpleStruct((StructTypeInfo) schema.getColumn(2).getTypeInfo(), structData));

      // write the record
      recordWriter.write(record);

      // close writer
      recordWriter.close();

      // commit uploadSession; the upload is finished
      uploadSession.commit(new Long[]{0L});
      System.out.println("upload success!");

      // ---------- Download Data ----------
      // create download session for table
      DownloadSession downloadSession = tunnel.createDownloadSession(project, table, partitionSpec);
      schema = downloadSession.getSchema();

      // open record reader; read one record here as an example
      RecordReader recordReader = downloadSession.openRecordReader(0, 1);

      // read the record
      ArrayRecord record1 = (ArrayRecord) recordReader.read();

      // get array field data
      List field0 = record1.getArray(0);
      List<Long> longField0 = record1.getArray(Long.class, 0);

      // get map field data
      Map field1 = record1.getMap(1);
      Map<String, Long> typedField1 = record1.getMap(String.class, Long.class, 1);

      // get struct field data
      Struct field2 = record1.getStruct(2);

      System.out.println("download success!");
    } catch (TunnelException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

After reading the above, have you mastered how to upload and download MaxCompute complex type data using the Tunnel SDK?