2025-02-23 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/01 Report--
This article explains how to integrate Atlas with Hive. The content is meant to be easy to follow and clearly organized; hopefully it resolves any doubts as we walk through the topic together.
Integrating Atlas with Hive
After installing Atlas, you need to connect it to other components before it becomes useful. One of the most common integrations is Hive. As the Atlas architecture shows, once the Hive hook is configured, every operation Hive performs is written to Kafka, consumed by Atlas, and displayed in Atlas as a lineage diagram.
Hive Model
Which Hive operations are recorded? Atlas defines a Hive model, which contains the following types:
1. Entity types:
hive_db
Type: Asset
Attributes: qualifiedName, name, description, owner, clusterName, location, parameters, ownerName
hive_table
Type: DataSet
Attributes: qualifiedName, name, description, owner, db, createTime, lastAccessTime, comment, retention, sd, partitionKeys, columns, aliases, parameters, viewOriginalText, viewExpandedText, tableType, temporary
hive_column
Type: DataSet
Attributes: qualifiedName, name, description, owner, type, comment, table
hive_storagedesc
Type: Referenceable
Attributes: qualifiedName, table, location, inputFormat, outputFormat, compressed, numBuckets, serdeInfo, bucketCols, sortCols, parameters, storedAsSubDirectories
hive_process
Type: Process
Attributes: qualifiedName, name, description, owner, inputs, outputs, startTime, endTime, userName, operationType, queryText, queryPlan, queryId, clusterName
hive_column_lineage
Type: Process
Attributes: qualifiedName, name, description, owner, inputs, outputs, query, depenendencyType, expression
2. Enumeration types:
hive_principal_type, values: USER, ROLE, GROUP
3. Struct types:
hive_order, attributes: col, order
hive_serde, attributes: name, serializationLib, parameters
The qualifiedName structure of the Hive entities (the angle-bracket placeholders were lost in the original rendering):
hive_db.qualifiedName: <dbName>@<clusterName>
hive_table.qualifiedName: <dbName>.<tableName>@<clusterName>
hive_column.qualifiedName: <dbName>.<tableName>.<columnName>@<clusterName>
hive_process.queryString: trimmed query string in lower case
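As an illustration, the qualifiedName patterns above can be sketched as small helper functions. This is just a Python sketch: the function names are my own, and the database/table/cluster names are illustrative (though "primary" matches the atlas.cluster.name configured later).

```python
# Sketch: build Atlas qualifiedName strings for Hive entities, following the
# patterns listed above. All concrete names here are illustrative examples.

def hive_db_qualified_name(db, cluster):
    # <dbName>@<clusterName>
    return f"{db}@{cluster}"

def hive_table_qualified_name(db, table, cluster):
    # <dbName>.<tableName>@<clusterName>
    return f"{db}.{table}@{cluster}"

def hive_column_qualified_name(db, table, column, cluster):
    # <dbName>.<tableName>.<columnName>@<clusterName>
    return f"{db}.{table}.{column}@{cluster}"

print(hive_table_qualified_name("default", "orders", "primary"))
# → default.orders@primary
```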
Configure the Hive hook
The Hive hook listens for Hive's create/update/delete operations. The configuration steps are as follows:
1. Modify hive-env.sh (point HIVE_AUX_JARS_PATH at the Atlas hook jars):
export HIVE_AUX_JARS_PATH=/opt/apps/apache-atlas-2.1.0/hook/hive
2. Modify hive-site.xml (restart Hive after configuring):
<property>
    <name>hive.exec.post.hooks</name>
    <value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
Note that hive.exec.post.hooks fires after a statement executes; Hive also supports pre-execution hooks (hive.exec.pre.hooks) and failure hooks (hive.exec.failure.hooks).
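If you prefer to script this edit rather than hand-edit the XML, the change can be applied programmatically. A minimal sketch, assuming a standard hive-site.xml with a top-level <configuration> element; the temporary file stands in for $HIVE_HOME/conf/hive-site.xml, and you should back up the real file before editing it.

```python
# Sketch: append the hive.exec.post.hooks property to a hive-site.xml file.
import tempfile
import xml.etree.ElementTree as ET

def add_post_hook(path, hook_class="org.apache.atlas.hive.hook.HiveHook"):
    """Append a <property> block for hive.exec.post.hooks to the file."""
    tree = ET.parse(path)
    root = tree.getroot()  # the <configuration> element
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = "hive.exec.post.hooks"
    ET.SubElement(prop, "value").text = hook_class
    tree.write(path, encoding="utf-8", xml_declaration=True)

# Demo against a throwaway stand-in file, not a real Hive installation.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
    f.write("<configuration></configuration>")
    path = f.name

add_post_hook(path)
print(ET.parse(path).find("./property/value").text)
# → org.apache.atlas.hive.hook.HiveHook
```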
3. Synchronize the configuration: copy the Atlas configuration file atlas-application.properties into the Hive configuration directory, and add:
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.queueSize=10000
atlas.cluster.name=primary
atlas.rest.address=http://doit33:21000
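One thing that bit me here is that property keys are case-sensitive, so it can be worth sanity-checking the copied file. A minimal sketch (not an Atlas API) that parses simple key=value properties text and reads back the hook settings listed above:

```python
# Sketch: minimal .properties parser to sanity-check the Atlas hook settings
# copied into the Hive conf directory. Real .properties files have more
# syntax (escapes, colons, continuations); this handles only key=value lines.

def load_properties(text):
    """Parse key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

conf = load_properties("""\
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.queueSize=10000
atlas.cluster.name=primary
atlas.rest.address=http://doit33:21000
""")
print(conf["atlas.rest.address"])
# → http://doit33:21000
```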
Import Hive metadata into Atlas
bin/import-hive.sh
Using Hive configuration directory [/opt/module/hive/conf]
Log file for import is /opt/module/atlas/logs/import-hive.log
log4j:WARN No such property [maxFileSize] in org.apache.log4j.PatternLayout.
log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.PatternLayout.
When prompted, enter the username admin and the password admin:
Enter username for atlas :- admin
Enter password for atlas :-
Hive Meta Data import was successful!!!
Pitfall notes
1. Cannot find the class org.apache.atlas.hive.hook.HiveHook
Cause: the third-party jar was not added to Hive. A quick check: run set in the hive shell, which prints the configuration variables overridden by the user, and see whether the jar path appears. Taking elasticsearch-hadoop-2.1.2.jar as an example, here are several ways to add a third-party jar to Hive.
1. Add it in the Hive shell:
hive> add jar /home/hadoop/elasticsearch-hadoop-hive-2.1.2.jar;
Effect: takes effect in the current Hive shell without restarting the Hive service; does not take effect for Hive Server.
2. Put the jar into the ${HIVE_HOME}/auxlib directory.
Create an auxlib folder under ${HIVE_HOME} and place custom jar files in it. This is the more convenient method.
Effect: a Hive shell picks it up without restarting the Hive service; Hive Server requires a restart of the Hive service.
3. HIVE_AUX_JARS_PATH in hive-env.sh and hive.aux.jars.path in hive-site.xml
Configured this way, the setting does not take effect for a running server; it applies to the current hive shell. Different hive shells do not affect each other, and each shell must be configured separately. Both settings support only local files, and can be configured as a single file or as a folder.
Effect: for both the Hive shell and Hive Server, the setting takes effect only after restarting the Hive service.
2. Hive reports the error "Failing because I am unlikely to write too"
Cause: HIVE_AUX_JARS_PATH is not configured correctly.
There is the following passage in the hive-env.sh script:

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
if [ "${HIVE_AUX_JARS_PATH}" != "" ]; then
  export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}
elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then
  export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog
fi
If you set a value for HIVE_AUX_JARS_PATH yourself, /usr/hdp/current/hive-webhcat/share/hcatalog is ignored, because Hive reads only one HIVE_AUX_JARS_PATH.
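The selection logic in that hive-env.sh snippet can be sketched as a small function to make the "only one path wins" behavior explicit. A Python sketch; the function name is my own and the HDP directory is the one from the script above.

```python
# Sketch of the hive-env.sh selection logic: an explicitly set
# HIVE_AUX_JARS_PATH wins, otherwise the HDP hcatalog directory is used
# if it exists, otherwise nothing is exported.
import os

def resolve_aux_jars_path(env,
                          hcatalog_dir="/usr/hdp/current/hive-webhcat/share/hcatalog"):
    """Return the aux-jars path Hive would end up with, or None."""
    if env.get("HIVE_AUX_JARS_PATH"):
        return env["HIVE_AUX_JARS_PATH"]
    if os.path.isdir(hcatalog_dir):
        return hcatalog_dir
    return None

print(resolve_aux_jars_path({"HIVE_AUX_JARS_PATH": "/opt/apps/apache-atlas-2.1.0/hook/hive"}))
# → /opt/apps/apache-atlas-2.1.0/hook/hive
```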
The workaround is to place the shared jars centrally in one location and then create corresponding symlinks under /usr/hdp/current/hive-webhcat/share/hcatalog:

sudo -u hive ln -s /usr/lib/share-lib/elasticsearch-hadoop-2.1.0.Beta4.jar /usr/hdp/current/hive-webhcat/share/hcatalog/elasticsearch-hadoop-2.1.0.Beta4.jar

That covers how Atlas integrates with Hive. Thank you for reading!