How to authenticate and authorize metastore in Hive

2025-04-02 Update From: SLTechnology News&Howtos

This article explains in detail how authentication and authorization work in the Hive metastore. It is shared here as a practical reference.

2 Introduction to Metastore

2.1 Metastore service

Metastore maintains two main types of data:

Metadata about databases, tables, partitions, and so on. Operations on this data can be regarded as DDL operations.

Permission and role data. Operations on this data can be regarded as DCL operations.

Metastore exposes operations on both kinds of data through a Thrift RPC service. The interface is ThriftHiveMetastore.Iface.

Operations on the first kind of data include, for example, create_database, create_table, and add_partition.

Operations on the second kind of data include, for example, create_role, grant_role, and grant_privileges.

The Metastore class that implements this interface is HMSHandler. Its main attributes are:

ThreadLocal<RawStore> threadLocalMS

RawStore handles interaction with the backing database and stores both kinds of data described above. A RawStore instance is bound to the current thread. The default implementation is ObjectStore.

List<MetaStorePreEventListener> preListeners

Before either kind of data is changed, these MetaStorePreEventListener instances are called first to do some preprocessing, for example verifying that the user has permission to perform the operation.

List<MetaStoreEventListener> listeners

After the first kind of data has been changed, these MetaStoreEventListener instances are called to do follow-up processing.
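The relationship between these three attributes can be sketched in plain Java. This is a minimal illustration that uses only the JDK, not Hive's real classes: HandlerSketch, Store, PreListener, PostListener and createTable are hypothetical stand-ins, and vetoing a change by throwing an unchecked exception is an assumption made to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only model of the handler described above; not Hive's actual classes.
class HandlerSketch {

    /** Stand-in for MetaStorePreEventListener: may veto a change by throwing. */
    interface PreListener { void onEvent(String event); }

    /** Stand-in for MetaStoreEventListener: notified after a change succeeded. */
    interface PostListener { void onEvent(String event); }

    /** Stand-in for RawStore: holds the metadata seen by one worker thread. */
    static class Store {
        final List<String> tables = new ArrayList<>();
    }

    // One store per thread, created lazily the first time a thread asks for it.
    static final ThreadLocal<Store> threadLocalMS = ThreadLocal.withInitial(Store::new);

    final List<PreListener> preListeners = new ArrayList<>();
    final List<PostListener> listeners = new ArrayList<>();

    /** A DDL-style call: pre-listeners first, then the change, then post-listeners. */
    void createTable(String name) {
        String event = "CREATE_TABLE " + name;
        for (PreListener l : preListeners) {
            l.onEvent(event); // an exception here aborts the change
        }
        threadLocalMS.get().tables.add(name);
        for (PostListener l : listeners) {
            l.onEvent(event);
        }
    }

    public static void main(String[] args) {
        HandlerSketch handler = new HandlerSketch();
        handler.preListeners.add(e -> {
            if (e.endsWith("forbidden")) {
                throw new SecurityException("denied: " + e);
            }
        });
        handler.listeners.add(e -> System.out.println("done: " + e));

        handler.createTable("t1");
        try {
            handler.createTable("forbidden");
        } catch (SecurityException ex) {
            System.out.println(ex.getMessage());
        }
        System.out.println(threadLocalMS.get().tables);
    }
}
```

Because the pre-listener runs before the store is touched, a rejected request leaves no trace in the store, and the post-listeners never hear about it.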

2.2 Authentication and authorization of Metastore services

As described above, permission checks can run before metastore data is modified. Hive ships with AuthorizationPreEventListener, an implementation of MetaStorePreEventListener, to perform these checks.

Oddly, this listener checks only DDL operations, not DCL operations. In other words, anyone who can connect directly to the metastore service can grant privileges at will.

Next, let's take a closer look at the process of authentication and authorization.

Two interfaces are involved:

HiveMetastoreAuthenticationProvider: the authentication provider. Each HiveMetastoreAuthenticationProvider object corresponds to one user and holds that user's userName and groups information.

HiveMetastoreAuthorizationProvider: the authorization provider, which checks the user's permissions.

Each is explained in detail below.

2.2.1 Authentication of Metastore services

The default implementation of HiveMetastoreAuthenticationProvider is HadoopDefaultMetastoreAuthenticator, which uses Hadoop's UserGroupInformation to obtain the current user name and group information.

How does the identity of a metastore client reach this point?

This involves the Thrift RPC service started by the metastore. Metastore uses Thrift's TThreadPoolServer as its server. In this mode a thread pool is started, each client connection takes one thread from the pool, and the connection occupies that thread for as long as it stays open. This is the traditional blocking I/O (BIO) model, which supports only a limited number of concurrent connections.
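To make the thread-per-connection limit concrete, here is a small JDK-only sketch. It does not use Thrift at all: BlockingPoolSketch and peakConcurrency are hypothetical names, the 50 ms sleep stands in for the lifetime of a connection, and the fixed-size pool that queues extra work is a simplification rather than a description of how TThreadPoolServer treats connections beyond its worker limit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical JDK-only model; it does not use Thrift or TThreadPoolServer itself.
class BlockingPoolSketch {

    /**
     * Serves `connections` simulated client connections on `workers` threads and
     * returns the largest number of connections that were served at the same time.
     */
    static int peakConcurrency(int workers, int connections) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < connections; i++) {
            pool.execute(() -> {
                // The connection holds this worker thread until it closes.
                peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(50); // simulated lifetime of the connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Eight clients compete for two worker threads.
        System.out.println("peak concurrent connections: " + peakConcurrency(2, 8));
    }
}
```

However many clients connect, the number served at once never exceeds the number of worker threads, which is why long-lived idle connections are expensive in this model.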

The metastore RPC service can run in one of three modes:

SASL mode:

The secure mode, which uses Kerberos to authenticate the user's identity. This topic involves a lot of material and may be analyzed in detail later.

setUgi mode:

An insecure mode in which the user's identity is not verified. It is used when hive.metastore.execute.setugi=true. In this mode the following happens:

The client passes the user's ugi information to the server through the set_ugi method.

The server binds the user's ugi information to the current connection.

The client sends a data operation request to the server. The server takes the ugi information from the connection and performs the operation through the ugi's doAs method.

The client connection occupies one thread on the server, and a HiveMetastoreAuthenticationProvider object is created for it. During creation the object reads the current ugi information, which is the ugi described above, and the object is then bound to that thread. From this point the client's user information is available. The groups information, however, is obtained in Hadoop's default way, that is, from the groups the user belongs to on the machine where the metastore service runs.

Other:

Only the client's IP is bound to the current thread, and there is no user information at all.

SASL mode also ends up obtaining user information through ugi.doAs, but it adds a step that verifies the user's identity. In setUgi mode the server simply treats the user as whoever the client claims to be. So only SASL is secure, and the other two modes are not.
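The setUgi flow, and the reason it is insecure, can be sketched as follows. This is a JDK-only model under stated assumptions: SetUgiSketch, setUgi, handleRequest and the SERVER_GROUPS map are hypothetical, and a ThreadLocal stands in for the binding between a connection, its worker thread and the user's ugi. It is not Hive's or Hadoop's code.

```java
import java.util.List;
import java.util.Map;

// Hypothetical JDK-only model of setUgi mode; not Hive's or Hadoop's actual classes.
class SetUgiSketch {

    // Identity bound to the worker thread that serves one connection.
    private static final ThreadLocal<String> connectionUser = new ThreadLocal<>();

    // Groups are resolved on the server side, not taken from the client.
    private static final Map<String, List<String>> SERVER_GROUPS =
            Map.of("alice", List.of("analysts"));

    /** Models set_ugi: the server stores whatever name the client sends, unverified. */
    static void setUgi(String claimedUser) {
        connectionUser.set(claimedUser);
    }

    /** Models a later request that runs under the identity bound to the connection. */
    static String handleRequest(String operation) {
        String user = connectionUser.get();
        if (user == null) {
            return operation + " with no user information";
        }
        List<String> groups = SERVER_GROUPS.getOrDefault(user, List.of());
        return operation + " as user=" + user + " groups=" + groups;
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("get_table"));

        setUgi("alice"); // nothing checks that the caller really is alice
        System.out.println(handleRequest("drop_table"));
    }
}
```

The only input to the identity decision is the string the client chose to send. A SASL/Kerberos setup differs precisely in that this claim is verified before it is bound to the connection.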

2.2.2 Authorization of Metastore services

The previous section showed how the user information is obtained. The next step is to verify whether that user has permission to perform the requested operation. The authorization interface is HiveMetastoreAuthorizationProvider, and the default implementation is DefaultHiveMetastoreAuthorizationProvider. Two implementation classes are described here:

DefaultHiveMetastoreAuthorizationProvider:

Reads permission information from the database based on the userName and groups information, and uses it to verify whether the user has the required permissions.

StorageBasedAuthorizationProvider:

Decides whether a user has permissions on databases, tables, and so on, based on whether the user has the corresponding permissions on the underlying files in HDFS.
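The difference between the two strategies can be shown with a simplified model. Everything in this sketch is hypothetical and JDK-only: the Authorizer interface, the "user:" and "group:" key prefixes, and the TableDir class with a POSIX-style mode are assumptions chosen for illustration, and only a write check is modelled. The real providers work against the metastore database and HDFS respectively.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical JDK-only model of the two strategies; not Hive's actual providers.
class AuthorizationSketch {

    interface Authorizer {
        boolean canWrite(String user, List<String> groups, String table);
    }

    /** Grant lookup: the user or one of the groups must hold a stored grant. */
    static class GrantBased implements Authorizer {
        // principal ("user:x" or "group:y") -> tables that principal may write
        private final Map<String, Set<String>> grants;

        GrantBased(Map<String, Set<String>> grants) {
            this.grants = grants;
        }

        @Override
        public boolean canWrite(String user, List<String> groups, String table) {
            if (grants.getOrDefault("user:" + user, Set.of()).contains(table)) {
                return true;
            }
            for (String g : groups) {
                if (grants.getOrDefault("group:" + g, Set.of()).contains(table)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Owner, group and POSIX-style mode of the directory that backs a table. */
    static class TableDir {
        final String owner;
        final String group;
        final int mode; // for example 0750

        TableDir(String owner, String group, int mode) {
            this.owner = owner;
            this.group = group;
            this.mode = mode;
        }
    }

    /** Storage check: the decision follows the permissions on the table directory. */
    static class StorageBased implements Authorizer {
        private final Map<String, TableDir> dirs;

        StorageBased(Map<String, TableDir> dirs) {
            this.dirs = dirs;
        }

        @Override
        public boolean canWrite(String user, List<String> groups, String table) {
            TableDir d = dirs.get(table);
            if (d == null) {
                return false;
            }
            if (d.owner.equals(user)) {
                return (d.mode & 0200) != 0; // owner write bit
            }
            if (groups.contains(d.group)) {
                return (d.mode & 0020) != 0; // group write bit
            }
            return (d.mode & 0002) != 0;     // other write bit
        }
    }

    public static void main(String[] args) {
        Authorizer byGrant = new GrantBased(Map.of("group:analysts", Set.of("sales")));
        Authorizer byStorage = new StorageBased(
                Map.of("sales", new TableDir("etl", "analysts", 0750)));

        List<String> aliceGroups = List.of("analysts");
        System.out.println("grant-based:   " + byGrant.canWrite("alice", aliceGroups, "sales"));
        System.out.println("storage-based: " + byStorage.canWrite("alice", aliceGroups, "sales"));
    }
}
```

The same user and groups can get different answers from the two strategies, because one consults stored grants and the other consults file permissions. In both cases the result depends on which groups the server resolves for the user.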

3 Summary

The summary is as follows:

Metastore checks permissions only for DDL operations, not for DCL operations. Once connected to the metastore through the Hive CLI, a user can grant privileges at will.

Permissions are looked up by user name and by the groups the user belongs to. In a big data system it is best to maintain the user-to-group mapping yourself rather than relying on the users and groups of the default Linux machine; otherwise there will be a lot of trouble later.

A simple diagram of the default behavior (figure not reproduced).
