
Example Analysis of Ranger Hive-HDFS ACL synchronization

Updated 2025-01-18. From: SLTechnology News&Howtos (Shulou.com), 06/01.

This article walks through Ranger Hive-HDFS ACL synchronization: what it does, its limitations, and how to configure it.

Overview of Ranger Hive-HDFS ACL synchronization

The Ranger Resource Mapping Server (RMS) automatically translates access policies defined for Hive into access to HDFS.

About Hive-HDFS ACL synchronization

Users of legacy CDH clusters relied on Hive policies in Apache Sentry, which automatically linked Hive permissions with HDFS ACLs. This was especially convenient for external table data used by Spark or Hive.

Previously, Ranger only supported managing Hive policies and HDFS policies separately. With Ranger RMS, you can use the policies defined for a Hive table to authorize access to the table's HDFS directories and files. RMS is the service that enables Hive-HDFS ACL synchronization.

RMS periodically connects to the Hive Metastore and extracts the mapping from Hive metadata (database names and table names) to HDFS file locations. The Ranger HDFS plug-in, which runs in the NameNode, has been extended with an additional HivePolicyEnforcer module. The HDFS plug-in downloads the Hive policies from Ranger Admin and the mapping from Ranger RMS. HDFS access is then determined by both the HDFS policies and the Hive policies.

Assumptions and limitations of Ranger RMS

Ranger RMS assumes that all partitions of a table are stored under the location specified for the table. Permissions on the table therefore do not grant access to partitions whose data is stored outside that location. For example, if a table is located in the HDFS directory /warehouse/foo, all of its partitions must have locations under /warehouse/foo.

When you deploy a CDP Private Cloud Base cluster, the Ranger RMS service is not set up automatically. You must install and configure Ranger RMS separately.

Before starting RMS and running the first synchronization from Hive Metastore (HMS), configure the Ranger policies that give the rangerrms user access.

The Ranger RMS ACL synchronization feature supports a single logical HMS for evaluating access to HDFS through Hive permissions. This is consistent with the Sentry implementation in CDH.

Permissions granted on views (traditional and materialized) do not extend to HDFS access. This is consistent with the Sentry implementation in CDH.

If a Private Cloud Base deployment supports multiple logical HMS instances through a single Ranger, Ranger ACL synchronization works for only one of them. Permissions granted on databases and tables in the other logical HMS instances are not considered when authorizing HDFS access.
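The partition-location assumption above can be checked ahead of time. The following is a minimal Python sketch, assuming you have already collected the table location and partition locations (for example from the Hive Metastore); the function name and the sample paths are hypothetical and not part of Ranger RMS.

```python
from pathlib import PurePosixPath


def partitions_outside_table(table_location, partition_locations):
    """Return the partition paths that are not under the table location."""
    table = PurePosixPath(table_location)
    outside = []
    for location in partition_locations:
        path = PurePosixPath(location)
        # Compare whole path components, so /warehouse/foo_archive is not
        # mistaken for a child of /warehouse/foo.
        if path != table and table not in path.parents:
            outside.append(location)
    return outside


# Hypothetical sample data for illustration only.
partitions = [
    "/warehouse/foo/dt=2024-01-01",
    "/warehouse/foo_archive/dt=2023-01-01",
    "/data/external/foo/dt=2022-01-01",
]
print(partitions_outside_table("/warehouse/foo", partitions))
```

Any path the function returns would not be covered by permissions granted on the table.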

Comparison with Sentry HDFS ACL synchronization

The RMS ACL synchronization feature resembles the Sentry HDFS ACL synchronization feature in the way it downloads and keeps track of how Hive tables map to HDFS locations.

It differs from Sentry in that it transparently supports everything that can be expressed in Ranger policies. The implementation therefore includes support for tag-based policies, security zones, masking and row filtering, and audit logging.

In addition, the feature can be enabled or disabled with a simple configuration change on the HDFS side, so each installation can choose whether to turn it on.

Original link: https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/security-ranger-rms-configuring-and-using/topics/security-ranger-rms-overview.html

Configure high availability for Hive-HDFS ACL synchronization

Use the following steps to configure high availability for Ranger Resource Mapping Server (RMS) and Hive-HDFS ACL Sync.

In Cloudera Manager, select Ranger RMS, and then select Actions > Add Role Instances.

On the Add Role Instances page, click Select hosts.

On the Selected Hosts page, select the backup Ranger RMS host. The Ranger RMS (RR) icon appears in the Added Roles column for the selected host. Click OK to continue.

The Add Role Instances page is redisplayed with the new backup host. Click Continue.

Review the settings on the Review Changes page, and then click Continue.

The new role instance appears on the Ranger RMS page.

In Cloudera Manager, select Ranger RMS, and then click Configuration.

Select the Ranger RMS Server High Availability check box. In the Ranger RMS Server ID box, enter a comma-separated list of IDs, one for each RMS server.

Use the add (+) icon of the Ranger RMS Server Advanced Configuration Snippet (Safety Valve) for ranger-rms-conf/ranger-rms-site.xml to add an entry for each RMS host and its corresponding server ID:

ranger-rms.server.address.id1=<hostname of RMS server id1>:8383

ranger-rms.server.address.id2=<hostname of RMS server id2>:8383

Note

If SSL is enabled, use port 8484.

Click Save Changes, and then click the restart icon.

On the Stale Configurations page, click Restart Stale Services.

On the Restart Stale Services page, select the Re-deploy client configuration check box, and then click Restart Now.

A progress indicator page is displayed while the services restart. After the restart completes, click Finish.
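The server ID list and the per-host address entries from the steps above follow a regular pattern, so they can be generated rather than typed by hand. This is a minimal Python sketch under stated assumptions: the helper name and the example host names are hypothetical, and the IDs are assumed to be id1, id2, and so on, as in the entries shown earlier.

```python
def rms_ha_properties(hosts, ssl_enabled=False):
    """Build the server ID list and address entries for Ranger RMS HA.

    Returns (comma_separated_ids, {property_name: "host:port"}).
    """
    # Port 8383 by default, 8484 when SSL is enabled (per the note above).
    port = 8484 if ssl_enabled else 8383
    ids = [f"id{n}" for n in range(1, len(hosts) + 1)]
    properties = {
        f"ranger-rms.server.address.{server_id}": f"{host}:{port}"
        for server_id, host in zip(ids, hosts)
    }
    return ",".join(ids), properties


# Hypothetical host names for illustration only.
id_list, props = rms_ha_properties(["rms1.example.com", "rms2.example.com"])
print(id_list)
for name, value in props.items():
    print(f"{name}={value}")
```

The first value goes into the server ID box and the remaining lines into the safety valve, one entry per host.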

Original link: https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/security-ranger-rms-configuring-and-using/topics/security-ranger-rms-configure-ha.html

Configure Hive-HDFS ACL synchronization

The Ranger Resource Mapping Server (RMS) should be fully configured after installation. This topic provides more information about RMS configuration settings and workflows.

Important configuration information

Ranger RMS enables HDFS access through Ranger Hive policies. Ranger RMS must be configured with the names of the HDFS and Hive Ranger services (also known as repos). Your installation may have multiple Ranger services created for HDFS and Hive; these are listed in the Ranger Admin Web UI. RMS ACL synchronization works for one specific pair of HDFS and Hive Ranger services, so it is important to identify these service names before installing Ranger RMS and to configure them during the installation. The default Ranger HDFS service name is cm_hdfs, and the default Ranger Hive service name is cm_hive.

Before starting the Ranger RMS installation, make sure that the Hive service identified above grants the rangerrms user select access to all databases and all tables, in all security zones of the Hive service, by default.
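If no such policy exists yet, one way to express it is as a Ranger policy document. The sketch below only builds the JSON payload in Python and does not send it anywhere. It assumes the standard Ranger policy model (service, resources, policyItems) and the Hive resource names database, table, and column; the policy name is hypothetical, and the sketch covers a single policy rather than one per security zone.

```python
import json


def wildcard_resource():
    """A policy resource entry that matches every value."""
    return {"values": ["*"], "isExcludes": False, "isRecursive": False}


def rangerrms_select_policy(hive_service="cm_hive",
                            policy_name="rangerrms-select-all"):
    """Build a policy giving the rangerrms user select on all tables."""
    return {
        "service": hive_service,        # Ranger Hive service (repo) name
        "name": policy_name,            # hypothetical policy name
        "isEnabled": True,
        "resources": {
            "database": wildcard_resource(),
            "table": wildcard_resource(),
            "column": wildcard_resource(),
        },
        "policyItems": [
            {
                "users": ["rangerrms"],
                "accesses": [{"type": "select", "isAllowed": True}],
                "delegateAdmin": False,
            }
        ],
    }


print(json.dumps(rangerrms_select_policy(), indent=2))
```

The same policy can also be created by hand in the Ranger Admin Web UI; the payload is mainly useful for reviewing what the policy needs to contain.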

By default, Ranger RMS tracks only external tables in Hive. To configure Ranger RMS to also track managed Hive tables, add the following configuration settings to Ranger RMS.

ranger-rms.HMS.map.managed.tables=true

In Cloudera Manager, select HDFS > Configuration > HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml, and then confirm the following settings:

ranger.plugin.hdfs.chained.services = cm_hive

ranger.plugin.hdfs.chained.services.cm_hive.impl = org.apache.ranger.chainedplugin.hdfs.hive.RangerHdfsHiveChainedPlugin
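The two chained-plugin settings can be compared against a copy of the actual configuration. This is a minimal Python sketch, assuming the settings have already been read into a dictionary; the helper names are hypothetical, and the sketch assumes the second property name embeds the Hive service name, as it does for cm_hive above.

```python
PLUGIN_IMPL = (
    "org.apache.ranger.chainedplugin.hdfs.hive.RangerHdfsHiveChainedPlugin"
)


def expected_chained_settings(hive_service="cm_hive"):
    """Settings expected in ranger-hdfs-security.xml for a Hive service."""
    return {
        "ranger.plugin.hdfs.chained.services": hive_service,
        f"ranger.plugin.hdfs.chained.services.{hive_service}.impl": PLUGIN_IMPL,
    }


def find_mismatches(actual, hive_service="cm_hive"):
    """Return {property: (expected, actual)} for missing or wrong values."""
    expected = expected_chained_settings(hive_service)
    return {
        name: (value, actual.get(name))
        for name, value in expected.items()
        if actual.get(name) != value
    }


# Hypothetical current configuration with the .impl property missing.
current = {"ranger.plugin.hdfs.chained.services": "cm_hive"}
for name, (want, got) in find_mismatches(current).items():
    print(f"{name}: expected {want!r}, found {got!r}")
```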

Note

If you change any of these settings after Ranger RMS has started and synchronized with the Hive Metastore, the only way to make Ranger RMS use the new configuration is to perform the following steps:

Stop Ranger RMS.

Log in to the Ranger RMS database and run delete from x_rms_mapping_provider; to delete the only row from the table.

Start Ranger RMS.

After restarting, Ranger RMS resynchronizes all data from the Hive Metastore. This can take a long time, depending on the number of Hive tables in the Hive Metastore.

Understanding Ranger policies with RMS

At a high level, the Ranger RMS workflow is as follows:

The Ranger policies for the HDFS service are evaluated. If any policy explicitly denies access, access is denied.

Ranger checks whether the accessed location maps to a Hive table.

If it does, the Hive policies are evaluated for the mapped Hive table. If an HDFS policy allows access, access is allowed. Otherwise, the default HDFS ACLs determine access.

The requested HDFS permission maps to Hive permissions as follows:

HDFS 'read' => Hive 'select'
HDFS 'write' => Hive 'update' or 'alter'
HDFS 'execute' => any Hive permission

If no Hive policy explicitly allows access to the mapped table, access is denied; otherwise, access is allowed.
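The permission mapping in the workflow above can be written down as a small lookup. This Python sketch is illustrative only: the names are hypothetical, and it assumes that for an HDFS write either 'update' or 'alter' on the Hive side is enough.

```python
# HDFS permission -> Hive permissions that satisfy it.
# None stands for "any Hive permission".
HDFS_TO_HIVE = {
    "read": {"select"},
    "write": {"update", "alter"},
    "execute": None,
}


def hive_grants_satisfy(hdfs_permission, granted_hive_permissions):
    """Check whether the granted Hive permissions cover an HDFS request."""
    granted = set(granted_hive_permissions)
    required = HDFS_TO_HIVE[hdfs_permission]
    if required is None:
        # 'execute' is satisfied by holding any Hive permission at all.
        return bool(granted)
    # Assumption: one of the listed Hive permissions is sufficient.
    return bool(granted & required)


print(hive_grants_satisfy("read", {"select"}))
print(hive_grants_satisfy("write", {"select"}))
print(hive_grants_satisfy("execute", {"select"}))
```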

Appropriate tag-based policies are considered during both the HDFS access evaluation and the Hive access evaluation phases, where applicable. Likewise, one or more audit log records are generated to indicate which policy, if any, made the access decision.

The following scenarios illustrate how access is determined. All scenarios assume that no Ranger HDFS policy explicitly denies access to the HDFS location.

The location does not correspond to a Hive table.

In this case, access is granted only if a Ranger HDFS policy allows it or the native HDFS ACLs allow it. The audit log shows which policy (or hadoop-acl) made the decision.

The location corresponds to a Hive table.

A Ranger Hive policy explicitly denies access to the mapped table for any of the accesses derived from the original HDFS request. Access is denied by the Hive policy.

No matching Ranger Hive policy exists.

Access is denied. The audit log does not specify a policy.

A Ranger Hive masking policy masks some columns in the mapped table.

Access is denied. The audit log shows the Hive masking policy.

The mapped Hive table has a row-filter policy.

Access is denied. The audit log shows the Hive row-filter policy.

A Ranger Hive policy allows access to the mapped table for the accesses derived from the original HDFS request.

Access is granted. If the access was originally granted by an HDFS policy, the audit log shows the HDFS policy.
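The scenarios above can be summarized as a small decision function. This Python sketch is a simplified model, not the Ranger plug-in's actual code: all names are hypothetical, the inputs are precomputed booleans rather than real policy evaluations, and the order in which the Hive checks run is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AccessContext:
    hdfs_policy_denies: bool = False
    hdfs_policy_allows: bool = False
    native_acl_allows: bool = False
    maps_to_hive_table: bool = False
    hive_policy_denies: bool = False
    hive_policy_allows: bool = False
    hive_masking_applies: bool = False
    hive_row_filter_applies: bool = False


def decide(ctx):
    """Return (allowed, audit_source); audit_source is None if no policy is named."""
    if ctx.hdfs_policy_denies:
        return False, "hdfs-policy"
    if not ctx.maps_to_hive_table:
        if ctx.hdfs_policy_allows:
            return True, "hdfs-policy"
        # No HDFS policy decided, so the native HDFS ACLs decide.
        return ctx.native_acl_allows, "hadoop-acl"
    if ctx.hive_policy_denies:
        return False, "hive-policy"
    if ctx.hive_masking_applies:
        return False, "hive-masking-policy"
    if ctx.hive_row_filter_applies:
        return False, "hive-row-filter-policy"
    if not ctx.hive_policy_allows:
        # No matching Hive policy: denied, and no policy is recorded.
        return False, None
    return True, ("hdfs-policy" if ctx.hdfs_policy_allows else "hive-policy")


print(decide(AccessContext(maps_to_hive_table=True, hive_policy_allows=True)))
print(decide(AccessContext(maps_to_hive_table=True, hive_masking_applies=True)))
print(decide(AccessContext(native_acl_allows=True)))
```

Modeling the inputs as booleans keeps the sketch focused on the decision order; in a real deployment each flag is the outcome of evaluating Ranger policies for the request.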
