Apache Hive RBAC Configuration with SQL

Introduction

This guide covers common problems when configuring Apache Hive Role-Based Access Control (RBAC) with SQL, focusing on the errors that appear when setting up admin roles and permissions through SQL queries. We'll walk through a real-world example of troubleshooting and resolving these issues in a Docker-based Hive environment.

Understanding the Problem

Common Error Messages

Suppose you try to configure RBAC in Hive with queries like these:

SHOW ROLES;
SET ROLE admin;
CREATE ROLE test_role;
GRANT ROLE test_role TO USER tester;

The role-creation query, for example, can fail with different error messages depending on your connection method:

JDBC Connection (e.g., DBeaver)

SQL Error [1] [08S01]: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
Current user : root is not allowed to add roles. User has to belong to ADMIN role and have it as current role, for this action.

Command Line (e.g., beeline or hive -e)

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
Failed to retrieve roles for null: Required field 'principal_name' is unset! 
Struct:GetRoleGrantsForPrincipalRequest(principal_name:null, principal_type:USER)

Root Causes

The issues typically stem from:

  1. Incomplete authentication configuration
  2. Incorrect authorization provider settings
  3. Missing user-to-role mappings
  4. Improper service account permissions

Environment Setup for Apache Hive RBAC Configuration with SQL

Prerequisites

Before proceeding, ensure you have:

  • Administrative access to your Hive environment
  • Ability to modify Hive configuration files
  • Access to restart Hive services
  • Basic understanding of XML configuration files

Locating Configuration Files

First, locate the directory that holds your Hive configuration file, hive-site.xml. You can run these commands to check the common locations:

ls /etc/hive/conf/hive-site.xml
ls /etc/hadoop/conf/hive-site.xml
ls /usr/lib/hive/conf/hive-site.xml
ls /opt/hive/conf/hive-site.xml
ls $HIVE_HOME/conf/hive-site.xml

If none of these paths exists, search the whole filesystem instead:

find / -name "hive-site.xml" 2>/dev/null
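
The manual checks above can also be wrapped in a small helper. This is a minimal sketch, not part of Hive: the function name find_hive_site is made up for illustration, and the default directory list simply mirrors the common locations named in this guide.

```shell
# Print the first existing hive-site.xml among the given directories.
# With no arguments, fall back to the common locations from this guide
# (with $HIVE_HOME/conf tried first when HIVE_HOME is set).
find_hive_site() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/hive/conf /etc/hadoop/conf /usr/lib/hive/conf /opt/hive/conf
        if [ -n "${HIVE_HOME:-}" ]; then
            set -- "$HIVE_HOME/conf" "$@"
        fi
    fi
    for dir in "$@"; do
        if [ -f "$dir/hive-site.xml" ]; then
            printf '%s\n' "$dir/hive-site.xml"
            return 0
        fi
    done
    return 1
}
```

Calling find_hive_site with no arguments prints the first match, or returns a non-zero status if none of the candidate directories contains the file.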

File Permission Requirements

Make sure the file is readable by the account that runs the Hive services:

ls -l /opt/hive/conf/hive-site.xml
# Should show something like:
# -rw-r--r-- 1 root root 3342 Jan 31 16:04 /opt/hive/conf/hive-site.xml
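
If you prefer a scripted check, the sketch below compares the mode string against the pattern shown above. The helper name check_config_mode is hypothetical, and the rule it applies (world-readable, not world-writable) is an assumption drawn from the -rw-r--r-- example, not a requirement documented by Hive.

```shell
# Report whether a config file is world-readable and not world-writable,
# which is what the -rw-r--r-- mode in the example above provides.
check_config_mode() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    # First ten characters of ls -l output are the type and permission bits.
    mode=$(ls -ld "$file" | cut -c1-10)
    case $mode in
        ???????r-?) echo "ok: $mode $file"; return 0 ;;
        *)          echo "check: $mode $file"; return 1 ;;
    esac
}
```

For example, check_config_mode /opt/hive/conf/hive-site.xml reports "ok" for mode 644 and "check" for modes such as 600 or 666.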

Step-by-Step Solution for Apache Hive RBAC Configuration with SQL

1. Backup Existing Configuration

Always create a backup before making changes:

cp /opt/hive/conf/hive-site.xml /opt/hive/conf/hive-site.xml.backup
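
A single .backup file is overwritten each time you repeat this step. As a hedged alternative, the sketch below keeps one timestamped copy per run; the function name backup_config and the file-name pattern are assumptions chosen for illustration.

```shell
# Copy a file to <file>.backup.<timestamp>, preserving mode and mtime,
# and print the path of the copy so it can be used for a later restore.
backup_config() {
    file=$1
    stamp=$(date +%Y%m%d-%H%M%S)
    backup="$file.backup.$stamp"
    cp -p "$file" "$backup" && printf '%s\n' "$backup"
}
```

Running backup_config /opt/hive/conf/hive-site.xml prints the name of the new copy, which you can copy back over hive-site.xml if the new configuration fails.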

2. Update hive-site.xml

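Replace the contents of hive-site.xml with a configuration that contains all the necessary settings. The metastore host names and the hive/hive database credentials below come from the example Docker environment; adjust them to match yours: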

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- Metastore Connection -->
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://hive-metastore:9083</value>
    </property>

    <!-- Database Configuration -->
    <property>
        <name>datanucleus.autoCreateSchema</name>
        <value>false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:postgresql://hive-metastore-postgresql/metastore</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>org.postgresql.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>

    <!-- Authentication & Authorization Configuration -->
    <property>
        <name>hive.security.authorization.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.server2.enable.doAs</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.users.in.admin.role</name>
        <value>root</value>
    </property>
    <property>
        <name>hive.server2.authentication</name>
        <value>NONE</value>
    </property>
    <property>
        <name>hive.security.authorization.manager</name>
        <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
    </property>
    <property>
        <name>hive.metastore.pre.event.listeners</name>
        <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
    </property>
    <property>
        <name>hive.security.metastore.authorization.manager</name>
        <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
    </property>
    <property>
        <name>hive.security.authenticator.manager</name>
        <value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>
    </property>
    <property>
        <name>hive.metastore.execute.setugi</name>
        <value>true</value>
    </property>
</configuration>
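
After editing, it can help to confirm the values that are actually in the file. The sketch below is a plain-text lookup, not an XML parser: the name hive_prop is hypothetical, and it assumes the layout used in this guide, with no commented-out duplicates of the property.

```shell
# Print the value of one property from a hive-site.xml file.
# Assumes <name> comes before <value> inside each <property>, as in this guide.
hive_prop() {
    file=$1
    prop=$2
    awk -v want="$prop" '
        /<name>/ {
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        /<value>/ && name == want {
            value = $0
            sub(/.*<value>/, "", value)
            sub(/<\/value>.*/, "", value)
            print value
            exit
        }
    ' "$file"
}
```

For instance, hive_prop /opt/hive/conf/hive-site.xml hive.users.in.admin.role prints the configured admin users, and prints nothing if the property is absent.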

3. Restart HiveServer2

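# Find hiveserver2 location
which hiveserver2
# Stop the running instance; the hiveserver2 launcher has no stop subcommand.
# If HiveServer2 is the container's main process, restart the container instead.
pkill -f HiveServer2
# Wait for complete shutdown
sleep 5
# Start the service in the background (the log path here is only an example)
nohup hiveserver2 > /tmp/hiveserver2.out 2>&1 &
# Wait for startup
sleep 10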
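
A fixed sleep can be too short on a slow host. As a hedged alternative, the sketch below polls any command until it succeeds; wait_until is a hypothetical helper, not a Hive tool.

```shell
# Poll a command once per second until it succeeds
# or the timeout (in seconds) expires.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

One possible use, assuming HiveServer2 listens on localhost:10000 and accepts the user root, is: wait_until 60 beeline -u jdbc:hive2://localhost:10000 -n root -e "SHOW DATABASES;"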

Testing and Verification for Apache Hive RBAC Configuration with SQL

1. Verify Service Status

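# The brackets keep grep from matching itself; -i also matches the HiveServer2 class name
ps aux | grep -i '[h]iveserver2'
# 10000 is the default HiveServer2 port
netstat -tulpn | grep 10000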

2. Test RBAC Configuration

Using DBeaver or another JDBC client:

SHOW ROLES;
SET ROLE admin;
CREATE ROLE test_role;
GRANT ROLE test_role TO USER tester;

Using hive -e/beeline:
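
A minimal sketch of the same test from the command line with beeline. The host, port, and database defaults are assumptions for a local setup, and both function names are hypothetical. Passing an explicit user with -n is intended to avoid connecting with a null principal, though whether that resolves the error in your environment depends on your authentication settings.

```shell
# Build the HiveServer2 JDBC URL; host, port and database
# default to assumed local values.
hive_jdbc_url() {
    printf 'jdbc:hive2://%s:%s/%s\n' "${1:-localhost}" "${2:-10000}" "${3:-default}"
}

# Run the RBAC statements in one session as an explicit user.
# Usage: run_rbac_test USER [HOST [PORT [DATABASE]]]
run_rbac_test() {
    user=$1
    shift
    beeline -u "$(hive_jdbc_url "$@")" -n "$user" \
        -e "SHOW ROLES; SET ROLE admin; CREATE ROLE test_role; GRANT ROLE test_role TO USER tester;"
}
```

With the configuration from this guide, run_rbac_test root would connect as the user listed in hive.users.in.admin.role.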

Apache Hive RBAC Configuration with SQL Troubleshooting

Common Issues and Solutions

1. Principal Name Null Error

If you see:

Required field 'principal_name' is unset!

Solution: Verify the hive.security.authenticator.manager setting is correct and HiveServer2 has been restarted.

2. User Not in Admin Role

If you see:

root doesn't belong to role admin

Solution: Check the hive.users.in.admin.role property and ensure it contains your username.
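
This check can be scripted. The sketch below is a text-based lookup under stated assumptions: user_in_admin_role is a hypothetical helper, the <value> line is expected to follow the <name> line as in this guide, and spaces around commas are ignored for the comparison.

```shell
# Succeed if USER appears in the comma-separated
# hive.users.in.admin.role value of FILE.
user_in_admin_role() {
    file=$1
    user=$2
    admins=$(awk '
        /<name>hive.users.in.admin.role<\/name>/ { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$file")
    # Wrap the list in commas so only whole names match.
    case ",$(printf '%s' "$admins" | tr -d ' ')," in
        *,"$user",*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

For example, user_in_admin_role /opt/hive/conf/hive-site.xml root returns success only if root is one of the listed admin users.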

3. Configuration Not Taking Effect

Solution:

  • Verify file permissions
  • Confirm HiveServer2 restart
  • Check logs for startup errors
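
For the log check, a small filter can surface the most recent problems. This is a sketch under assumptions: recent_hive_errors is a hypothetical helper, and the default path /tmp/<user>/hive.log reflects Hive's stock logging configuration, which your installation or Docker image may override.

```shell
# Print the last N lines (default 20) of a Hive log that
# mention an error, prefixed with their line numbers.
recent_hive_errors() {
    log=${1:-/tmp/$(id -un)/hive.log}
    count=${2:-20}
    if [ ! -f "$log" ]; then
        echo "log not found: $log" >&2
        return 2
    fi
    grep -n -E 'ERROR|FATAL|Exception' "$log" | tail -n "$count"
}
```

Run it with no arguments to inspect the default log for the current user, or pass the log path and a line count explicitly.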

Advanced Configuration for Apache Hive RBAC

Custom Authentication Providers

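For environments requiring custom authentication, point the property at your own authenticator class. The class name below is a placeholder; the class must be available on the HiveServer2 classpath: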

<property>
    <name>hive.security.authenticator.manager</name>
    <value>com.your.custom.AuthenticatorManager</value>
</property>

Multiple Admin Users

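To configure multiple admin users, list them separated by commas. Each user still needs to run SET ROLE admin in their session before performing admin actions: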

<property>
    <name>hive.users.in.admin.role</name>
    <value>root,admin1,admin2</value>
</property>

Further Considerations

  1. Security Best Practices

    • Regularly rotate passwords
    • Implement proper audit logging
    • Use SSL/TLS for connections
  2. Performance Impact

    • Monitor query performance after enabling RBAC
    • Adjust memory settings if needed
  3. Maintenance

    • Regular backup of configuration files
    • Document all custom settings
    • Maintain user-role mapping documentation

DataSunrise Integration for Apache Hive:
Advanced Solution for Simplified RBAC, Security & Compliance

While the native Hive RBAC configuration provides basic access control capabilities, enterprise environments often require more robust security, compliance, and auditing features. DataSunrise offers comprehensive integration with Apache Hive that extends these capabilities:

Key Features

  • Enhanced RBAC Management
  • Dynamic Data Protection
  • Compliance and Audit
  • Security Features
  • Advanced Capabilities

DataSunrise builds on Hive's native RBAC capabilities to provide a feature-rich solution for organizations that require enterprise-grade security and compliance. To learn more, explore the supported Apache Hive features or schedule a demo to see DataSunrise in action.

This guide is based on real-world experience with Apache Hive 2.3.2 in a Docker environment. Your specific environment might require different adjustments to these configurations.
