Hive
Recommendation: Store Hive data in an HDFS path called /apps/hive.
Configuring Hive Tables for HDFS Encryption
Before enabling encryption zones, decide whether to store your Hive tables in a single encryption zone or across multiple encryption zones.
Single Encryption Zone
To configure a single encryption zone for your entire Hive warehouse:
- Rename /apps/hive to /apps/hive-old.
- Create an encryption zone at /apps/hive.
- distcp all of the data from /apps/hive-old to /apps/hive.
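The three steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested procedure: the key name hive_key is hypothetical, a Hadoop KMS is assumed to be configured, and the commands are assumed to run as an HDFS superuser. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: move the Hive warehouse into a single encryption zone.
# Assumptions (not from the original text): key name "hive_key",
# a configured Hadoop KMS, and HDFS superuser rights.

# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

KEY_NAME="hive_key"  # hypothetical key name

setup_single_zone() {
  # 1. Rename the existing warehouse directory.
  run hdfs dfs -mv /apps/hive /apps/hive-old
  # 2. Create a key, then an encryption zone on a new, empty /apps/hive.
  run hadoop key create "$KEY_NAME"
  run hdfs dfs -mkdir -p /apps/hive
  run hdfs crypto -createZone -keyName "$KEY_NAME" -path /apps/hive
  # 3. Copy the old data in; files are encrypted as they are written.
  #    -skipcrccheck is used because checksums of plain and encrypted
  #    copies of the same file do not match.
  run hadoop distcp -skipcrccheck -update /apps/hive-old /apps/hive
}

setup_single_zone
```

Review the printed commands first, then rerun with DRY_RUN=0 on a cluster node. Ownership and permissions on the new /apps/hive are not handled here and would need to match the old directory.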
To configure the Hive scratch directory (hive.exec.scratchdir) so that it resides inside the encryption zone:
- Set the directory to /apps/hive/tmp.
- Make sure that the permissions for /apps/hive/tmp are set to 1777.
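As a rough illustration of the scratch directory steps, the commands below create the directory and set its permissions. It is a sketch that assumes /apps/hive is already an encryption zone and that the hive.exec.scratchdir property itself is changed separately in the Hive configuration. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: prepare the Hive scratch directory inside the encryption zone.
# Assumption: /apps/hive is already an encryption zone, and
# hive.exec.scratchdir is set to /apps/hive/tmp in the Hive configuration.

# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

SCRATCH_DIR="/apps/hive/tmp"

setup_scratch_dir() {
  run hdfs dfs -mkdir -p "$SCRATCH_DIR"
  # 1777: world-writable with the sticky bit, so users can only
  # remove their own scratch files.
  run hdfs dfs -chmod 1777 "$SCRATCH_DIR"
  # List zones to confirm that /apps/hive is among them.
  run hdfs crypto -listZones
}

setup_scratch_dir
```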
Multiple Encryption Zones
To access encrypted databases and tables with different encryption keys, configure multiple encryption zones.
For example, to configure two encrypted tables, ez1.db and ez2.db, in two different encryption zones:
- Create two new encryption zones, /apps/hive/warehouse/ez1.db and /apps/hive/warehouse/ez2.db.
- Load data into Hive tables ez1.db and ez2.db as usual, using LOAD statements. (For additional considerations, see "Loading Data into an Encrypted Table.")
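Creating the two zones might look like the sketch below. The key names ez1_key and ez2_key are hypothetical, and the sketch assumes that /apps/hive is not itself an encryption zone, since HDFS does not support nested encryption zones. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: one encryption zone (and one key) per Hive database directory.
# Assumptions: hypothetical key names, a configured Hadoop KMS, HDFS
# superuser rights, and no existing zone on a parent directory.

# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

WAREHOUSE="/apps/hive/warehouse"

# create_zone <directory name> <key name>
create_zone() {
  zone_dir="$WAREHOUSE/$1"
  zone_key="$2"
  run hadoop key create "$zone_key"
  # The zone root must exist and be empty when the zone is created.
  run hdfs dfs -mkdir -p "$zone_dir"
  run hdfs crypto -createZone -keyName "$zone_key" -path "$zone_dir"
}

create_zone ez1.db ez1_key
create_zone ez2.db ez2_key
```

Using a separate key per zone is what allows access to each database to be controlled independently through key permissions.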
Loading Data into an Encrypted Table
By design, HDFS-encrypted files cannot be moved or loaded from one encryption zone into another encryption zone, or from an encryption zone into an unencrypted directory. Encrypted files can only be copied.
Within an encryption zone, files can be copied, moved, loaded, and renamed.
Recommendations:
- When loading unencrypted data into encrypted tables (e.g., LOAD DATA INPATH), we recommend placing the source data (to be encrypted) into a landing zone within the destination encryption zone.
- An attempt to load data from one encryption zone into another will result in a copy operation. DistCp will be used to speed up the process if the size of the files being copied is larger than the value specified by the hive.exec.copyfile.maxsize property. The default limit is 32 MB.
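The landing-zone recommendation can be sketched as follows. Every name here is hypothetical: the landing directory, the local file events.csv, the target table ez1.events, and the HiveServer2 JDBC URL. Because the landing directory sits inside the destination encryption zone, the subsequent LOAD stays within one zone. Commands are printed rather than executed unless DRY_RUN=0 is set, and the dry-run output does not show the shell quoting around the statement.

```shell
#!/bin/sh
# Sketch: stage unencrypted data in a landing directory inside the
# destination encryption zone, then load it into a Hive table.
# All paths, names, and the JDBC URL below are hypothetical.

# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

JDBC_URL="jdbc:hive2://localhost:10000"
LANDING_DIR="/apps/hive/warehouse/ez1.db/landing"
LOCAL_FILE="events.csv"
TARGET_TABLE="ez1.events"

load_via_landing_zone() {
  run hdfs dfs -mkdir -p "$LANDING_DIR"
  # The file is encrypted as it is written into the zone.
  run hdfs dfs -put "$LOCAL_FILE" "$LANDING_DIR/"
  # Source and table are now in the same zone.
  run beeline -u "$JDBC_URL" -e \
    "LOAD DATA INPATH '$LANDING_DIR/$LOCAL_FILE' INTO TABLE $TARGET_TABLE"
}

load_via_landing_zone
```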
Here are two approaches for loading unencrypted data into an encrypted table:
- To load unencrypted data into an encrypted table, use the LOAD DATA ... statement. If the source data does not reside inside the encryption zone, the LOAD statement will result in a copy. If your data is already inside HDFS, though, you can use distcp to speed up the copying process.
- If the data is already inside a Hive table, create a new table with a LOCATION inside an encryption zone, as follows:
  CREATE TABLE encrypted_table [STORED AS] LOCATION ... AS SELECT * FROM <unencrypted_table>
  Note: The location specified in the CREATE TABLE statement must be within an encryption zone. If you create a table that points LOCATION to an unencrypted directory, your data will not be encrypted. You must copy your data to an encryption zone, and then point LOCATION to that encryption zone.
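A filled-in version of the CREATE TABLE ... AS SELECT approach might look like the sketch below. The storage format (ORC), the LOCATION path, the source table name, and the JDBC URL are all assumptions chosen for illustration. The statement is built by a small helper so it can be inspected before it is sent to HiveServer2. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: copy an existing Hive table into a new table whose LOCATION
# is inside an encryption zone. Format, paths, table names, and the
# JDBC URL are hypothetical.

# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

JDBC_URL="jdbc:hive2://localhost:10000"
SRC_TABLE="unencrypted_table"
# Must be a path inside an encryption zone, or the data is not encrypted.
DEST_LOCATION="/apps/hive/warehouse/ez1.db/encrypted_table"

# Build the CTAS statement as a single string.
ctas_sql() {
  printf "CREATE TABLE encrypted_table STORED AS ORC LOCATION '%s' AS SELECT * FROM %s" \
    "$DEST_LOCATION" "$SRC_TABLE"
}

run beeline -u "$JDBC_URL" -e "$(ctas_sql)"
```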
If your source data is already encrypted, use the CREATE TABLE statement. Point LOCATION to the encrypted source directory where your data resides:
CREATE TABLE encrypted_table [STORED AS] LOCATION ... AS SELECT * FROM <encrypted_source_directory>
This is the fastest way to create encrypted tables.
Encrypting Other Hive Directories
- LOCALSCRATCHDIR: The MapJoin optimization in Hive writes HDFS tables to a local directory and then uploads them to the distributed cache. To enable encryption, either disable MapJoin (set hive.auto.convert.join to false) or encrypt the local Hive scratch directory (hive.exec.local.scratchdir). Performance note: disabling MapJoin will result in slower join performance.
- DOWNLOADED_RESOURCES_DIR: Jars that are added to a user session and stored in HDFS are downloaded to hive.downloaded.resources.dir. If you want these Jar files to be encrypted, configure hive.downloaded.resources.dir to be part of an encryption zone. This directory needs to be accessible to HiveServer2.
- NodeManager Local Directory List: Hive stores Jars and MapJoin files in the distributed cache, so if you'd like to use MapJoin or encrypt Jars and other resource files, the YARN configuration property NodeManager Local Directory List (yarn.nodemanager.local-dirs) must be configured to a set of encrypted local directories on all nodes. Alternatively, to disable MapJoin, set hive.auto.convert.join to false.
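For a single session, disabling MapJoin could be done from the command line as in this sketch. The JDBC URL and the query file name are hypothetical, and the sketch assumes HiveServer2 allows this property to be set by clients. For a cluster-wide change, the property would instead be set in the Hive configuration. The command is printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: run a query file with MapJoin conversion disabled for
# this session only. JDBC URL and file name are hypothetical.

# Print commands by default; execute them only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

JDBC_URL="jdbc:hive2://localhost:10000"
QUERY_FILE="report.hql"

run_without_mapjoin() {
  run beeline -u "$JDBC_URL" \
    --hiveconf hive.auto.convert.join=false \
    -f "$QUERY_FILE"
}

run_without_mapjoin
```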
Additional Changes in Behavior with HDFS-Encrypted Tables
- Users reading data from read-only encrypted tables must have access to a temp directory that is encrypted with encryption at least as strong as that of the table.
- By default, temp data related to HDFS encryption is written to a staging directory identified by the hive.exec.stagingdir property, which is set in the hive-site.xml file associated with the table folder.
- Previously, an INSERT OVERWRITE on a partitioned table inherited permissions for new data from the existing partition directory. With encryption enabled, permissions are inherited from the table.
- When using encryption with Trash enabled, table deletion operates differently than the default trash mechanism. For more information, see Delete Files from an Encryption Zone.

