Configure the Spark Thrift server
Use the following guidelines to configure the Apache Spark Thrift server on a Kerberos-enabled cluster.
If you are installing the Spark Thrift server on a Kerberos-enabled cluster, note the following requirements:

- The Spark Thrift server must run on the same host as HiveServer2, so that it can access the hiveserver2 keytab.
- Permissions on /var/run/spark and /var/log/spark must grant read/write access to the Hive service account (see the sketch after this list).
- You must use the Hive service account to start the thriftserver process.
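For example, the following is a minimal sketch of preparing those directories, assuming the Hive service account is named hive and belongs to the hadoop group (both names are assumptions; substitute the values used in your cluster):

```bash
# Create the runtime and log directories if they do not already exist.
mkdir -p /var/run/spark /var/log/spark

# Grant the Hive service account (assumed: user hive, group hadoop) read/write access.
chown -R hive:hadoop /var/run/spark /var/log/spark
chmod -R u+rwX,g+rX /var/run/spark /var/log/spark
```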
If you access Hive warehouse files through HiveServer2 on a deployment with fine-grained
access control, run the Spark Thrift server as user hive. This ensures that
the Spark Thrift server can access Hive keytabs, the Hive metastore, and HDFS data stored
under user hive.
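As a sketch, starting the Thrift server under the Hive service account might look like the following. The installation path, keytab location, Kerberos realm, and port are assumptions and differ by distribution; the properties shown are standard HiveServer2 Kerberos settings.

```bash
# Assumed Spark installation path; adjust for your distribution.
export SPARK_HOME=/usr/lib/spark

# Start the Thrift server as user hive so it can read the hiveserver2 keytab
# and reach the Hive metastore. Principal, keytab path, and port are examples.
su - hive -c "$SPARK_HOME/sbin/start-thriftserver.sh \
  --master yarn \
  --hiveconf hive.server2.authentication=KERBEROS \
  --hiveconf hive.server2.authentication.kerberos.principal=hive/_HOST@EXAMPLE.COM \
  --hiveconf hive.server2.authentication.kerberos.keytab=/etc/security/keytabs/hive.service.keytab \
  --hiveconf hive.server2.thrift.port=10015"
```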
Important: If you read files from HDFS directly through an interface such as the Spark CLI (as opposed to HiveServer2 with fine-grained access control), you should use a different service account for the Spark Thrift server. Configure the account so that it can access Hive keytabs and the Hive metastore. Use of an alternate account provides a more secure configuration: when the Spark Thrift server runs queries as user hive, all data accessible to user hive is accessible to the user who submits the query.
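If you set up such an alternate account, a minimal sketch might look like the following; the account name (spark), group (hadoop), keytab path, and Spark installation path are all assumptions:

```bash
# Let the alternate service account read the Hive keytab through group permissions
# (account, group, and paths are examples; follow your own security policy).
usermod -a -G hadoop spark
chgrp hadoop /etc/security/keytabs/hive.service.keytab
chmod 440 /etc/security/keytabs/hive.service.keytab

# Start the Thrift server under the alternate account instead of user hive.
su - spark -c "/usr/lib/spark/sbin/start-thriftserver.sh --master yarn"
```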
For Spark jobs that are not submitted through the Thrift server, the user submitting the job must have access to the Hive metastore in secure mode; run the kinit command to obtain valid Kerberos credentials before submitting the job.
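For instance, a minimal sketch, assuming a hypothetical principal alice@EXAMPLE.COM with a keytab at /etc/security/keytabs/alice.keytab and a hypothetical application jar:

```bash
# Obtain a Kerberos ticket for the submitting user (principal and keytab are examples).
kinit -kt /etc/security/keytabs/alice.keytab alice@EXAMPLE.COM

# Verify the ticket, then submit the Spark job as usual.
klist
spark-submit --master yarn --class com.example.MyJob my-job.jar
```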


