Known Issues
| Hortonworks Bug ID | Apache JIRA | Apache Component | Summary | |||
|---|---|---|---|---|---|---|
| TSB-465 | N/A | HBase | Corruption of HBase data stored with MOB feature. For more information on this issue, see the corresponding Knowledge article: TSB 2021-465: Corruption of HBase data stored with MOB feature on upgrade from CDH 5 and HDP 2 | |||
| BUG-38148 | ACCUMULO-4389 | Accumulo | Description of Problem: Apache Accumulo has a feature called "Replication" which automatically propagates updates made to one table to a list of other Accumulo cluster instances. This feature is used for disaster-recovery scenarios, allowing data-center-level failover. A number of client API methods support developer interaction with the feature. The ReplicationOperations#drain(String, Set) method is intended to serve as a blocking call that waits until all of the provided write-ahead log files that need replication have been replicated to other peers. Sometimes, the method reportedly does not wait long enough. Associated error message: No direct error message is generated; the primary symptom is that the configured Accumulo replication peers do not have all of the expected data from the source Accumulo cluster. Workaround: None at this time. Upstream fix: https://issues.apache.org/jira/browse/ACCUMULO-4389 has been opened to track this issue. | |||
| BUG-55799 | HIVE-12930 | Hive | Description of Problem: SSL shuffle for LLAP is not supported. Workaround: Currently, there is no workaround. | |||
| BUG-57862 | N/A | Hive, Hive2 | Description of Problem: When Ranger authorization is enabled for Hive, users will be denied permission to create temporary UDFs. Workaround: To allow users to create temporary UDFs, create a Ranger policy in the following way: Resource: Database=*, udf=*; Permissions: Create; Users/Groups: <as needed> | |||
| BUG-59714 | HIVE-13974 | Hive | Description of Problem: ORC Schema Evolution does not support adding columns to a STRUCT type column unless the STRUCT column is the last column. For example, you can add column C to the last column last_struct: CREATE TABLE orc_last_struct ( str STRING, last_struct STRUCT<A:STRING,B:STRING> ) STORED AS ORC; ALTER TABLE orc_last_struct REPLACE columns (str STRING, last_struct STRUCT<A:STRING,B:STRING,C:BIGINT>); You will be able to read this table. However, when the STRUCT column is not the last column: CREATE TABLE orc_inner_struct ( str STRING, inner_struct STRUCT<A:STRING,B:STRING>, last DATE ) STORED AS ORC; ALTER TABLE orc_inner_struct REPLACE columns (str STRING, inner_struct STRUCT<A:STRING,B:STRING,C:BIGINT>, last DATE); You will not be able to read the table; reads fail with execution errors. Workaround: Do not use Schema Evolution on inner STRUCT type columns. | |||
| BUG-60301 | TEZ-3502 | Tez | Description of Problem: The search/filter functionality in the Tez View does not work correctly when looking for DAGs submitted by users with user IDs that only contain numbers. Workaround: Currently, there is no known workaround. | |||
| BUG-60690 | KNOX-718 | Knox | Description of Problem: Unable to log in using Knox SSO even when providing correct credentials. This is because the whitelist is not correctly configured. The login page will not provide an error message to indicate a reason for the failed login. Associated error message: Found in Workaround: In <param> <name>knoxsso.redirect.whitelist.regex</name> <value>.*;^/.*$;https?://localhost*$;^http.*$</value> </param> | |||
| BUG-63132 | N/A | Storm | Summary: Solr bolt does not run in a Kerberos environment. Associated error message: The following is an example: [ERROR] Request to collection hadoop_logs failed due to (401) org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http:[...] Error 401 Authentication required Workaround: None at this time. | |||
| BUG-63165 | PHOENIX-3126 | Zeppelin | Description of Problem: When Kerberos is enabled in the cluster, Kerberos-based user authentication in the Zeppelin UI is not correctly passed to Phoenix/HBase. Because the user credentials are unavailable to Phoenix, the standard HBase authentication/authorization schemes, which are working as intended, reject the requests. Associated error message: Unexpected failed authentication and authorization messages from Zeppelin when talking to Phoenix/HBase. Workaround: There is no known workaround at this time. This issue will be addressed in a future maintenance release. | |||
| BUG-63885 | HIVE-14446 | Hive, Hive2 | Component Affected: ACID Description of Problem: Small tables estimated to have about 300 million rows that broadcast to a Mapjoin will cause the BloomFilter to overflow. Typically, this is due to bad stats estimation. Workaround: It is possible to avoid this issue with the following: set hive.mapjoin.hybridgrace.hashtable=false However, if this is caused by bad stats estimation and Hybrid grace hash join does not work, the regular mapjoin also will not work. | |||
| BUG-64098 | N/A | Spark | Description of Problem: When installing Spark manually on Debian/Ubuntu, the apt-get install spark command does not install all Spark packages. Workaround: Use the  | |||
| BUG-65028 | N/A | Zeppelin | Description of Problem: On secure clusters that run Zeppelin, configure settings to limit interpreter editing privileges to admin roles. Workaround: Add the following lines to the [urls] section of the Zeppelin  /api/interpreter/** = authc, roles[admin] /api/configurations/** = authc, roles[admin] /api/credential/** = authc, roles[admin] | |||
| BUG-65058 | N/A | Ambari, Hive | Description of Problem: LLAP containers may be killed because insufficient memory is available in the system. Associated Error Message: The following messages appear in the AM log of the LLAP YARN application: # There is insufficient memory for the Java Runtime Environment to continue. # Native memory allocation (mmap) failed to map 194347270144 bytes for committing reserved memory. # An error report file with more information is saved as: Workaround: Reduce the YARN NodeManager available memory. This is defined as the Memory allocated for all YARN containers on a node under the YARN Configuration tab. Description of Problem: LLAP daemons can be killed by the YARN Memory Monitor. Associated Error Message: The following message appears in the AM log of the LLAP YARN application: is running beyond physical memory limits. Current usage: <USED> of <ALLOCATED> GB physical memory used Workaround: Lower the LLAP heap size under the Advanced hive-interactive-env section of the Advanced Hive config. | |||
| BUG-65884 | HBASE-16270 | HBase | Description of Problem: HBase clusters running with the "region replica" feature might run into a problem where region flushes are blocked with exceptions similar to org.apache.hadoop.hbase.regionserver.UnexpectedStateException: Current snapshot id is -1,passed 1469085004304 Workaround: There is no workaround; update the cluster to an HDP version that includes the patch for HBASE-16270. | |||
| BUG-66078 | HIVE-15181 | Hive | Description of Problem: When more than 1000 transactions require a time out, the process for handling the time out may get stuck in an infinite loop. Workaround: Configure Hive in the following way: set hive.direct.sql.max.query.length=1; set hive.direct.sql.max.elements.in.clause=1000; | |||
| BUG-66325, BUG-66326 | N/A | Zeppelin | Description of Problem: Zeppelin (with or without Livy) cannot access data on encrypted (TDE) clusters when the default user settings are in effect. Workaround: | |||
| BUG-66651 | HDFS-4176 | HDFS | Description of Problem: The standby NameNode can potentially fail to become active when the active NameNode process is frozen (but not actually crashed). Workaround: Currently, there is no known workaround. | |||
| BUG-68049 | HDFS-10797 | HDFS | Description of Problem: Disk usage summary incorrectly counts files twice if they have been renamed since being snapshotted. Workaround: Currently, there is no known workaround. | |||
| BUG-68077 | HDFS-10301 | HDFS | Description of Problem: The NameNode can incorrectly conclude that some DataNode storage directories are missing and remove them. This can lead to missing blocks. When this problem is hit you may see a number of "removing zombie storage" messages in the NameNode log files. Workaround: Currently, there is no known workaround. | |||
| BUG-69158 | N/A | Zeppelin, Spark | Description of Problem: By default, the Livy server times out after being idle for 60 minutes. Associated error message: Subsequent attempts to access Livy generate an error, Exception: Session not found, Livy server would have restarted, or lost session. Workaround: Set the timeout to a larger value through the property livy.server.session.timeout, and restart the Zeppelin Livy interpreter. | |||
| BUG-77311 | N/A | Zeppelin | Description of Problem: When one user restarts the %livy interpreter from the Interpreters (admin) page, other users' sessions restart too. Workaround: Restart the %livy interpreter from within a notebook. | |||
| BUG-78237 | ATLAS-1741 | Atlas | Description of Problem: Apache Atlas uses reflection to introspect the StormTopology fields and gather metadata. This introspection leads to a RuntimeException or NPE when a Storm topology is submitted, occurring when the reflection APIs recurse into the Jackson library. Workaround: Currently, there is no known workaround. | |||
| BUG-79291 | N/A | Ambari, Falcon | Description of Problem: Falcon is not starting after HDP upgrade (2.5.0|2.5.3 to 2.5.5) using Ambari. Workaround: See the workaround here for more information. | |||
| BUG-79480 | HIVE-16385 | Hive, Hive2 | Description of Problem: For a partitioned table, the class  Workaround: Currently, there is no known workaround. | |||
| BUG-80644 | PHOENIX-3708 | Phoenix | Description of Problem: When using the Phoenix-Hive storage handler to issue Phoenix queries from Hive, the user may experience queries that fail because of extra quotation marks being placed around column names in the SQL statement. These extra quotation marks cause the Phoenix query to fail. Workaround: Currently, there is no known workaround. | |||
| BUG-80901 | N/A | Zeppelin | Component Affected: Zeppelin/Livy Description of Problem: This occurs when running applications through Zeppelin/Livy that require some 3rd-party libraries. These libraries cannot be installed on all nodes in the cluster, but they are installed on the edge nodes. In yarn-client mode this works, because the job is submitted and runs on the edge node where the libraries are installed. In yarn-cluster mode, it fails because the libraries are missing. Workaround: Set either spark.jars in  | |||
| BUG-82159 | HBASE-18035 | HBase | Description of Problem: When a client is configured to use meta replicas, it sends scan requests to all meta replicas almost simultaneously. Since a meta replica can contain stale data, this may cause a mix-up in the region locations returned to the client. Workaround: To fix this, a client always sends the request to the primary meta region first and waits a configured amount of time for a reply. If it does not receive a result, it sends requests to the replica meta regions. | |||
| BUG-82261 | YARN-6339 | YARN | Description of Problem: Calling  Workaround: Set the  | |||
| BUG-82902 | HIVE-16399 | Hive | Description of Problem: When running a transactional workload (especially the streaming ingest API) with an Oracle-backed Hive metastore, it is possible to see deadlock exceptions from the database. Associated error message: In the metastore logs you will see this error message from Oracle: java.sql.SQLException: ORA-00060: deadlock detected while waiting for resource Workaround: Create an index on the TC_TXNID column of the TXN_COMPONENTS table: CREATE INDEX TC_TXNID_INDEX ON TXN_COMPONENTS (TC_TXNID); | |||
| BUG-82963 | N/A | Ambari, Oozie | Description of Problem: If a cluster is installed without Falcon, the service check will fail for Oozie during the upgrade. After upgrading to HDP 2.5.6, Oozie does not start. Workaround: After upgrading, delete the Falcon EL extension configurations from  | |||
| BUG-82970 | AMBARI-21297 | Ambari | Description of Problem: NFSGateway fails to start on CentOS 7 during a rolling upgrade from HDP 2.5.3 to HDP 2.5.6. Workaround: Updating the libtirpc package fixes this issue. On each node in the cluster, run the following command: yum upgrade libtirpc | 
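
The primary-first behavior described for BUG-82159 (HBASE-18035) can be sketched as follows. This is a minimal illustration under stated assumptions, not the HBase client implementation: `scan_meta`, `primary`, `replicas`, and `primary_timeout_s` are hypothetical names, and the callables stand in for scan requests to the meta regions.

```python
from concurrent.futures import (
    FIRST_COMPLETED,
    ThreadPoolExecutor,
    TimeoutError as FutureTimeout,
    wait,
)


def scan_meta(primary, replicas, primary_timeout_s):
    """Ask the primary first; fall back to replicas only after a timeout.

    `primary` and each entry of `replicas` are zero-argument callables
    (hypothetical stand-ins for scan requests to meta regions).
    """
    pool = ThreadPoolExecutor(max_workers=1 + len(replicas))
    try:
        primary_future = pool.submit(primary)
        try:
            # Give the primary a head start so fresh data wins when available.
            return primary_future.result(timeout=primary_timeout_s)
        except FutureTimeout:
            pass
        # Primary was too slow: also query the replicas and take the
        # first request (primary or replica) that completes.
        futures = [primary_future] + [pool.submit(r) for r in replicas]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block the caller on requests that are still in flight.
        pool.shutdown(wait=False)
```

Because replicas are contacted only after the primary has had its configured time to answer, a possibly stale replica result is returned only when the primary is slow or unresponsive.
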
| Technical Service Bulletin | Apache JIRA | Apache Component | Summary | 
|---|---|---|---|
| TSB-405 | N/A | N/A | Impact of LDAP Channel Binding and LDAP signing changes in Microsoft Active Directory. Microsoft has introduced changes in LDAP Signing and LDAP Channel Binding to increase the security of communications between LDAP clients and Active Directory domain controllers. These optional changes affect how 3rd-party products integrate with Active Directory using the LDAP protocol. Workaround: Disable the LDAP Signing and LDAP Channel Binding features in Microsoft Active Directory if they are enabled. For more information on this issue, see the corresponding Knowledge article: TSB 2021-405: Impact of LDAP Channel Binding and LDAP signing changes in Microsoft Active Directory | 
| TSB-406 | N/A | HDFS | CVE-2020-9492 Hadoop filesystem bindings (ie: webhdfs) allows credential stealing. WebHDFS clients might send a SPNEGO authorization header to a remote URL without proper verification. A maliciously crafted request can trigger services to send server credentials to a webhdfs path (ie: webhdfs://…) to capture the service principal. For more information on this issue, see the corresponding Knowledge article: TSB 2021-406: CVE-2020-9492 Hadoop filesystem bindings (ie: webhdfs) allows credential stealing | 
| TSB-434 | HADOOP-17208, HADOOP-17304 | Hadoop | KMS Load Balancing Provider Fails to invalidate Cache on Key Delete. For more information on this issue, see the corresponding Knowledge article: TSB 2020-434: KMS Load Balancing Provider Fails to invalidate Cache on Key Delete | 
| TSB-465 | N/A | HBase | Corruption of HBase data stored with MOB feature. For more information on this issue, see the corresponding Knowledge article: TSB 2021-465: Corruption of HBase data stored with MOB feature on upgrade from CDH 5 and HDP 2 | 
| TSB-497 | N/A | Solr | CVE-2021-27905: Apache Solr SSRF vulnerability with the Replication handler. The Apache Solr ReplicationHandler (normally registered at "/replication" under a Solr core) has a "masterUrl" (also "leaderUrl" alias) parameter. The "masterUrl" parameter is used to designate another ReplicationHandler on another Solr core to replicate index data into the local core. To help prevent the CVE-2021-27905 SSRF vulnerability, Solr should check these parameters against a configuration similar to the one used for the "shards" parameter. For more information on this issue, see the corresponding Knowledge article: TSB 2021-497: CVE-2021-27905: Apache Solr SSRF vulnerability with the Replication handler | 
| TSB-512 | N/A | HBase | HBase MOB data loss. HBase tables with the MOB feature enabled may encounter problems which result in data loss. For more information on this issue, see the corresponding Knowledge article: TSB 2021-512: HBase MOB data loss | 
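
The kind of check described for TSB-497, validating a replication URL against a configured list of permitted hosts, can be sketched generically as below. This is an illustration only and does not reproduce Solr's actual implementation or configuration format: `ALLOWED_HOSTS`, `is_allowed_replication_url`, and the example host names are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of host:port pairs that replication may target.
ALLOWED_HOSTS = {"solr1.example.com:8983", "solr2.example.com:8983"}


def is_allowed_replication_url(url, allowed=ALLOWED_HOSTS):
    """Return True only for http(s) URLs whose host:port is allow-listed."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        return False
    try:
        # Fall back to the scheme's default port when none is given.
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError:
        # urlsplit raises ValueError for a malformed or out-of-range port.
        return False
    return f"{parts.hostname}:{port}" in allowed
```

Rejecting every URL that is not explicitly listed means a crafted "masterUrl" value pointing at an internal service is refused before any request is made.
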

