To perform a manual rolling upgrade, your cluster must meet the following prerequisites:
| Item | Description |
| --- | --- |
| Cluster Stack Version | Must be running the HDP 2.2.0 Stack. |
| Cluster Target Version | All HDP nodes must have HDP 2.2.4 installed alongside 2.2.0. (See "Preparing the Cluster" for installation instructions.) |
| Services | All 2.2.0 services must be started and running. All previous upgrade operations must be finalized. |
| HDFS | NameNode HA must be enabled and running with an active NameNode and a standby NameNode. No components should be in decommissioning or decommissioned state. |
| Hadoop, Hive, Oozie | Enable client retry for Hadoop, Hive, and Oozie. |
| YARN | Enable Work Preserving Restart (WPR). (Optional) Enable YARN ResourceManager High Availability. |
| Shared client libraries | Shared client libraries must be loaded into HDFS. |
| Hive, Tez | Confirm configuration settings for rolling upgrade. |
The following paragraphs describe component prerequisites in more detail:
- Enable HDFS NameNode High Availability. See NameNode High Availability for Hadoop (in the Hadoop High Availability Guide) for more information. 
- Enable client retry properties for HDFS, Hive, and Oozie. These properties are not included by default, so you might need to add them to the site files.
  - For HDFS, set `dfs.client.retry.policy.enabled` to `true` in `hdfs-site.xml` on all nodes with HDFS services.
  - For Hive, specify `hive.metastore.failure.retries` and `hive.metastore.client.connect.retry.delay` in `hive-site.xml` (for example, `/usr/hdp/2.2.0.0-2041/hive/conf/hive-site.xml`). The default value for retries is 24; the default for retry delay is 5s.
  - For Oozie, `export OOZIE_CLIENT_OPTS="${OOZIE_CLIENT_OPTS} -Doozie.connection.retry.count=<number of retries>"` in `oozie-env.sh` (for example, `/usr/hdp/2.2.0.0-2041/oozie/conf/oozie-env.sh`). A typical value for the number of retries is 5.
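
As a rough illustration of the HDFS and Hive retry settings described above, the entries might look like the following sketch. The values shown for Hive are simply the defaults quoted above; the comments naming each file are only labels, and each `<property>` element belongs inside the `<configuration>` root of the corresponding site file.

```xml
<!-- hdfs-site.xml: enable the HDFS client retry policy -->
<property>
  <name>dfs.client.retry.policy.enabled</name>
  <value>true</value>
</property>

<!-- hive-site.xml: metastore client retry settings (default values shown) -->
<property>
  <name>hive.metastore.failure.retries</name>
  <value>24</value>
</property>
<property>
  <name>hive.metastore.client.connect.retry.delay</name>
  <value>5s</value>
</property>
```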
 
- Enable work-preserving ResourceManager/NodeManager restart in the `yarn-site.xml` file for each node. For more information, see Work-Preserving Restart in the YARN Resource Management Guide. Additional notes:
  - If `yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms` is set, the ResourceManager will wait for the specified number of milliseconds after each restart before accepting new jobs.
  - After editing ResourceManager settings in the `yarn-site.xml` file, restart the ResourceManager and all NodeManagers. (Changes will not take effect until you restart the processes.)
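
A minimal sketch of what the work-preserving restart entries in `yarn-site.xml` could look like is shown below. Only `yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms` is named in this section; the other three property names are taken from the Apache Hadoop YARN restart documentation and should be treated as assumptions to confirm against the Work-Preserving Restart guide for your release. The wait value is illustrative, and related settings (such as the ResourceManager state store and the NodeManager recovery directory) are deliberately omitted.

```xml
<!-- yarn-site.xml: sketch only; confirm property names against the
     Work-Preserving Restart guide for your release -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.recovery.enabled</name>
  <value>true</value>
</property>
<!-- Optional: time in milliseconds the ResourceManager waits after a
     restart before accepting new jobs (illustrative value) -->
<property>
  <name>yarn.resourcemanager.work-preserving-recovery.scheduling-wait-ms</name>
  <value>10000</value>
</property>
```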
 
- (Optional) Enable YARN ResourceManager High Availability. Enabling RM HA reduces the amount of service degradation while YARN is upgraded. If RM HA is not enabled, then when the ResourceManager restarts, your active jobs will pause and new job requests will wait to be scheduled. For more information, see Resource Manager High Availability for Hadoop.
- To prevent disruption to MapReduce, Tez, and Oozie jobs, your existing jobs must reference the client libraries of the version they started with. Make sure the shared Hadoop client libraries are available from the distributed cache. This was probably set up during the HDP 2.2.0 installation process. For more information, see Running Multiple MapReduce Versions Using the YARN Distributed Cache in the YARN Resource Management Guide.
- (Optional) When upgrading to HDP version 2.2.4 or later, remove the following two properties from the `hive-site.xml` configuration file (or set them to `false`):
  - `fs.file.impl.disable.cache`
  - `fs.hdfs.impl.disable.cache`
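
If you choose to keep the two properties rather than remove them, the entries in `hive-site.xml` would look roughly like this sketch, with both values set to `false`:

```xml
<!-- hive-site.xml: alternative to deleting the two properties -->
<property>
  <name>fs.file.impl.disable.cache</name>
  <value>false</value>
</property>
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>false</value>
</property>
```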
 
- Make sure HiveServer2 is configured for rolling upgrade. Set or confirm the following server-side properties:
  - Set `hive.server2.support.dynamic.service.discovery` to `true`.
  - Set `hive.zookeeper.quorum` to a comma-separated list of ZooKeeper host:port pairs in the ZooKeeper ensemble (e.g. `host1:port1, host2:port2, host3:port3`). By default this value is blank.
  - Add the `hive.zookeeper.session.timeout` property to the `hive-site.xml` file (if necessary), and specify the length of time that ZooKeeper will wait to hear from HiveServer2 before closing the client connection. The default value is 60 seconds.
  - Set `hive.server2.zookeeper.namespace` to the value for the root namespace on ZooKeeper. (The root namespace is the parent node in ZooKeeper used by HiveServer2 when supporting dynamic service discovery.) Each HiveServer2 instance with dynamic service discovery enabled will create a `znode` within this namespace. The default value is `hiveserver2`.
  - Note: you can specify the location of the `hive-site.xml` file via the `--config` option on the HiveServer2 startup command line: `hive --config <my_config_path> --service hiveserver2`
  - JDBC considerations: The JDBC driver connects to ZooKeeper and selects a HiveServer2 instance at random. This instance is then used by the connecting client for the entire session. When a JDBC client picks up a HiveServer2 instance via ZooKeeper, use the following JDBC connection string: `jdbc:hive2://<zookeeper_ensemble>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<hiveserver2_zookeeper_namespace>`, where `<zookeeper_ensemble>` is a comma-separated list of ZooKeeper host:port pairs, as described for the `hive.zookeeper.quorum` property, and `<hiveserver2_zookeeper_namespace>` is the namespace on ZooKeeper under which HiveServer2 `znodes` are added.
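
The server-side discovery settings above can be sketched in `hive-site.xml` as follows. The host names and the port `2181` are placeholders invented for illustration and must be replaced with your own ZooKeeper ensemble; the namespace shown is the default quoted above. The `hive.zookeeper.session.timeout` property is left out of the sketch because the unit its value is expressed in should be checked against your Hive release before you set it.

```xml
<!-- hive-site.xml: HiveServer2 dynamic service discovery (sketch) -->
<property>
  <name>hive.server2.support.dynamic.service.discovery</name>
  <value>true</value>
</property>
<property>
  <!-- Placeholder hosts and port; substitute your ZooKeeper ensemble -->
  <name>hive.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
<property>
  <!-- Default root namespace for HiveServer2 znodes -->
  <name>hive.server2.zookeeper.namespace</name>
  <value>hiveserver2</value>
</property>
```

With these placeholder values, the matching client connection string would use the same host list for `<zookeeper_ensemble>` and `hiveserver2` for `<hiveserver2_zookeeper_namespace>`.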
 
- Check the following two settings to make sure that Tez is configured for rolling upgrade:
  - `tez.lib.uris` (in the `tez-site.xml` file) should contain only a single value pointing to a version-specific Tez tarball file. For 2.2 installations, the Tez app jars are in `/hdp/apps/${hdp.version}`.
  - Set `tez.use.cluster.hadoop-libs` to `false`. (If `true`, the deployment will expect Hadoop jar files to be available on all nodes.)
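
As a hedged illustration, the two Tez settings might appear in `tez-site.xml` as shown below. The tarball path under the version-specific directory (`tez/tez.tar.gz`) is an assumption based on a typical HDP 2.2 layout and is not stated in this section; check the actual location in HDFS on your cluster before using it.

```xml
<!-- tez-site.xml: rolling-upgrade settings (sketch) -->
<property>
  <!-- Single, version-specific tarball; the file name is an assumed example -->
  <name>tez.lib.uris</name>
  <value>/hdp/apps/${hdp.version}/tez/tez.tar.gz</value>
</property>
<property>
  <!-- false: do not rely on Hadoop jars being installed on every node -->
  <name>tez.use.cluster.hadoop-libs</name>
  <value>false</value>
</property>
```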
 
If Kerberos is enabled, it will continue to operate throughout the rolling upgrade process.


