HDP Stack upgrade involves removing HDP 1.x MapReduce and replacing it with HDP 2.x YARN and MapReduce 2. Before you begin, review the upgrade process and complete the Backup steps.
Back up the following HDP 1.x directories:

- /etc/hadoop/conf
- /etc/hbase/conf
- /etc/hcatalog/conf
- /etc/hive/conf
- /etc/pig/conf
- /etc/sqoop/conf
- /etc/flume/conf
- /etc/mahout/conf
- /etc/oozie/conf
- /etc/hue/conf
- /etc/zookeeper/conf

Optional - Back up your userlogs directories, `${mapred.local.dir}/userlogs`.
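The configuration backup above can be sketched as a small shell loop. The backup location `/tmp/hdp1-backup` and the archive naming scheme are assumptions for illustration, not part of the upgrade procedure; in practice, choose a durable backup path.

```shell
#!/bin/sh
# Sketch: archive each HDP 1.x configuration directory that exists on this host.
# BACKUP_DIR is an assumed location; adjust to suit your environment.
BACKUP_DIR=/tmp/hdp1-backup
mkdir -p "$BACKUP_DIR"

for conf in /etc/hadoop/conf /etc/hbase/conf /etc/hcatalog/conf /etc/hive/conf \
            /etc/pig/conf /etc/sqoop/conf /etc/flume/conf /etc/mahout/conf \
            /etc/oozie/conf /etc/hue/conf /etc/zookeeper/conf; do
    if [ -d "$conf" ]; then
        # e.g. /etc/hadoop/conf -> hadoop-conf.tar.gz
        name=$(echo "$conf" | sed 's|^/etc/||; s|/|-|g')
        tar -czf "$BACKUP_DIR/$name.tar.gz" "$conf"
    fi
done
echo "Backups written to $BACKUP_DIR"
```

Run the loop on every host that carries HDP configuration; directories that do not exist on a given host are simply skipped.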
Run the `fsck` command as the HDFS Service user and fix any errors. (The resulting file contains a complete block map of the file system.)

```
su $HDFS_USER
hadoop fsck / -files -blocks -locations > /tmp/dfs-old-fsck-1.log
```

where `$HDFS_USER` is the HDFS Service user. For example, `hdfs`.

Use the following instructions to compare status before and after the upgrade:
Note: The following commands must be executed by the user running the HDFS service (by default, the user is `hdfs`).

Capture the complete namespace of the file system. (The following command does a recursive listing of the root file system.)
```
su $HDFS_USER
hadoop dfs -lsr / > dfs-old-lsr-1.log
```
where `$HDFS_USER` is the HDFS Service user. For example, `hdfs`.

Run the report command to create a list of DataNodes in the cluster.
```
su $HDFS_USER
hadoop dfsadmin -report > dfs-old-report-1.log
```
where `$HDFS_USER` is the HDFS Service user. For example, `hdfs`.

Optional - You can copy all, or only the unrecoverable, data stored in HDFS to a local file system or to a backup instance of HDFS.

Optional - You can also repeat steps 3 (a) through 3 (c) and compare the results with the previous run to ensure the state of the file system is unchanged.
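The optional comparison amounts to capturing a second set of logs and diffing them against the first; an empty diff means the namespace is unchanged. A minimal illustration of the diff step, using stand-in listing files rather than real cluster output:

```shell
#!/bin/sh
# On a real cluster you would capture a second listing, mirroring step 3(a):
#   su $HDFS_USER
#   hadoop dfs -lsr / > dfs-old-lsr-2.log
# and then diff it against the first capture:
#   diff dfs-old-lsr-1.log dfs-old-lsr-2.log

# Demonstration with two identical dummy captures:
printf '/user/foo\n/user/bar\n' > /tmp/lsr-1.log
printf '/user/foo\n/user/bar\n' > /tmp/lsr-2.log

if diff -q /tmp/lsr-1.log /tmp/lsr-2.log > /dev/null; then
    echo "namespace unchanged"
else
    echo "namespace differs"
fi
```

The same comparison applies to the `dfsadmin -report` output from step 3(c).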
As the HDFS user, save the namespace by executing the following commands:

```
su $HDFS_USER
hadoop dfsadmin -safemode enter
hadoop dfsadmin -saveNamespace
```
Back up your NameNode metadata. Copy the following checkpoint files into a backup directory:

- dfs.name.dir/edits
- dfs.name.dir/image/fsimage
- dfs.name.dir/current/fsimage
Store the `layoutVersion` of the NameNode, found in `${dfs.name.dir}/current/VERSION`.
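The `VERSION` file is a plain properties file, so recording the `layoutVersion` is a one-line `grep`. The directory and the property values below are stand-ins for illustration; read the real `dfs.name.dir` value from your `hdfs-site.xml`.

```shell
#!/bin/sh
# Stand-in VERSION file for illustration; on a real NameNode, replace
# /tmp/namenode-demo with your ${dfs.name.dir} value (the values shown
# here are made up).
mkdir -p /tmp/namenode-demo/current
printf 'namespaceID=12345\nlayoutVersion=-32\n' > /tmp/namenode-demo/current/VERSION

# Record the layoutVersion line so it can be compared after the upgrade.
grep layoutVersion /tmp/namenode-demo/current/VERSION > /tmp/layout-version-old
cat /tmp/layout-version-old
```

Keeping the recorded line alongside the other backups lets you confirm later which on-disk layout the upgrade started from.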
Finalize the state of the file system.

```
su $HDFS_USER
hadoop namenode -finalize
```
Optional - Back up the Hive Metastore database.
Note: These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.
Table 17.1. Hive Metastore Database Backup and Restore
| Database Type | Backup | Restore |
| --- | --- | --- |
| MySQL | `mysqldump $dbname > $outputfilename.sql` For example: `mysqldump hive > /tmp/mydir/backup_hive.sql` | `mysql $dbname < $inputfilename.sql` For example: `mysql hive < /tmp/mydir/backup_hive.sql` |
| Postgres | `sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump hive > /tmp/mydir/backup_hive.sql` | `sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql hive < /tmp/mydir/backup_hive.sql` |
| Oracle | Connect to the Oracle database using `sqlplus`, then export the database: `exp username/password@database full=yes file=output_file.dmp` | Import the database: `imp username/password@database file=input_file.dmp` |

Optional - Back up the Oozie Metastore database.
Note: These instructions are provided for your convenience. Please check your database documentation for the latest backup instructions.
Table 17.2. Oozie Metastore Database Backup and Restore
| Database Type | Backup | Restore |
| --- | --- | --- |
| MySQL | `mysqldump $dbname > $outputfilename.sql` For example: `mysqldump oozie > /tmp/mydir/backup_oozie.sql` | `mysql $dbname < $inputfilename.sql` For example: `mysql oozie < /tmp/mydir/backup_oozie.sql` |
| Postgres | `sudo -u $username pg_dump $databasename > $outputfilename.sql` For example: `sudo -u postgres pg_dump oozie > /tmp/mydir/backup_oozie.sql` | `sudo -u $username psql $databasename < $inputfilename.sql` For example: `sudo -u postgres psql oozie < /tmp/mydir/backup_oozie.sql` |

Stop all services (including MapReduce) and client applications deployed on HDFS using the instructions provided here.
Verify that the edit logs in `${dfs.name.dir}/name/current/edits*` are empty. These log files should contain only 4 bytes of data, which hold the edit log version. If the edit logs are not empty, start the existing-version NameNode and then shut it down after a new fsimage has been written to disk, so that the edit log becomes empty.
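The 4-byte check above is just a file-size test, which can be wrapped in a small helper. The `check_edits` function name and the demonstration file are illustrative, not part of the documented procedure:

```shell
#!/bin/sh
# check_edits FILE -> reports whether FILE holds only the 4-byte version header.
check_edits() {
    size=$(wc -c < "$1")
    if [ "$size" -eq 4 ]; then
        echo "OK: $1 is empty (4-byte version header only)"
    else
        echo "NOT EMPTY: $1 is $size bytes"
    fi
}

# On a NameNode you would check every edit log file, e.g.:
#   for f in ${dfs.name.dir}/name/current/edits*; do check_edits "$f"; done

# Demonstration on a dummy 4-byte file (any 4 bytes satisfy the size check):
printf 'ABCD' > /tmp/edits-demo
check_edits /tmp/edits-demo
```

Any file reported as not empty means the NameNode still has unsynced edits, and the restart-and-shutdown step above is required before upgrading.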

