Hadoop
HDP 2.3.4.7 provided the following Apache patches:
HDP 2.3.4 provided the following Apache patches:
- HADOOP-11098: [JDK8] Max Non Heap Memory default changed between JDK7 and 8. 
- HADOOP-11628: SPNEGO auth does not work with CNAMEs in JDK8. 
- HADOOP-11685: StorageException complaining "no lease ID" during HBase distributed log splitting. 
- HADOOP-11918: Listing an empty s3a root directory throws FileNotFound. 
- HADOOP-11932: MetricsSinkAdapter may hang when being stopped. 
- HADOOP-12049 Control http authentication cookie persistence via configuration. 
- HADOOP-12089: StorageException complaining "no lease ID" when updating FolderLastModifiedTime in WASB.
- HADOOP-12186 ActiveStandbyElector shouldn't call monitorLockNodeAsync multiple times. 
- HADOOP-12239: StorageException complaining "no lease ID" when updating FolderLastModifiedTime in WASB.
- HADOOP-12324: Better exception reporting in SaslPlainServer. 
- HADOOP-12334: Change Mode Of Copy Operation of HBase WAL Archiving to bypass Azure Storage Throttling after retries. 
- HADOOP-12350: WASB Logging: Improve WASB Logging around deletes, reads, and writes.
- HADOOP-12407: Test failing: hadoop.ipc.TestSaslRPC.
- HADOOP-12413: AccessControlList should avoid calling getGroupNames in isUserInList with empty groups. 
- HADOOP-12437 Allow SecurityUtil to lookup alternate hostnames. 
- HADOOP-12438: TestLocalFileSystem tests can fail on Windows after HDFS-8767 fix for handling pipe. 
- HADOOP-12440: TestRPC#testRPCServerShutdown did not produce the desired thread states before shutting down. 
- HADOOP-12441: Fixed kill-command behavior to work correctly across OSes by using bash shell built-in. 
- HADOOP-12463 Fix TestShell.testGetSignalKillCommand failure on Windows.
- HADOOP-12484: Single File Rename Throws Incorrectly In Potential Race Condition Scenarios. 
- HADOOP-12508: delete fails with exception when lease is held on blob. 
- HADOOP-12533: Introduce FileNotFoundException in WASB for read and seek API. 
- HADOOP-12540: TestAzureFileSystemInstrumentation#testClientErrorMetrics fails intermittently due to assumption that a lease error will be thrown. 
- HADOOP-12542: TestDNS fails on Windows after HADOOP-12437. 
- HADOOP-12577 Bump up commons-collections version to 3.2.2 to address a security flaw. 
- HADOOP-12617 SPNEGO authentication request to non-default realm gets default realm name inserted in target server principal. 
- HBASE-268 Rack locality improvement. 
- HDFS-4015: Safemode should count and report orphaned blocks. 
- HDFS-4366 Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks. 
- HDFS-4937 ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom. 
- HDFS-6481 DatanodeManager#getDatanodeStorageInfos() should check the length of storageIDs. 
- HDFS-6581 Support for writing to single replica in RAM. 
- HDFS-7390 Provide JMX metrics per storage type. 
- HDFS-7483 Display information per tier on the Namenode UI. 
- HDFS-7725 Incorrect "nodes in service" metrics caused all writes to fail. 
- HDFS-7858 Improve HA Namenode Failover detection on the client. 
- HDFS-7928: Scanning blocks from disk during rolling upgrade startup takes a lot of time if disks are busy. 
- HDFS-8099 Change "DFSInputStream has been closed already" message to debug log level. 
- HDFS-8209 Support different number of datanode directories in MiniDFSCluster. 
- HDFS-8554: TestDatanodeLayoutUpgrade fails on Windows. 
- HDFS-8656: Preserve compatibility of ClientProtocol#rollingUpgrade after finalization. 
- HDFS-8696: Make the lower and higher watermark in the DN Netty server configurable. 
- HDFS-8778: TestBlockReportRateLimiting#testLeaseExpiration can deadlock. 
- HDFS-8785 TestDistributedFileSystem is failing in trunk. 
- HDFS-8809: HDFS fsck reports under construction blocks as CORRUPT. 
- HDFS-8829 Make SO_RCVBUF and SO_SNDBUF size configurable for DataTransferProtocol sockets and allow configuring auto-tuning. 
- HDFS-8846: Add a unit test for INotify functionality across a layout version upgrade. 
- HDFS-8855: Webhdfs client leaks active NameNode connections. 
- HDFS-8930: Block report lease may leak if the 2nd full block report comes when NN is still in safemode. 
- HDFS-8950 NameNode refresh doesn't remove DataNodes that are no longer in the allowed list. 
- HDFS-8965: Harden edit log reading code against out of memory errors. 
- HDFS-8969: Clean up findbugs warnings for HDFS-8823 and HDFS-8932. 
- HDFS-9008: Balancer#Parameters class could use a builder pattern. 
- HDFS-9019: Adding informative message to sticky bit permission denied exception. 
- HDFS-9063: Correctly handle snapshot path for getContentSummary. 
- HDFS-9082: Change the log level in WebHdfsFileSystem.initialize() from INFO to DEBUG. 
- HDFS-9083: Replication violates block placement policy. 
- HDFS-9107: Prevent NNs unrecoverable death spiral after full GC. 
- HDFS-9112: Improve error message for Haadmin when multiple name service IDs are configured. 
- HDFS-9128: TestWebHdfsFileContextMainOperations and TestSWebHdfsFileContextMainOperations fail due to invalid HDFS path on Windows. 
- HDFS-9142: Separating Configuration object for namenode(s) in MiniDFSCluster. 
- HDFS-9175: Change scope of 'AccessTokenProvider.getAccessToken()' and 'CredentialBasedAccessTokenProvider.getCredential()' abstract methods to public. 
- HDFS-9178 Slow datanode I/O can cause a wrong node to be marked bad. 
- HDFS-9184 Logging HDFS operation's caller context into audit logs. 
- HDFS-9205: Do not schedule corrupt blocks for replication. 
- HDFS-9220: Reading small file greater than 512 bytes that is open for append fails due to incorrect checksum. 
- HDFS-9273: ACLs on root directory may be lost after NN restart. 
- HDFS-9294: DFSClient deadlock when close file and failed to renew lease. 
- HDFS-9305 Delayed heartbeat processing causes storm of subsequent heartbeats. 
- HDFS-9311: Support optional offload of NameNode HA service health checks to a separate RPC server. 
- HDFS-9343 Empty caller context considered invalid. 
- HDFS-9354: Fix TestBalancer#testBalancerWithZeroThreadsForMove on Windows. 
- HDFS-9362: TestAuditLogger#testAuditLoggerWithCallContext assumes Unix line endings, fails on Windows. 
- HDFS-9364 Unnecessary DNS resolution attempts when creating NameNodeProxies. 
- HDFS-9384: TestWebHdfsContentLength intermittently hangs and fails due to TCP conversation mismatch between client and server. 
- HDFS-9397 Fix typo for readChecksum() LOG.warn in BlockSender.java. 
- HDFS-9413: getContentSummary() on standby should throw StandbyException. 
- HDFS-9426: Rollingupgrade finalization is not backward compatible. 
- HDFS-9434: Recommission a datanode with 500k blocks may pause NN for 30 seconds for printing info log messages. 
- MAPREDUCE-5485 Allow repeating job commit by extending OutputCommitter API. 
- MAPREDUCE-6273 HistoryFileManager should check whether summaryFile exists to avoid FileNotFoundException causing HistoryFileInfo into MOVE_FAILED state. 
- MAPREDUCE-6302 Backport preempt reducers after a configurable timeout irrespective of headroom. 
- MAPREDUCE-6549 Multibyte delimiters with LineRecordReader cause duplicate records. 
- YARN-2194 Fix bug causing CGroups functionality to fail on RHEL7. 
- YARN-2571 RM to support YARN registry. 
- YARN-3467 Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI. 
- YARN-3600 AM container link is broken (on a killed application, at least). 
- YARN-3727 For better error recovery, check if the directory exists before using it for localization. 
- YARN-3751 Fixed AppInfo to check if used resources are null. 
- YARN-3766 Fixed the apps table column error of generic history web UI. 
- YARN-3849 Too much of preemption activity causing continuous killing of containers across queues. 
- YARN-4140 RM container allocation delayed in case of app submitted to Nodelabel partition. 
- YARN-4233 YARN Timeline Service plugin: ATS v1.5. 
- YARN-4285 Display resource usage as percentage of queue and cluster in the RM UI. 
- YARN-4287 Rack locality improvement. 
- YARN-4288 Fixed RMProxy to retry on IOException from local host. 
- YARN-4313 Race condition in MiniMRYarnCluster when getting history server address. 
- YARN-4345 yarn rmadmin -updateNodeResource doesn't work. 
- YARN-4347 Resource manager fails with Null pointer exception. 
- YARN-4349 YARN_APPLICATION call to ATS does not have YARN_APPLICATION_CALLER_CONTEXT. 
- YARN-4384 updateNodeResource CLI should not accept negative values for resource. 
- YARN-4405 Support node label store in non-appendable file system. 
HDP 2.3.2 provided the following Apache patches:
NEW FEATURES
- HDFS-8155 Support OAuth2 in WebHDFS. 
IMPROVEMENTS
- HADOOP-10597 RPC Server signals backoff to clients when all request queues are full. 
- HADOOP-11960 Enable Azure-Storage Client Side logging. 
- HADOOP-12325 RPC Metrics: Add the ability to track and log slow RPCs.
- HADOOP-12358 Add -safely flag to rm to prompt when deleting many files. 
- HDFS-4185 Add a metric for number of active leases. 
- HDFS-4396 Add START_MSG/SHUTDOWN_MSG for ZKFC. 
- HDFS-6860 BlockStateChange logs are too noisy. 
- HDFS-7923 The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages.
- HDFS-8046 Allow better control of getContentSummary. 
- HDFS-8180 AbstractFileSystem Implementation for WebHdfs. 
- HDFS-8278 When computing max-size-to-move in Balancer, count only the storage with remaining >= default block size. 
- HDFS-8432 Introduce a minimum compatible layout version to allow downgrade in more rolling upgrade use cases. 
- HDFS-8435 Support CreateFlag in WebHDFS. 
- HDFS-8549 Abort the balancer if an upgrade is in progress. 
- HDFS-8797 WebHdfsFileSystem creates too many connections for pread. 
- HDFS-8818 Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. 
- HDFS-8824 Do not use small blocks for balancing the cluster. 
- HDFS-8826 In Balancer, add an option to specify the source node list so that balancer only selects blocks to move from those nodes. 
- HDFS-8883 NameNode Metrics: Add FSNameSystem lock Queue Length. 
- HDFS-8911 NameNode Metric: Add Editlog counters as a JMX metric.
- HDFS-8983 NameNode support for protected directories. 
- YARN-2513 Host framework UIs in YARN for use with the ATS. 
- YARN-3197 Confusing log generated by CapacityScheduler. 
- YARN-3357 Move TestFifoScheduler to FIFO package. 
- YARN-3360 Add JMX metrics to TimelineDataManager. 
- YARN-3579 CommonNodeLabelsManager should support NodeLabel instead of string label name when getting node-to-label/label-to-label mappings. 
- YARN-3978 Configurably turn off the saving of container info in Generic AHS. 
- YARN-4082 Container shouldn't be killed when node's label updated. 
- YARN-4101 RM should print alert messages if ZooKeeper and ResourceManager have connection issues.
- YARN-4149 yarn logs -am should provide an option to fetch all the log files. 
BUG FIXES
- HADOOP-11802 DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm. 
- HADOOP-12052 IPC client downgrades all exception types to IOE, breaks callers trying to use them. 
- HADOOP-12073 Azure FileSystem PageBlobInputStream does not return -1 on EOF.
- HADOOP-12095 org.apache.hadoop.fs.shell.TestCount fails. 
- HADOOP-12304 Applications using FileContext fail with the default filesystem configured to be wasb/s3/etc. 
- HADOOP-8151 Error handling in snappy decompressor throws invalid exceptions.
- HDFS-6945 BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed.
- HDFS-7608 hdfs dfsclient newConnectedPeer has no write timeout.
- HDFS-7609 Avoid retry cache collision when Standby NameNode loading edits. 
- HDFS-8309 Skip unit test using DataNodeTestUtils#injectDataDirFailure() on Windows. 
- HDFS-8310 Fix TestCLI.testAll "help for find" on Windows. 
- HDFS-8311 DataStreamer.transfer() should timeout the socket InputStream. 
- HDFS-8384 Allow NN to startup if there are files having a lease but are not under construction.
- HDFS-8431 hdfs crypto class not found in Windows. 
- HDFS-8539 HDFS doesn’t have class 'debug' in Windows.
- HDFS-8542 WebHDFS getHomeDirectory behavior does not match specification. 
- HDFS-8593 Calculation of effective layout version mishandles comparison to current layout version in storage. 
- HDFS-8767 RawLocalFileSystem.listStatus() returns null for UNIX pipe file.
- HDFS-8850 VolumeScanner thread exits with exception if there is no blockpool to be scanned but there are suspicious blocks. 
- HDFS-8863 The remaining space check in BlockPlacementPolicyDefault is flawed. 
- HDFS-8879 Quota by storage type usage incorrectly initialized upon namenode restart.
- HDFS-8885 ByteRangeInputStream used in webhdfs does not override available().
- HDFS-8932 NPE thrown in NameNode when try to get TotalSyncCount metric before editLogStream initialization.
- HDFS-8939 Test(S)WebHdfsFileContextMainOperations failing on branch-2. 
- HDFS-8969 Clean up findbugs warnings for HDFS-8823 and HDFS-8932. 
- HDFS-8995 Flaw in registration bookkeeping can make DN die on reconnect. 
- HDFS-9009 Send metrics logs to NullAppender by default. 
- YARN-3413 Changed Nodelabel attributes (like exclusivity) to be settable only via addToClusterNodeLabels but not changeable at runtime.
- YARN-3885 ProportionalCapacityPreemptionPolicy doesn't preempt if queue is more than 2 level. 
- YARN-3894 RM startup should fail for wrong CS xml NodeLabel capacity configuration. 
- YARN-3896 RMNode transitioned from RUNNING to REBOOTED because its response id has not been reset synchronously.
- YARN-3932 SchedulerApplicationAttempt#getResourceUsageReport and UserInfo should be based on total-used-resources.
- YARN-3971 Skip RMNodeLabelsManager#checkRemoveFromClusterNodeLabelsOfQueue on nodelabel recovery. 
- YARN-4087 Followup fixes after YARN-2019 regarding RM behavior when state-store error occurs. 
- YARN-4092 Fixed UI redirection to print useful messages when both RMs are in standby mode. 
OPTIMIZATION
- HADOOP-11772 RPC Invoker relies on static ClientCache which has synchronized(this) blocks. 
- HADOOP-12317 Applications fail on NM restart on some Linux distro because NM container recovery declares AM container as LOST. 
- HADOOP-7713 dfs -count -q should label output column. 
- HDFS-8856 Make LeaseManager#countPath O(1). 
- HDFS-8867 Enable optimized block reports. 
HDP 2.3.0 provided the following Apache patches:
NEW FEATURES
- HDFS-8008 Support client-side back off when the datanodes are congested. 
- HDFS-8009 Signal congestion on the DataNode. 
- YARN-1376 NM need to notify the log aggregation status to RM through heartbeat. 
- YARN-1402 Update related Web UI and CLI with exposing client API to check log aggregation status. 
- YARN-2498 Respect labels in preemption policy of capacity scheduler for inter-queue preemption. 
- YARN-2571 RM to support YARN registry.
- YARN-2619 Added NodeManager support for disk IO isolation through cgroups. 
- YARN-3225 New parameter of CLI for decommissioning node gracefully in RMAdmin CLI. 
- YARN-3318 Create Initial OrderingPolicy Framework and FifoOrderingPolicy. 
- YARN-3319 Implement a FairOrderingPolicy. 
- YARN-3326 Support RESTful API for getLabelsToNodes. 
- YARN-3345 Add non-exclusive node label API. 
- YARN-3347 Improve YARN log command to get AMContainer logs as well as running containers logs. 
- YARN-3348 Add a 'yarn top' tool to help understand cluster usage. 
- YARN-3354 Add node label expression in ContainerTokenIdentifier to support RM recovery. 
- YARN-3361 CapacityScheduler side changes to support non-exclusive node labels. 
- YARN-3365 Enhanced NodeManager to support using the 'tc' tool via container-executor for outbound network traffic control. 
- YARN-3366 Enhanced NodeManager to support classifying/shaping outgoing network bandwidth traffic originating from YARN containers.
- YARN-3410 YARN admin should be able to remove individual application records from RMStateStore. 
- YARN-3443 Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM. 
- YARN-3448 Added a rolling time-to-live LevelDB timeline store implementation. 
- YARN-3463 Integrate OrderingPolicy Framework with CapacityScheduler. 
- YARN-3505 Node's Log Aggregation Report with SUCCEED should not be cached in RMApps.
- YARN-3541 Add version info on timeline service / generic history web UI and REST API. 
IMPROVEMENTS
- HADOOP-10597 RPC Server signals backoff to clients when all request queues are full. 
- YARN-1880 Cleanup TestApplicationClientProtocolOnHA.
- YARN-2495 Allow admin to specify labels from each NM (Distributed configuration for node label).
- YARN-2696 Queue sorting in CapacityScheduler should consider node label. 
- YARN-2868 FairScheduler: Metric for latency to allocate first container for an application. 
- YARN-2901 Add errors and warning metrics page to RM, NM web UI. 
- YARN-3243 CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obeys its capacity limits.
- YARN-3248 Display count of nodes blacklisted by apps in the web UI. 
- YARN-3293 Track and display capacity scheduler health metrics in web UI. 
- YARN-3294 Allow dumping of Capacity Scheduler debug logs via web UI for a fixed time period. 
- YARN-3356 Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. 
- YARN-3362 Add node label usage in RM CapacityScheduler web UI. 
- YARN-3394 Enrich WebApplication proxy documentation. 
- YARN-3397 yarn rmadmin should skip -failover. 
- YARN-3404 Display queue name on application page. 
- YARN-3406 Display count of running containers in the RM's Web UI. 
- YARN-3451 Display attempt start time and elapsed time on the web UI. 
- YARN-3494 Expose AM resource limit and usage in CS QueueMetrics. 
- YARN-3503 Expose disk utilization percentage and bad local and log dir counts in NM metrics. 
- YARN-3511 Add errors and warnings page to ATS. 
- YARN-3565 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String. 
- YARN-3581 Deprecate -directlyAccessNodeLabelStore in RMAdminCLI. 
- YARN-3583 Support of NodeLabel object instead of plain String in YarnClient side. 
- YARN-3593 Add label-type and Improve "DEFAULT_PARTITION" in Node Labels Page. 
- YARN-3700 Made generic history service load a number of latest applications according to the parameter or the configuration. 
BUG FIXES
- HADOOP-11859 PseudoAuthenticationHandler fails with httpcomponents v4.4. 
- HADOOP-7713 dfs -count -q should label output column.
- HDFS-27 HDFS CLI with --config set to default config complains log file not found error. 
- HDFS-6666 Abort NameNode and DataNode startup if security is enabled but block access token is not enabled. 
- HDFS-7645 Fix CHANGES.txt.
- HDFS-7645 Rolling upgrade is restoring blocks from trash multiple times.
- HDFS-7701 Support reporting per storage type quota and usage with hadoop/hdfs shell. 
- HDFS-7890 Improve information on Top users for metrics in RollingWindowsManager and lower log level. 
- HDFS-7933 fsck should also report decommissioning replicas. 
- HDFS-7990 IBR delete ack should not be delayed. 
- HDFS-8008 Support client-side back off when the datanodes are congested. 
- HDFS-8009 Signal congestion on the DataNode. 
- HDFS-8055 NullPointerException when topology script is missing. 
- HDFS-8144 Split TestLazyPersistFiles into multiple tests. 
- HDFS-8152 Refactoring of lazy persist storage cases. 
- HDFS-8205 CommandFormat#parse() should not parse option as value of option. 
- HDFS-8211 DataNode UUID is always null in the JMX counter. 
- HDFS-8219 setStoragePolicy with folder behavior is different after cluster restart. 
- HDFS-8229 LAZY_PERSIST file gets deleted after NameNode restart. 
- HDFS-8232 Missing datanode counters when using Metrics2 sink interface. 
- HDFS-8276 LazyPersistFileScrubber should be disabled if scrubber interval configured zero. 
- YARN-2666 TestFairScheduler.testContinuousScheduling fails intermittently.
- YARN-2740 Fix NodeLabelsManager to properly handle node label modifications when distributed node label configuration enabled. 
- YARN-2821 Fixed a problem that DistributedShell AM may hang if restarted. 
- YARN-3110 Few issues in ApplicationHistory web UI. 
- YARN-3136 Fixed a synchronization problem of AbstractYarnScheduler#getTransferredContainers. 
- YARN-3266 RMContext#inactiveNodes should have NodeId as map key. 
- YARN-3269 yarn.nodemanager.remote-app-log-dir could not be configured to fully qualified path.
- YARN-3305 Normalize AM resource request on app submission. 
- YARN-3343 Increased TestCapacitySchedulerNodeLabelUpdate#testNodeUpdate timeout. 
- YARN-3383 AdminService should use "warn" instead of "info" to log exception when operation fails. 
- YARN-3387 Previous AM's container completed status couldn't pass to current AM if AM and RM restarted during the same time. 
- YARN-3425 NPE from RMNodeLabelsManager.serviceStop when NodeLabelsManager.serviceInit failed. 
- YARN-3435 AM container to be allocated Appattempt AM container shown as null. 
- YARN-3459 Fix failure of TestLog4jWarningErrorMetricsAppender. 
- YARN-3517 RM web UI for dumping scheduler logs should be for admins only.
- YARN-3530 ATS throws exception on trying to filter results without otherinfo. 
- YARN-3552 RM Web UI shows -1 running containers for completed apps.
- YARN-3580 [JDK8] TestClientRMService.testGetLabelsToNodes fails. 
- YARN-3632 Ordering policy should be allowed to reorder an application when demand changes. 
- YARN-3654 ContainerLogsPage web UI should not have meta-refresh. 
- YARN-3707 RM Web UI queue filter doesn't work. 
- YARN-3740 Fixed the typo in the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS. 

