Improving Performance for DistCp
ADLS and WASB
You can tune the fs.azure.selfthrottling.read.factor and
fs.azure.selfthrottling.write.factor properties. For details, refer to the "Maximizing HDInsight throughput to Azure Blob Storage" blog post.
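As with other Hadoop properties, these factors can be passed to DistCp with -D. The following is a minimal sketch, not a recommended configuration: the container name, storage account, and paths are hypothetical placeholders, and the values shown assume that each factor accepts a number in the range (0, 1.0], where 1.0 means no self-throttling. The script only prints the command so that you can review it before running it.

```shell
# Hypothetical values: replace with your own container, account, and paths.
READ_FACTOR=1.0     # assumed range (0, 1.0]; lower values throttle reads more
WRITE_FACTOR=1.0    # assumed range (0, 1.0]; lower values throttle writes more
SRC="/tmp/source"
DEST="wasb://mycontainer@myaccount.blob.core.windows.net/dest"

# Assemble the DistCp invocation with both self-throttling factors set.
DISTCP_CMD="hadoop distcp -D fs.azure.selfthrottling.read.factor=${READ_FACTOR} -D fs.azure.selfthrottling.write.factor=${WRITE_FACTOR} ${SRC} ${DEST}"

# Dry run: print the command for review. To execute it, run: eval "$DISTCP_CMD"
echo "$DISTCP_CMD"
```

Suitable values depend on your workload and storage account limits, so treat the numbers above as a starting point for testing.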
Amazon S3
If you plan to copy large amounts of data between HDFS and S3, you can
accelerate the process by passing -D fs.s3a.fast.upload=true when invoking
DistCp. For example:
hadoop distcp -D fs.s3a.fast.upload=true s3a://dominika-test/driver-data /tmp/test2
The fs.s3a.fast.upload option significantly accelerates data upload by
writing the data in blocks, possibly in parallel.
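Because fs.s3a.fast.upload affects how data is written to S3, the option matters when S3 is the destination of the copy. The following sketch shows that direction. The bucket name and paths are hypothetical placeholders, and the script only prints the command so that you can review it before running it.

```shell
# Hypothetical values: replace with your own source path and bucket.
SRC="/tmp/test2"
DEST="s3a://example-bucket/driver-data"

# Assemble a DistCp invocation that uploads to S3 with fast upload enabled.
DISTCP_CMD="hadoop distcp -D fs.s3a.fast.upload=true ${SRC} ${DEST}"

# Dry run: print the command for review. To execute it, run: eval "$DISTCP_CMD"
echo "$DISTCP_CMD"
```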
For more tips on how to improve performance for DistCp with S3, refer to Configuring and Tuning S3A Fast Upload.

