DistCp and Proxy Settings
When using DistCp to back up data from an on-site Hadoop cluster, proxy settings
    may need to be set so as to reach the cloud store.
      For most of the stores, these proxy settings are hadoop configuration
      options which must be set in core-site.xml, or as options
    to the DistCp command.
    
ADLS uses the JVM proxy settings, which need to be set in DistCp's map and reduce
        processes. This can be done through the mapreduce.map.java.opts and
          mapreduce.reduce.java.opts options respectively.
export DISTCP_PROXY_OPTS="-Dhttps.proxyHost=web-proxy.example.com -Dhttps.proxyPort=80" hadoop distcp \ -D mapreduce.map.java.opts="$DISTCP_PROXY_OPTS" \ -D mapreduce.reduce.java.opts="$DISTCP_PROXY_OPTS" \ -update -skipcrccheck -numListstatusThreads 40 \ hdfs://namenode:8020/users/alice adl://backups.azuredatalakestore.net/users/alice
Without these settings, even though access to ADLS may work from the command line, distcp access can fail with "Error fetching access token" —fetching OAuth2 access tokens being the first networked operation performed by the ADLS client.

