Enable GzipCodec as the default compression codec
For the MapReduce framework, update the relevant properties in
core-site.xml and mapred-site.xml to enable
GzipCodec as the default compression codec.
- Edit the core-site.xml file on the NameNode host machine.

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
      <description>A list of the compression codec classes that can be used for compression/decompression.</description>
    </property>

- Edit the mapred-site.xml file on the JobTracker host machine.

    <property>
      <name>mapreduce.map.output.compress</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.map.output.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>
    <property>
      <name>mapreduce.output.fileoutputformat.compress.type</name>
      <value>BLOCK</value>
    </property>

- Optional: Set the following two configuration parameters to enable job output compression. Edit the mapred-site.xml file on the Resource Manager host machine.

    <property>
      <name>mapreduce.output.fileoutputformat.compress</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.output.fileoutputformat.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>

- Restart the cluster.
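Before restarting, it can help to confirm that the edited files contain the values from the steps above. The following is a minimal sketch of such a check, not part of Hadoop: the helper names (read_site_properties, check_core_site, check_mapred_site) are hypothetical, and the file path in the usage comment is an assumption about where the configuration directory lives. It uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

GZIP = "org.apache.hadoop.io.compress.GzipCodec"

# Expected mapred-site.xml values, copied from the steps above.
EXPECTED_MAPRED = {
    "mapreduce.map.output.compress": "true",
    "mapreduce.map.output.compress.codec": GZIP,
    "mapreduce.output.fileoutputformat.compress.type": "BLOCK",
}


def read_site_properties(xml_data):
    """Return {name: value} for every <property> in a *-site.xml document."""
    root = ET.fromstring(xml_data)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_core_site(props):
    """Return a list of problems; empty if GzipCodec is in io.compression.codecs."""
    codecs = [c.strip() for c in props.get("io.compression.codecs", "").split(",")]
    if GZIP in codecs:
        return []
    return ["io.compression.codecs does not list " + GZIP]


def check_mapred_site(props, expected=EXPECTED_MAPRED):
    """Return one message per property that is missing or has a different value."""
    problems = []
    for name, want in expected.items():
        got = props.get(name)
        if got != want:
            problems.append(f"{name}: expected {want!r}, found {got!r}")
    return problems


# Example use (the path is an assumption; adjust it to your cluster):
# with open("/etc/hadoop/conf/mapred-site.xml", "rb") as f:
#     print(check_mapred_site(read_site_properties(f.read())))
```

An empty list from either check means the file matches the values listed in the corresponding step. To cover the optional job output compression step, pass a dictionary with those two properties as the expected argument.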

