File System Partitioning Recommendations
Follow these file system partitioning recommendations when deploying your Hadoop cluster.
Setting Up File System Partitions
Use the following as a base configuration for all nodes in your cluster:
- Root partition: OS and core program files
- Swap: sized at twice (2x) the system memory
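The swap guideline above is simple arithmetic, and the following minimal Python sketch shows one way to compute it on a Linux host. The function names `recommended_swap_bytes` and `physical_memory_bytes` are hypothetical helpers introduced here for illustration; they are not part of any Hadoop tooling.

```python
import os


def recommended_swap_bytes(memory_bytes, factor=2):
    """Return the swap size as factor x physical memory (2x per the guideline)."""
    if memory_bytes <= 0:
        raise ValueError("memory_bytes must be positive")
    return memory_bytes * factor


def physical_memory_bytes():
    """Read physical memory through POSIX sysconf (assumes a Linux host)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    gib = 1024 ** 3
    memory = physical_memory_bytes()
    swap = recommended_swap_bytes(memory)
    print(f"Memory: {memory / gib:.1f} GiB, suggested swap: {swap / gib:.1f} GiB")
```

For a node with 64 GiB of memory, this yields a suggested swap size of 128 GiB.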
Partitioning Recommendations for Slave Nodes
- Hadoop slave node partitions: Hadoop should have its own partitions for Hadoop files and logs. Format drives with XFS or ext4, in that order of preference. Do not use LVM; it adds latency and causes a bottleneck.
- On slave nodes only, mount each Hadoop partition individually from its drive as "/grid/[0-n]".
- 
- Hadoop slave node partitioning configuration example:
  - / (root) - 20 GB (ample room for existing files, future log file growth, and OS upgrades)
  - /grid/0/ - [full disk GB] first partition for Hadoop to use for local storage
  - /grid/1/ - second partition for Hadoop to use
  - /grid/2/ - ...
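As an illustration of the "/grid/[0-n]" mount layout, the following Python sketch generates one /etc/fstab line per data drive. It is a sketch under stated assumptions, not a prescribed procedure: the helper name `grid_fstab_entries`, the device names, and the `defaults,noatime` mount options do not come from the recommendations above, so substitute the values that apply to your own hardware.

```python
def grid_fstab_entries(devices, fs_type="xfs", options="defaults,noatime"):
    """Build /etc/fstab lines that mount each device at /grid/0, /grid/1, ...

    devices: device paths, one per data drive (hypothetical names below).
    fs_type: "xfs" (preferred) or "ext4", per the recommendation above.
    options: mount options; "defaults,noatime" is an assumption, not a
             value taken from this document.
    """
    if fs_type not in ("xfs", "ext4"):
        raise ValueError("fs_type must be 'xfs' or 'ext4'")
    entries = []
    for index, device in enumerate(devices):
        mount_point = f"/grid/{index}"
        # Fields: device, mount point, type, options, dump flag, fsck order
        entries.append(f"{device} {mount_point} {fs_type} {options} 0 0")
    return entries


if __name__ == "__main__":
    # Hypothetical device names for a node with three data drives
    for entry in grid_fstab_entries(["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]):
        print(entry)
```

Each drive gets its own mount point, which matches the guidance to mount Hadoop partitions individually rather than combining them through LVM.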
Redundancy (RAID) Recommendations
- Master nodes: configure for reliability (RAID 10, dual Ethernet cards, dual power supplies, and so on).
- Slave nodes: RAID is not necessary, because the cluster manages failures on these nodes automatically. All data is stored across at least three different hosts, so redundancy is built in. Build slave nodes for speed and low cost.
Further Reading
The following additional documentation might be useful:
- CentOS partitioning documentation
- Reference architectures from other Hadoop clusters: Hadoop Reference Architectures

