
XFS optimisation on large array

Thank you for visiting this page. It has been updated at another link: XFS optimisation on large array
XFS is a 64-bit, high-performance journaling file system created by SGI. It provides a direct I/O implementation that allows non-cached I/O directly to userspace. Data is transferred between the application's buffer and the disk using DMA, which allows access to the full I/O bandwidth of the underlying disk devices. However, for compatibility reasons, most storage devices report the historical default block size and sector size. Telling XFS about the real geometry of the underlying storage can therefore improve its performance.

Here is a list of parameters you should know before you create an XFS filesystem on a storage array:

block size, sector size, stripe width, stripe size, log device.

Here is a description of each of them.


block size

Specifies the fundamental allocation block size of the filesystem.
The default value is 4KB, the minimum is 512 bytes, and the maximum is 64KB.
XFS on Linux currently only supports blocks of page size or smaller.
To create a filesystem with a block size of 2048 bytes you would use:
mkfs.xfs -b size=2048 device
Smaller block sizes reduce wasted space for lots of small files.


     -s sector_size
        This option specifies the fundamental sector size of the filesystem.  The sector_size is specified either as a value  in  bytes
        with  size=value  or as a base two logarithm value with log=value.  The default sector_size is 512 bytes. The minimum value for
        sector size is 512; the maximum is 32768 (32 KiB). The sector_size must be a power of 2 size and cannot be made larger than the
        filesystem block size.

In most cases, both the block size and the sector size should be set to 4096, which matches the default Linux kernel page size. The sector size cannot exceed the block size, and the block size cannot exceed the page size. The kernel page size is tunable, but be sure you know what you are doing before you change it. For applications that you know use a smaller block size, set the filesystem block size to match.
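The constraints above can be checked before running mkfs.xfs. The following is a minimal sketch, assuming a POSIX shell and `getconf`; `check_xfs_sizes` and `is_power_of_two` are hypothetical helper names, not part of xfsprogs, and the script only prints the options it would use.

```shell
#!/bin/sh
# Sketch: sanity-check candidate XFS block and sector sizes.
# The helper names are hypothetical; nothing here calls mkfs.xfs.

is_power_of_two() {
    # True when $1 is a positive power of two.
    n=$1
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

check_xfs_sizes() {
    # $1 = block size, $2 = sector size, $3 = kernel page size (all in bytes)
    block=$1
    sector=$2
    page=$3
    is_power_of_two "$block"   || { echo "block size must be a power of 2"; return 1; }
    is_power_of_two "$sector"  || { echo "sector size must be a power of 2"; return 1; }
    [ "$sector" -ge 512 ]      || { echo "sector size is below 512"; return 1; }
    [ "$sector" -le "$block" ] || { echo "sector size is larger than block size"; return 1; }
    [ "$block" -le "$page" ]   || { echo "block size is larger than page size"; return 1; }
    echo "ok: -b size=$block -s size=$sector"
}

# Check the 4096/4096 choice against the page size of the running kernel.
page_size=$(getconf PAGESIZE)
check_xfs_sizes 4096 4096 "$page_size"
```

On a machine with 4 KiB pages, a 4096-byte block and sector size passes every check, while a sector size of 8192 with a 4096-byte block is rejected.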

stripe width, stripe size

Stripe size is the number of blocks (sometimes expressed in bytes) that are written to one disk drive before moving on to the next disk drive in the array. It is also called segment size or chunk size. It is the smallest unit in an array, much as the block size is for a traditional disk.
Stripe width is the amount of data contained in a single RAID stripe (segment size × number of data-bearing disks).
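As a worked example of the formula above, here is a small sketch in POSIX shell. The function names `data_disks` and `stripe_width_kib` are made up for illustration; the 128 KiB segment size and the 13+2 RAID6 layout are taken from the example later in this article.

```shell
#!/bin/sh
# Sketch: stripe width = segment size x number of data-bearing disks.
# Function names are hypothetical.

data_disks() {
    # $1 = total disks in the array, $2 = disks' worth of parity
    echo $(( $1 - $2 ))
}

stripe_width_kib() {
    # $1 = segment (stripe) size in KiB, $2 = number of data-bearing disks
    echo $(( $1 * $2 ))
}

# RAID6 with 15 disks: two disks' worth of parity leave 13 data-bearing disks.
n=$(data_disks 15 2)
echo "data disks: $n"
echo "stripe width: $(stripe_width_kib 128 "$n") KiB"
```

For this layout the stripe width is 128 KiB × 13 = 1664 KiB; the parity disks do not count.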


xfsprogs 3.1.0 and newer use the blkid library to identify the stripe geometry of LVM, md, and some hardware RAID devices that export this information; see "blkid useful examples".
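To see whether a device exports any geometry at all, one option is to read the I/O hints the Linux kernel publishes under sysfs. The sketch below assumes the usual `/sys/block/<device>/queue/` layout, and assumes that `minimum_io_size` corresponds to the stripe size and `optimal_io_size` to the stripe width, which is how striped devices normally report them. `show_topology` is a hypothetical helper and `sdb` is only a placeholder device name.

```shell
#!/bin/sh
# Sketch: print the I/O hints exported for a block device.
# show_topology is a hypothetical helper, not part of xfsprogs or util-linux.
# $1 = device name (for example sdb), $2 = sysfs root (default /sys/block)

show_topology() {
    q="${2:-/sys/block}/$1/queue"
    [ -r "$q/minimum_io_size" ] || { echo "no topology info for $1"; return 1; }
    min_io=$(cat "$q/minimum_io_size")
    opt_io=$(cat "$q/optimal_io_size")
    echo "minimum_io_size=$min_io optimal_io_size=$opt_io"
    # A plain disk usually reports optimal_io_size=0; only suggest stripe
    # settings when the device reports a stripe wider than one chunk.
    if [ "$min_io" -gt 0 ] && [ "$opt_io" -gt "$min_io" ]; then
        echo "suggested: su=$min_io sw=$(( opt_io / min_io ))"
    fi
}

# Usage (placeholder device name):
#   show_topology sdb
```

If the device reports nothing useful, the geometry has to be taken from the array configuration and passed to mkfs.xfs by hand.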

Log device

The journal log can be placed on a different device from the rest of the filesystem. The log size is bounded as follows:
  • At least 512 filesystem blocks
  • No more than 64K blocks or 128MB, whichever is smaller
  • Defaults to maximum size for >1TB filesystems
The log device can be a device with better IOPS performance, for example:
  • 15K RPM disk or battery-backed memory
mkfs.xfs -l logdev=log_device device
mount -o logdev=log_device device path
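The log size limits listed above depend on the filesystem block size. The following sketch turns them into byte values; it only encodes the limits as quoted in this article (at least 512 blocks, at most 64K blocks or 128MB), and `log_size_bounds` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: minimum and maximum log size in bytes for a given block size,
# using the limits quoted in this article. log_size_bounds is hypothetical.

log_size_bounds() {
    bs=$1                          # filesystem block size in bytes
    min=$(( 512 * bs ))            # at least 512 filesystem blocks
    max=$(( 65536 * bs ))          # no more than 64K blocks ...
    cap=$(( 128 * 1024 * 1024 ))   # ... or 128MB, whichever is smaller
    if [ "$max" -gt "$cap" ]; then
        max=$cap
    fi
    echo "min=$min max=$max"
}

log_size_bounds 4096
```

With 4096-byte blocks the 128MB cap is the binding limit, so the log can range from 2 MiB to 128 MiB.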

Unlike my other articles, which use a lot of examples, this one focuses on concepts: once you know them, it is easy to create an optimized XFS file system.

Here is one of my examples, which creates an XFS file system on an external LUN. The LUN is a 13+2 RAID6 array:
mkfs -V -t xfs -f -d su=128k,sw=13 -b size=4096 -s size=4096 -L /dc06_lun5 /dev/mapper/dcunit06_lun5

The -d suboptions relevant to the mkfs command above are described in the mkfs.xfs man page:

    sunit=value
       This is used to specify the stripe unit for a RAID device or a logical volume. The value has to be specified in
       512-byte block units. Use the su suboption to specify the stripe unit size in bytes. This suboption ensures that
       data allocations will be stripe unit aligned when the current end of file is being extended and the file size is
       larger than 512KiB. Also inode allocations and the internal log will be stripe unit aligned.

    su=value
       This is an alternative to using sunit. The su suboption is used to specify the stripe unit for a RAID device or a
       striped logical volume. The value has to be specified in bytes, (usually using the m or g suffixes). This value
       must be a multiple of the filesystem block size.

    swidth=value
       This is used to specify the stripe width for a RAID device or a striped logical volume. The value has to be
       specified in 512-byte block units. Use the sw suboption to specify the stripe width size in bytes. This suboption
       is required if -d sunit has been specified and it has to be a multiple of the -d sunit suboption.

    sw=value
       This suboption is an alternative to using swidth. The sw suboption is used to specify the stripe width for a RAID
       device or striped logical volume. The value is expressed as a multiplier of the stripe unit, usually the same as
       the number of stripe members in the logical volume configuration, or data disks in a RAID device.

       When a filesystem is created on a logical volume device, mkfs.xfs will automatically query the logical volume for
       appropriate sunit and swidth values.
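Because su/sw and sunit/swidth describe the same geometry in different units, it can help to convert between them. The sketch below, with the hypothetical helper name `su_sw_to_sunit_swidth`, turns the su=128k,sw=13 values from the example into the equivalent sunit/swidth pair in 512-byte units.

```shell
#!/bin/sh
# Sketch: convert su (bytes) and sw (multiplier of su) into sunit/swidth,
# which mkfs.xfs expects in 512-byte units. The helper name is hypothetical.

su_sw_to_sunit_swidth() {
    su_bytes=$1
    sw=$2
    sunit=$(( su_bytes / 512 ))    # stripe unit in 512-byte blocks
    swidth=$(( sunit * sw ))       # stripe width in 512-byte blocks
    echo "sunit=$sunit,swidth=$swidth"
}

# su=128k,sw=13 from the RAID6 example
su_sw_to_sunit_swidth $(( 128 * 1024 )) 13
```

This prints sunit=256,swidth=3328, so `-d sunit=256,swidth=3328` should describe the same layout as `-d su=128k,sw=13`.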

For more details, see the XFS user guide.