
iSCSI setup, configuration and tuning example



This article shows an example of an iSCSI device plan and installation, host-side configuration of both iSCSI and multipath, followed by tuning and debugging.

Hardware

In this case, the iSCSI target is an IBM DS3524 with 24 SAS drives (10k rpm, 600GB each) and eight 1Gb host interfaces. Four hosts will use the iSCSI storage: vm1, vm2, adm and backup.

Setup

On the DS3524 iSCSI target side, two arrays are created:

Array backup: 18 disk drives, RAID6, 8.9TB total.

    Two LUNs, both with a 32KB segment size:

    LUN imirror, 300GB, designed to be used by the adm node.

    LUN iraid, 8.6TB, to be used by backup.

Array vms: 6 disk drives, RAID10, 1.67TB total.

    25 LUNs, 128KB segment size: 7*100GB, 14*20GB and 4*40GB, with 675GB of free space left.

    LUNs vm1-vm10 are for hosts vm1 and vm2 to use.

Two host groups are set up, with all host types set to LNXALUA:

    backupgroup: hosts backup and adm.

    vmgroup: hosts vm1 and vm2.

Within a host group each host can see the other hosts' LUNs, but a LUN can only be mounted on one host at a time.


Name Thin Status Capacity Accessible by Source

backup No Optimal 8,634.585 GB Host Group backupgroup Array backup
mirror No Optimal 300.000 GB Host Group backupgroup Array backup
vm1 No Optimal 100.000 GB Host Group vmgroup Array vms
vm2 No Optimal 100.000 GB Host Group vmgroup Array vms
vm3 No Optimal 100.000 GB Host Group vmgroup Array vms
vm4 No Optimal 100.000 GB Host Group vmgroup Array vms
vm5 No Optimal 100.000 GB Host Group vmgroup Array vms
vm6 No Optimal 100.000 GB Host Group vmgroup Array vms
vm7 No Optimal 100.000 GB Host Group vmgroup Array vms
vm8 No Optimal 20.000 GB Host Group vmgroup Array vms
vm9 No Optimal 20.000 GB Host Group vmgroup Array vms
vm10 No Optimal 20.000 GB Host Group vmgroup Array vms
vm11 No Optimal 20.000 GB Host Group vmgroup Array vms
vm12 No Optimal 20.000 GB Host Group vmgroup Array vms
vm13 No Optimal 20.000 GB Host Group vmgroup Array vms
vm14 No Optimal 20.000 GB Host Group vmgroup Array vms
vm15 No Optimal 20.000 GB Host Group vmgroup Array vms
vm16 No Optimal 20.000 GB Host Group vmgroup Array vms
vm17 No Optimal 20.000 GB Host Group vmgroup Array vms
vm18 No Optimal 20.000 GB Host Group vmgroup Array vms
vm19 No Optimal 20.000 GB Host Group vmgroup Array vms
vm20 No Optimal 20.000 GB Host Group vmgroup Array vms
vm21 No Optimal 20.000 GB Host Group vmgroup Array vms
vm22 No Optimal 40.000 GB Host Group vmgroup Array vms
vm23 No Optimal 40.000 GB Host Group vmgroup Array vms
vm24 No Optimal 40.000 GB Host Group vmgroup Array vms
vm25 No Optimal 40.000 GB Host Group vmgroup Array vms

Network

In this case, the iSCSI network is fairly isolated from the current network infrastructure, so we use neither CHAP authentication nor an iSNS server for discovery.
Each host has two NICs (adm's second NIC needs more work); accordingly, four VLANs are created.
All ports use IPv4 with MTU 9000 and flow control enabled.

vlan 390

         vm1 eth1     192.168.130.200
         vm2 eth1     192.168.130.201
         iscsiA port3 192.168.130.1
         iscsiB port4 192.168.130.2
vlan 391

         vm1 eth3     192.168.131.200
         vm2 eth3     192.168.131.201
         iscsiA port4 192.168.131.1
         iscsiB port3 192.168.131.2
vlan 392

         backup eth0  192.168.132.200
         adm eth3     192.168.132.201
         iscsiA port5 192.168.132.1
         iscsiB port6 192.168.132.2
vlan 393

         backup eth1  192.168.133.200
         adm eth4     192.168.133.201
         iscsiA port6 192.168.133.1
         iscsiB port5 192.168.133.2
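
For reference, here is a minimal sketch of how one of these interfaces could be set up for jumbo frames on the hosts, assuming the stock RHEL6/SL6 network scripts and their ETHTOOL_OPTS hook; the DEVICE/IPADDR values follow the vlan 390 entry for vm1 above and the /24 NETMASK is an assumption, so adjust per host.

# /etc/sysconfig/network-scripts/ifcfg-eth1 (example for vm1 on vlan 390)
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.130.200
NETMASK=255.255.255.0
MTU=9000
ONBOOT=yes
# flow control, assuming the NIC driver supports pause frames via ethtool -A
ETHTOOL_OPTS="-A eth1 rx on tx on"

Bounce the interface (ifdown eth1; ifup eth1) and verify the MTU with ip link show eth1.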

Host side installation:

# uname -a
Linux 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 11:21:14 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
# rpm -qa | grep mapper
device-mapper-event-1.02.74-10.el6.x86_64
device-mapper-event-libs-1.02.74-10.el6.x86_64
device-mapper-multipath-0.4.9-56.el6.x86_64
device-mapper-multipath-libs-0.4.9-56.el6.x86_64
device-mapper-1.02.74-10.el6.x86_64
device-mapper-libs-1.02.74-10.el6.x86_64

# yum install iscsi-initiator-utils
================================================================================
 Package                  Arch      Version             Repository    Size
================================================================================
Installing:
 iscsi-initiator-utils    x86_64    6.2.0.872-34.el6    sl            614 k

Transaction Summary
================================================================================
Install       1 Package(s)
On backup, multipath (/etc/multipath.conf) is configured as follows:

devices {
    device {
        vendor "IBM"
        product "1746"
        getuid_callout "/lib/udev/scsi_id --page=0x83 --whitelisted --device=/dev/%n"
        features "2 pg_init_retries 5"
        hardware_handler "1 rdac"
        path_selector "round-robin 0"
        path_grouping_policy group_by_prio
        failback immediate
        rr_weight priorities
        no_path_retry fail
        rr_min_io 1000
        path_checker rdac
        prio rdac
    }
}
blacklist {
    device {
        vendor "Kingston"
        product "DT*"
    }
    device {
        vendor "ServeRA"
        product "*"
    }
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
}
multipaths {
    multipath {
        wwid 360080e50002dfdb60000699750addb4f
        alias imirror
    }
    multipath {
        wwid 360080e50002dfdb60000699a50addb7c
        alias iraid
    }
    ...
}
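
After editing /etc/multipath.conf the maps need to be rebuilt; a minimal sketch, assuming the stock RHEL6/SL6 init scripts:

service multipathd restart
chkconfig multipathd on
multipath -ll      # the imirror/iraid aliases appear once the iSCSI sessions are logged in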

Configuring Open-iSCSI initiator utilities

The iSCSI initiator configuration file is /etc/iscsi/iscsid.conf. You can use it as it is, or tune it for your environment; the tuning part is covered in a later section. Here is the original configuration:

iscsid.startup = /etc/rc.d/init.d/iscsid force-start
node.startup = automatic
node.leading_login = No
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.xmit_thread_priority = -20
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.conn[0].iscsi.HeaderDigest = None
node.session.nr_sessions = 1
node.session.iscsi.FastAbort = Yes
There is also the configuration file /etc/iscsi/initiatorname.iscsi, which stores the initiator name. Use /sbin/iscsi-iname to generate one, or set it manually, for example:
InitiatorName=iqn.1994-05.com.redhat:vm1
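
If you want to regenerate the name instead, a minimal sketch (assuming the -p prefix option of iscsi-iname from iscsi-initiator-utils; the generated suffix is random):

# write a fresh initiator name with the Red Hat IQN prefix, then restart iscsid
echo "InitiatorName=$(/sbin/iscsi-iname -p iqn.1994-05.com.redhat)" > /etc/iscsi/initiatorname.iscsi
service iscsid restart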

Tuning:

For the iSCSI host-side configuration, some parameters have been changed in /etc/iscsi/iscsid.conf.

Original

node.session.timeo.replacement_timeout = 120
node.session.cmds_max = 128
node.session.queue_depth = 32

My settings

node.session.timeo.replacement_timeout = 15
node.session.cmds_max = 1024
node.session.queue_depth = 128
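
Note that iscsid.conf only applies to targets discovered after the change; for nodes that are already discovered, the stored records can be updated with iscsiadm, roughly as sketched below (add -T/-p to limit the update to a single target or portal):

iscsiadm -m node -o update -n node.session.timeo.replacement_timeout -v 15
iscsiadm -m node -o update -n node.session.cmds_max -v 1024
iscsiadm -m node -o update -n node.session.queue_depth -v 128
# log the sessions out and back in so the new values take effect
iscsiadm -m node -u
iscsiadm -m node -l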

On top of the multipath device, block read-ahead is set to 16584 (the best value found in testing).
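
A minimal sketch of setting it with blockdev, assuming the 16584 figure is the 512-byte-sector count that blockdev --setra expects (put the command in rc.local or a udev rule to persist it across reboots):

blockdev --setra 16584 /dev/mapper/iraid
blockdev --getra /dev/mapper/iraid    # verify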

On the iSCSI target side, the cache block size is set to 32KB and the path-fail alert delay is set to 60 minutes.

Connecting to the iSCSI array
The file /etc/iscsi/initiatorname.iscsi should contain an initiator name for your iSCSI client host. You need to add this initiator name to your iSCSI array's configuration for this specific client host.
If you haven't started the iSCSI daemon yet, start it with the following command before discovering targets.

service iscsid start
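
To have the initiator services come back up at boot as well (assuming the stock RHEL6/SL6 init scripts), a quick sketch:

chkconfig iscsid on
chkconfig iscsi on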

Once the iscsid service is running and the client's initiator name is configured on the iSCSI array, you can discover the available targets with the following commands.

iscsiadm -m discovery -t sendtargets -p 192.168.132.1
iscsiadm -m discovery -t sendtargets -p 192.168.132.2
iscsiadm -m discovery -t sendtargets -p 192.168.133.1
iscsiadm -m discovery -t sendtargets -p 192.168.133.2

Then remove the discovered nodes that are not on the same VLAN (this is the part of the iSCSI utilities I don't like). There are other ways to get rid of them, but I found this one works well:

rm -rf /var/lib/iscsi/nodes/iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9/192.168.130.1,3260,1
rm -rf /var/lib/iscsi/nodes/iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9/192.168.130.2,3260,2
rm -rf /var/lib/iscsi/nodes/iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9/192.168.131.1,3260,1
rm -rf /var/lib/iscsi/nodes/iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9/192.168.131.2,3260,2
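
Alternatively, the unreachable portals can be dropped with iscsiadm itself instead of removing the node directories by hand; a sketch using the same target and portals as above:

iscsiadm -m node -T iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9 -p 192.168.130.1 -o delete
iscsiadm -m node -T iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9 -p 192.168.130.2 -o delete
iscsiadm -m node -T iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9 -p 192.168.131.1 -o delete
iscsiadm -m node -T iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9 -p 192.168.131.2 -o delete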
In the same way, on the hosts on the other VLANs, run the same kind of discovery and cleanup against their own portals.
Then start iscsi and check whether the iSCSI target devices show up.
# /etc/init.d/iscsi start
# iscsiadm -m node
192.168.130.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
192.168.131.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
192.168.130.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
192.168.131.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9

You can also check session status
# iscsiadm -m session
tcp: [1] 192.168.130.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
tcp: [2] 192.168.131.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
tcp: [3] 192.168.130.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
tcp: [4] 192.168.131.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
Troubleshooting information with -P (print) flag and verbosity level 0-3:
iscsiadm -m session -P3
iSCSI Transport Class version 2.0-870
version 6.2.0-873.2.el6
Target: iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
Current Portal: 192.168.130.2:3260,2
Persistent Portal: 192.168.130.2:3260,2
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:7ee624947bae
Iface IPaddress: 192.168.130.201
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
*********
Timeouts:
*********
Recovery Timeout: 15
Target Reset Timeout: 30
...


Check multipath devices.
# multipath -ll
iraid (360080e50002dfdb60000699a50addb7c) dm-7 IBM,1746 FAStT
size=8.4T features='2 pg_init_retries 5' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=14 status=active
| |- 14:0:0:0 sdf 8:80  active ready running
| `- 11:0:0:0 sde 8:64  active ready running
`-+- policy='round-robin 0' prio=9 status=enabled
  |- 12:0:0:0 sdc 8:32  active ready running
  `- 13:0:0:0 sdd 8:48  active ready running
imirror (360080e50002dfdb60000699750addb4f) dm-8 IBM,1746 FAStT
size=300G features='2 pg_init_retries 5' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=14 status=active
| |- 14:0:0:1 sdj 8:144 active ready running
| `- 11:0:0:1 sdi 8:128 active ready running
`-+- policy='round-robin 0' prio=9 status=enabled
  |- 13:0:0:1 sdh 8:112 active ready running
  `- 12:0:0:1 sdg 8:96  active ready running
....

Make file system

mkfs.xfs -f -L iraid -d su=64k,sw=16 -b size=4096 -s size=4096 /dev/mapper/iraid

Use the _netdev mount option for iSCSI devices in fstab:

/dev/mapper/iraid      /iraid          xfs      _netdev         0 0 
/dev/mapper/imirror    /imirror        xfs      _netdev,noauto  0 0 
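
Once the iscsi and multipathd services are up, the filesystem can be mounted and checked; a quick sketch (xfs_info reports the stripe geometry the filesystem was created with):

mount /iraid
df -h /iraid
xfs_info /iraid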

Iozone test result on vm2 for iraid

        File size set to 201326592 KB 
        Record Size 32 KB 
        Command line used: iozone -s192g -i 0 -i 1 -r32 -j 16 -t 1 
        Output is in Kbytes/sec 
        Time Resolution = 0.000001 seconds. 
        Processor cache size set to 1024 Kbytes. 
        Processor cache line size set to 32 bytes. 
        File stride size set to 16 * record size. 
        Throughput test with 1 process 
        Each process writes a 201326592 Kbyte file in 32 Kbyte records 

        Children see throughput for  1 initial writers  =  250440.28 KB/sec 
        Parent sees throughput for  1 initial writers   =  238661.17 KB/sec 
        Min throughput per process                      =  250440.28 KB/sec 
        Max throughput per process                      =  250440.28 KB/sec 
        Avg throughput per process                      =  250440.28 KB/sec 
        Min xfer                                        = 201326592.00 KB 

        Children see throughput for  1 rewriters        =  250842.19 KB/sec 
        Parent sees throughput for  1 rewriters         =  239207.47 KB/sec 
        Min throughput per process                      =  250842.19 KB/sec 
        Max throughput per process                      =  250842.19 KB/sec 
        Avg throughput per process                      =  250842.19 KB/sec 
        Min xfer                                        = 201326592.00 KB 

        Children see throughput for  1 readers          =  253003.98 KB/sec 
        Parent sees throughput for  1 readers           =  253002.54 KB/sec 
        Min throughput per process                      =  253003.98 KB/sec 
        Max throughput per process                      =  253003.98 KB/sec 
        Avg throughput per process                      =  253003.98 KB/sec 
        Min xfer                                        = 201326592.00 KB 

        Children see throughput for 1 re-readers        =  250448.03 KB/sec 
        Parent sees throughput for 1 re-readers         =  250445.81 KB/sec 
        Min throughput per process                      =  250448.03 KB/sec 
        Max throughput per process                      =  250448.03 KB/sec 
        Avg throughput per process                      =  250448.03 KB/sec 
        Min xfer                                        = 201326592.00 KB 


According to the results above, the NIC channels are pretty much saturated. For smaller file tests, reads can reach about 450MB/sec, benefiting from memory caching; this could be useful for virtual machines.

Iozone test result on vm2 for vm1

        Run began: Tue Dec  4 11:23:08 2012 

        File size set to 100663296 KB 
        Record Size 128 KB 
        Command line used: iozone -s96g -i 0 -i 1 -r128 -j 3 -t 1 
        Output is in Kbytes/sec 
        Time Resolution = 0.000001 seconds. 
        Processor cache size set to 1024 Kbytes. 
        Processor cache line size set to 32 bytes. 
        File stride size set to 3 * record size. 
        Throughput test with 1 process 
        Each process writes a 100663296 Kbyte file in 128 Kbyte records 

        Children see throughput for  1 initial writers  =  255580.42 KB/sec 
        Parent sees throughput for  1 initial writers   =  231270.81 KB/sec 
        Min throughput per process                      =  255580.42 KB/sec 
        Max throughput per process                      =  255580.42 KB/sec 
        Avg throughput per process                      =  255580.42 KB/sec 
        Min xfer                                        = 100663296.00 KB 

        Children see throughput for  1 rewriters        =  237321.41 KB/sec 
        Parent sees throughput for  1 rewriters         =  215502.98 KB/sec 
        Min throughput per process                      =  237321.41 KB/sec 
        Max throughput per process                      =  237321.41 KB/sec 
        Avg throughput per process                      =  237321.41 KB/sec 
        Min xfer                                        = 100663296.00 KB 

        Children see throughput for  1 readers          =  251665.06 KB/sec 
        Parent sees throughput for  1 readers           =  251659.90 KB/sec 
        Min throughput per process                      =  251665.06 KB/sec 
        Max throughput per process                      =  251665.06 KB/sec 
        Avg throughput per process                      =  251665.06 KB/sec 
        Min xfer                                        = 100663296.00 KB 

        Children see throughput for 1 re-readers        =  261702.75 KB/sec 
        Parent sees throughput for 1 re-readers         =  261702.26 KB/sec 
        Min throughput per process                      =  261702.75 KB/sec 
        Max throughput per process                      =  261702.75 KB/sec 
        Avg throughput per process                      =  261702.75 KB/sec 
        Min xfer                                        = 100663296.00 KB 





