While building a Linux box as a hard drive farm, I wanted the fastest transport possible between my machines over the existing Infiniband fabric (10Gbps), and all research led to methods using RDMA. I had to rule out NFS/RDMA, as Windows 7 Home Premium ships with no NFS client. I looked at iSCSI, and there is an implementation that uses RDMA, called SRP. Having tried and initially failed, I gave up for a few months and was happy enough with a Samba link over IPoIB, which gave me about 130MB/sec reads. I would still have loved to get SRP working, as it should theoretically be a lot faster.
Then someone dropped me a comment on one of my Infiniband blog articles about how they'd got SRP working between a Windows 7 machine and a SLES target. Since this confirmed that Windows 7 was capable of SRP, I had another go. I had assumed the problem was a limitation of the OFED drivers on Windows 7, but it turns out I was incorrect; it was misconfiguration on the Ubuntu side. This article takes you through that setup, after which we'll have an SRP target on the Ubuntu box, accessed from an initiator on a Windows 7 box.
First, here are some acronyms:
SCSI – we should know what this one is (Small Computer Systems Interface)
iSCSI – the SCSI protocol over an IP network (Internet SCSI).
SCST – SCSI Target subsystem for linux with target drivers for iSCSI, Fibre Channel, SRP, SAS, FCoE, etc. We’re most interested in the SRP target.
RDMA – Remote Direct Memory Access – a fast way of copying chunks of memory from one machine to another.
SRP – SCSI RDMA Protocol – wraps it all up by carrying the SCSI protocol over RDMA.
This article assumes that you've already got Infiniband working. See the link to the "Enabling Infiniband on Ubuntu 10.10" article below, which guides you through that.
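Before going any further, it's worth sanity-checking that the fabric is actually up. These two tools come from the standard diagnostics packages (infiniband-diags and ibverbs-utils on Ubuntu, I believe); the port state should read Active:

ibstat
ibv_devinfo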
I found that the easiest way to get SCST with SRP support was to patch the Ubuntu kernel source and run a custom kernel. I haven't tried the latest Ubuntu releases; hopefully the target code is included in those and a kernel rebuild is no longer necessary. For the moment, I patched my kernel and booted it.
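If you need to do the same, the rebuild goes roughly along these lines. Treat this as a sketch: the SCST patch filename below is a placeholder for whichever patch matches your SCST and kernel versions (they live in the kernel/ directory of the SCST source tree):

apt-get source linux-image-$(uname -r)      # needs deb-src lines enabled in sources.list
cd linux-2.6.35/                            # directory name depends on your kernel version
patch -p1 < ../scst/kernel/scst_exec_req_fifo-2.6.35.patch   # placeholder patch name
make oldconfig
make -j4
sudo make modules_install install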
The patch broke my kernel build; to get past the compile error I had to add

#define REQ_WRITE 0x4

Your kernel might be OK, but that fix allowed my compile to complete.
Once the kernel was up and running, there were various packages to be installed. I didn't keep track of these, apologies. I do intend to do a complete rebuild soon, and I'll update this article with the complete list from start to finish, from the basic Infiniband setup all the way to configuring SRP targets, so you can follow it on a clean install. Ubuntu 11.10 might even have the SCST code built into the kernel, eliminating the need for a custom kernel.
The srp_daemon was complaining on startup about missing /dev/class/infiniband/uverbs* files, so again the udev rules had to be edited to create these. The libmthca1 package also needed to be installed; it provides the user-space driver that the ibverbs library uses to talk to the Mellanox HCAs.
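A rule along the lines of the stock OFED one should create those nodes; the rules file name (e.g. /etc/udev/rules.d/40-ib.rules) and the permissive mode are your call:

KERNEL=="uverbs*", NAME="infiniband/%k", MODE="0666"
KERNEL=="umad*", NAME="infiniband/%k", MODE="0666"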
Now, on to the SCST (SCSI target subsystem) setup. We should have the ib_srpt driver installed and available as a target to scstadmin. We can verify this by using "scstadmin -list_target" to show that the ib_srpt target is available.
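If nothing shows up, the obvious first checks are that the module is actually loaded and that scstadmin can see the target driver:

lsmod | grep ib_srpt
scstadmin -list_target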
Here's the full sequence of commands; each one is explained step by step below.

scstadmin -clear_config -force
scstadmin -open_dev DISK01 -handler vdisk_blockio -attributes filename=/dev/sdg1
scstadmin -set_dev_attr DISK01 -attributes t10_dev_id=0x2345
scstadmin -add_group HOST01 -driver ib_srpt -target ib_srpt_target_0
scstadmin -add_lun 0 -driver ib_srpt -target ib_srpt_target_0 -group HOST01 -device DISK01 -attributes read_only=0
scstadmin -add_init 0x0002c9020021f9fc0002c902002200bc -driver ib_srpt -target ib_srpt_target_0 -group HOST01
scstadmin -enable_target ib_srpt_target_0 -driver ib_srpt
scstadmin -write_config /etc/scst.conf
Clear the current config (only do this if you don't have existing configuration you want to keep; I do it because I'm starting with a clean slate).
scstadmin -clear_config -force
Create DISK01, assigning it to a block device (/dev/sdg1, /dev/md0p1, etc.). I'm using a plain disk partition in this example.
scstadmin -open_dev DISK01 -handler vdisk_blockio -attributes filename=/dev/sdg1
Now set the device attributes (t10_dev_id is an arbitrary unique ID that the device reports to initiators).
scstadmin -set_dev_attr DISK01 -attributes t10_dev_id=0x2345
Now add an initiator group.
scstadmin -add_group HOST01 -driver ib_srpt -target ib_srpt_target_0
Add a LUN to the group, assigning it to DISK01.
scstadmin -add_lun 0 -driver ib_srpt -target ib_srpt_target_0 -group HOST01 -device DISK01 -attributes read_only=0
Add an initiator to the group, allowing it to connect to our new target. I got the initiator ID by watching /var/log/messages while disabling and re-enabling the Infiniband SRP Miniport in Device Manager on the Win7 box. Each re-enable makes the miniport attempt to connect to the Ubuntu target, and the attempt shows up in the messages file along with the initiator ID.
scstadmin -add_init 0x0002c9020021f9fc0002c902002200bc -driver ib_srpt -target ib_srpt_target_0 -group HOST01
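To catch the ID yourself, leave something like this running on the Ubuntu box while you toggle the miniport (the grep pattern is just a suggestion for narrowing the output; the exact log text varies by SCST version):

tail -f /var/log/messages | grep -i srpt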
Finally, enable the target.
scstadmin -enable_target ib_srpt_target_0 -driver ib_srpt
And write the config.
scstadmin -write_config /etc/scst.conf
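For reference, the written-out /etc/scst.conf should look roughly like the following. I'm reconstructing the layout from the SCST 2.x config format, so treat it as illustrative rather than an exact copy of mine:

HANDLER vdisk_blockio {
        DEVICE DISK01 {
                filename /dev/sdg1
                t10_dev_id 0x2345
        }
}

TARGET_DRIVER ib_srpt {
        TARGET ib_srpt_target_0 {
                enabled 1

                GROUP HOST01 {
                        LUN 0 DISK01

                        INITIATOR 0x0002c9020021f9fc0002c902002200bc
                }
        }
}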
At this point a new drive appeared on the Win7 box, asking to be formatted. I formatted it with NTFS and 16K clusters, then ran some benchmarks: 250MB/sec reads and 50MB/sec writes. The 50MB/sec writes were down to the speed of the Linux software RAID; once I split the drives into a couple of two-drive RAID0 arrays, writes went up to 200MB/sec.
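If you want to rule the backing store in or out before blaming SRP, a quick local baseline on the target side helps; the device and file paths here are just examples:

sudo hdparm -t /dev/md0                                                      # raw sequential reads from the array
sudo dd if=/dev/zero of=/mnt/array/testfile bs=1M count=2048 oflag=direct    # rough sequential write speed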
So, the SRP issue is resolved; now I need a hardware RAID card with enough horsepower to push the drives at full speed. I'm looking at an 8-port RAID card with a PCIe 4x or 8x interface. I'll soon have a spare PCIe 8x slot in my motherboard once I move the graphics card to a 1x slot, which I've hacked so a larger card will fit. I don't need a fast graphics card anyway; I've even disabled the graphical desktop on the machine, since all I need is a text console, so a graphics card running at 1x speed should be fine.

It's also interesting to note that my motherboard has 20 PCI-express lanes in total. Sixteen of them feed the two PCIe 16x slots, but those sixteen are shared: the slots run at either 16x + 1x or 8x + 8x, depending on the SLI setting in the BIOS. That leaves 3 or 4 lanes for the PCIe 4x slot (which only has 2 lanes wired) and one or two for the onboard RAID controller. Not enough, really, so I want the new RAID controller in one of the PCIe 16x slots, where it can use 4 to 8 lanes. Who'd have known that the physical slot sizes don't necessarily have that many lanes actually assigned? It's one of the tricks manufacturers use to bring down the price of motherboards.
More to come soon….
Edit: See what happened when I got the RAID controller…
SCST - http://scst.sourceforge.net/
Rebuilding a Linux kernel - https://help.ubuntu.com/community/Kernel/Compile
Installing iSCSI-SCST - http://iscsi-scst.sourceforge.net/iscsi-scst-howto.txt