Greetings! I am new to iSCSI and I just installed a new FAS2050 configured with dual filers and all 15K SAS drives. It's configured for Active/Active and the disks are split evenly between the two filers. I created a 300 GB volume for VMware on the first filer (fas2050a).
The clients are ESX4 servers, 2 new HP DL385G6 machines and 2 HP DL380G6 machines.
The switches dedicated to iSCSI are 2 ProCurve 2910AL-24G with 10gbit link between them.
Here are the relevant lines from the switch config.
interface 1
   name "FAS2050a_e0a"
   flow-control
   exit
interface 2
   name "FAS2050a_e0b"
   flow-control
   exit
trunk 1-2 Trk3 LACP
spanning-tree Trk3 priority 4
The DL385s are configured with 8 gbit NIC ports. The DL380s only have 2 (will upgrade later).
The filer fas2050a has both NICs connected to one of the switches, configured as virtual interface vif0 for LACP. Within that virtual interface I created a virtual interface for VLAN 100, and the switch ports are trunked for LACP with tagged VLAN 100. All interfaces that will be used for iSCSI are set up for an MTU of 9000. vif0-2 is for our normal "server" VLAN segment and has iSCSI disabled.
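For reference, the filer side of this can be reproduced with something like the following Data ONTAP 7-mode commands (a sketch from memory; the interface names and addresses are the ones from this setup, but verify the exact syntax against your ONTAP release before running anything):

```
# Assumed ONTAP 7-mode syntax -- double-check against your release
vif create lacp vif0 -b ip e0a e0b      # LACP multimode vif, IP-based load balancing
vlan create vif0 100                    # creates tagged VLAN interface vif0-100
ifconfig vif0-100 10.0.100.2 netmask 255.255.255.0 mtusize 9000 partner vif0-100
```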
fas2050a> ifconfig -a
e0a: flags=80908043<BROADCAST,RUNNING,MULTICAST,TCPCKSUM,VLAN> mtu 9000
        ether 02:a0:98:12:b4:f4 (auto-1000t-fd-up) flowcontrol full
        trunked vif0
e0b: flags=80908043<BROADCAST,RUNNING,MULTICAST,TCPCKSUM,VLAN> mtu 9000
        ether 02:a0:98:12:b4:f4 (auto-1000t-fd-up) flowcontrol full
        trunked vif0
lo: flags=1948049<UP,LOOPBACK,RUNNING,MULTICAST,TCPCKSUM> mtu 8160
        inet 127.0.0.1 netmask 0xff000000 broadcast 127.0.0.1
        ether 00:00:00:00:00:00 (VIA Provider)
vif0: flags=80908043<BROADCAST,RUNNING,MULTICAST,TCPCKSUM,VLAN> mtu 9000
        ether 02:a0:98:12:b4:f4 (Enabled virtual interface)
vif0-2: flags=4948043<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM,NOWINS> mtu 1500
        inet 10.0.4.60 netmask 0xfffffc00 broadcast 10.0.7.255
        partner vif0-2 (not in use)
        ether 02:a0:98:12:b4:f4 (Enabled virtual interface)
vif0-100: flags=4948043<UP,BROADCAST,RUNNING,MULTICAST,TCPCKSUM,NOWINS> mtu 9000
        inet 10.0.100.2 netmask 0xffffff00 broadcast 10.0.100.255
        partner vif0-100 (not in use)
        ether 02:a0:98:12:b4:f4 (Enabled virtual interface)
Since the DL380s only have two NICs, their configuration is simple, so I will stick to that for now.
I have two virtual switches with a single NIC assigned to each: vSwitch0 is for the service console and VM Network, and vSwitch1 is for the iSCSI software HBA. I configured both vSwitch1 and vmnic1 for an MTU of 9000.
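On ESX 4 the jumbo frame settings have to be made from the service console. The steps above would look roughly like this (a sketch; the vSwitch, port group name, and IP are the ones from this setup, and note the vmkernel NIC has to be created with the MTU flag -- you can't change it afterwards without recreating it):

```
# Assumed ESX 4 service console commands (names/addresses from this setup)
esxcfg-vswitch -m 9000 vSwitch1
esxcfg-vmknic -a -i 10.0.100.30 -n 255.255.255.0 -m 9000 iSCSI_VMkernel0
```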
[root@ushat-esx03 ~]# esxcfg-vswitch -l
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         32          8           32                1500    vmnic0

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  VM Network            0        5           vmnic0
  Service Console       0        1           vmnic0

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch1         64          3           64                9000    vmnic1

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  iSCSI_VMkernel0       100      1           vmnic1
vmnic1 has been properly bound to the iSCSI software hba.
[root@ushat-esx03 ~]# esxcli swiscsi nic list -d vmhba33
vmk0
    pNic name: vmnic1
    ipv4 address: 10.0.100.30
    ipv4 net mask: 255.255.255.0
    ipv6 addresses:
    mac address: 00:22:64:c2:2d:9e
    mtu: 9000
    toe: false
    tso: true
    tcp checksum: false
    vlan: true
    link connected: true
    ethernet speed: 1000
    packets received: 283013
    packets sent: 146301
    NIC driver: bnx2
    driver version: 1.6.9
    firmware version: 1.9.6
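For anyone following along, the binding itself is the companion command to the listing above (assumed ESX 4.x syntax):

```
# Bind the vmkernel NIC to the software iSCSI HBA
esxcli swiscsi nic add -n vmk0 -d vmhba33
```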
The test is copying very large files such as ISOs or server images from machine to machine through the local vswitch, or "migrating" virtual machines from the ESX server's fast local storage to the FAS2050 and back.
Here is a sample sysstat output from the filer while copying a 5 GB image file from a file server on the local network to a virtual machine hard drive on the SAN. The numbers are often much lower when migrating a VM from local storage to iSCSI.
It seems the performance tops out around 60-70 MB/s, with rather high CPU usage on the filer. My understanding was that we should see closer to 120 MB/s when using gigabit and jumbo frames. Disabling jumbo frames has hardly any impact on performance.
The performance is a little better when testing on the DL385s, where I have 4 NICs dedicated to 4 separate VMkernels with round robin providing 4 active paths (following TR-3749 as a configuration guide)--about 80-90 MB/s.
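For completeness, the round robin policy from TR-3749 is set per device, roughly like this (a sketch with assumed ESX 4 syntax; the naa ID is a placeholder for your LUN's actual device ID):

```
# Set the path selection policy to round robin (naa ID is a placeholder)
esxcli nmp device setpolicy --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR
```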
Am I right to assume we should be seeing quite a bit more throughput from this configuration? I was hoping to see >120 MB/s, since I have 2 gigabit NICs in the FAS2050 filer in an LACP trunk, 4 NICs using round robin on the ESX servers, AND jumbo frames.
First off, you REALLY should open a support case on this to make sure it's tracked and also because they have configuration experts who can typically help better.
That being said, I do have some comments.
1. From a single host, you should not expect more than 1 Gb/s performance even with trunks/multi-mode vifs. It's just a matter of how this technology works. All data from a given host will be sent down the same path by the switch. This is how EtherChannel works: it's port aggregation for wide sharing across multiple hosts, not narrow sharing from a single host.
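To make point 1 concrete, here's a minimal sketch of hash-based link selection in an aggregate. The hash shown (XOR of the last IP octets, modulo the link count) is an assumption for illustration -- real switches and filers use vendor-specific hashes -- but the behavior is the same: a given source/destination pair always maps to the same single link, so one host never spreads across the trunk.

```shell
# Illustrative link-selection hash (assumed; real hashes are vendor-specific)
links=2
pick_link() {   # pick_link <src_last_octet> <dst_last_octet>
  echo $(( ($1 ^ $2) % links ))
}
pick_link 30 2   # host .30 -> filer .2: one link, every time
pick_link 30 2   # repeated flow -> same link; no per-packet spreading
pick_link 31 2   # a different host may land on the other link
```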
2. Are you sure you've configured vlan tagging on the switch? You can't just do it on the storage side and not the switch. I'm not an expert at reading switch configs (hence my suggestion for calling support), but I didn't see any reference to this.
3. Are you sure you've configured jumbo frames from end to end (storage, switch ports, and hosts)? If not, you will have performance issues.
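A quick way to verify jumbo frames end to end is a do-not-fragment ping at the largest payload that fits in a 9000-byte frame (8972 bytes, after the IP and ICMP headers). If any hop in the path is still at MTU 1500, this fails:

```
# From the ESX service console to the filer's iSCSI address
vmkping -d -s 8972 10.0.100.2
```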
Yes, vlan tagging is enabled in the switch for every port/trunk accessing the iSCSI vlan. Sorry I didn't include that line from the switch configuration.
I guess I assumed that with round robin using 4 NICs with 4 separate MAC addresses creating 4 active paths, all apparently carrying I/O at the same time (you can actually see the round robin working by watching the 4 switch ports' activity lights go 1-2-3-4, 1-2-3-4, pretty neat), I would actually get near 4 Gb/s (minus overhead, obviously). Perhaps I assumed wrong.
As for the jumbo frames, yes, it's enabled end to end. ESX vSwitch -> vmnic -> HP switches -> fas vif LACP, fas vif-100 (target virtual interface vlan).
I've enabled flow control all the way through as well.
Normally I wait to call support for anything until I'm sure I have a firm grasp on the problem. I hate calling up and feeling silly when I'm asked a simple question I can not answer because I failed to do my homework. I've read plenty of VMware and NetApp material, now I'm asking the group. Next is opening a ticket.
Hiya guys, I wondered if anyone got anywhere... I have a performance issue also with iSCSI... I'll tell you how I'm getting on...
Basically I have a DL380 G7 ESXi 4.1 host with a 4-port Broadcom iSCSI adapter onboard, and a regular Intel quad-port NIC in a PCI slot...
I have a dedicated vSwitch for iSCSI traffic, and three VMkernel ports, all linked to separate vmhbas (1 swiscsi vmhba and 2 Broadcom vmhbas). These all have individual IQNs, and the filer has all of these in an igroup and mapped a LUN to the group...
The PSP is set to RoundRobin for the RDM LUN, with all three vmhbas having active paths.
On the filer is a dual-port multimode vif (IP hash) going to a trunk group on a ProCurve switch. (Does LACP make that much performance difference?)
I have the RDM mapped to a 2008 64bit VM (the OS disk is on an NFS export)...
I'm using IOMeter for the first time (feel like a noob, but I'm not really used to generating load), but I ran all of the tests, and the CSV output says I'm getting 18-30 MB/s throughput...
is there something better for measuring throughput?
or am I looking for something that isn't going to happen?
You say you have a performance problem, but do you actually have a performance problem (i.e., an application runs slow)? Or did you just run IOMeter and think you have a performance problem?
A benchmark is only valid if it simulates the IO characteristics of your application(s). Without that, and without a baseline, a benchmark is pointless and useless.
People often concentrate on throughput, while for many apps IOPS and latency are much more important. Iometer uses lots of random IO, and for random IO you can't expect high throughput (30MB/s can be pretty decent).
We have VMware hosts running 20 guests at 25 Mbit/s (Mbit, not MBytes!) average on their storage NICs. That's because random IO at small block sizes translates into low throughput but high IOPS. And application performance is excellent!
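The relationship is simple arithmetic: throughput = IOPS x block size, so small random blocks cap MB/s long before the wire does. A quick sketch (illustrative numbers, not measurements from either setup here):

```shell
# throughput (MB/s) = IOPS * block size
# At 4 KB random IO, even a healthy IOPS number is modest throughput:
iops=7680
block_kb=4
echo "$(( iops * block_kb / 1024 )) MB/s"   # 7680 IOPS * 4 KB = 30 MB/s
```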
Well, sort of both. Basically I have a figure that I know a physical server is pushing on its local disk, and I am trying to get the iSCSI links in the new VM environment to that level (essentially 153 MB/s)... and I am trying to disprove the claim that iSCSI won't get to that level...