We recently had many issues with one of our filers, so our client requested that we move the majority of the servers off that storage for now.
We started moving one of the bigger servers on Friday night at 5pm, 1.1TB in total, to the second SAN.
Both SANs use SnapMirror to keep data in sync with each other.
By 6:11pm our SAN was alerting "HA Group Notification from QEN-SAN01-A (TARGET LUN NOSPC) ERROR", and checking our hosts showed that the CSV we were copying to (hosted on SAN02-A) was offline.
Checking the filers themselves, the filer we were copying from (SAN01-A) was complaining about its vol space being full, but its CSV was online and running.
Expanding the vol space for this filer resolved the issues and the units all settled down, but now we are very hesitant to resync the SnapMirror, which is behind (SAN01-A to SAN02-A), and we are not willing to migrate any data again.
We do not have snapshots enabled on the LUN/Volume, so there is little spare space. We were also seeing alerts on SAN01-A that specific hosts (not all) were giving the following error on a regular basis:
ISCSI: Initiator (iqn.1991-05.com.microsoft:HOSTNAME) sent LUN Reset request, aborting all SCSI commands on lun 0
These errors appeared only on SAN01-A.
I am very new to NetApp FAS units, and sadly these units are second hand and we have no NetApp support (trust me, I've been pushing for it) so I just need some direction to understand:
A) How data migration from one SAN LUN to another SAN LUN would cause the volume to max out its space. Is this due to SnapMirror or something else?
B) How the vol on SAN01-A caused SAN02-A to drop (we use iSCSI; could a timeout trying to connect to one filer cause issues on another?)
To replicate storage, a volume snapshot is taken and then replicated. This snapshot exists until it is replaced by the next one for replication. During that time, any blocks changed in the live LUN still need to exist in the snapshot too. So if it's a 20GB LUN, and it deletes 1GB and writes another 1GB, it will need at least 22GB of storage. There are options to autosize the volume that contains the LUN which are probably worth turning on.
If the volume fills up and autosize is off, the LUNs go offline.
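To make the arithmetic above concrete, here is a rough Python sketch. It is only a model of the reasoning in this answer, not anything the filer actually runs: the function names and the `max_autosize_gb` parameter are made up for illustration, and it follows the example's accounting, where both the deleted 1GB and the newly written 1GB count as changed blocks held by the replication snapshot.

```python
def required_volume_gb(lun_gb, changed_gb):
    """Estimate the volume space needed while a replication snapshot exists.

    The live LUN needs its full size, and the snapshot keeps the old copy
    of every block that changed since it was taken.
    """
    return lun_gb + changed_gb


def lun_stays_online(volume_gb, lun_gb, changed_gb,
                     autosize=False, max_autosize_gb=0):
    """Return True if the volume can hold the LUN plus snapshot blocks.

    Hypothetical model: with autosize off, running out of space means the
    LUN goes offline; with autosize on, the volume may grow up to
    max_autosize_gb (an assumed limit chosen by the admin).
    """
    needed = required_volume_gb(lun_gb, changed_gb)
    if needed <= volume_gb:
        return True
    return autosize and needed <= max_autosize_gb


# The example from the answer: 20GB LUN, 1GB deleted plus 1GB written.
print(required_volume_gb(20, 1 + 1))  # 22
```

By the same logic, a large copy that changes a lot of blocks between replication snapshots could plausibly push a tightly sized volume over its limit, which would fit the NOSPC alert, though that is a guess without seeing the actual volume and snapshot usage on the filer.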