Yes, having node 1 pool0 and node 2 pool1 disks in one stack (and thus in one shelf) is supported. It's not recommended per TR-3548 (search for "mixed disk pool configurations"), but it is supported. You might need to manually assign any replaced disks though, since the controllers can't always figure out by themselves which controller should get a new disk (especially if you have >1 spare per node). -Michael
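Something along these lines is usually all that's needed (7-Mode syntax; the disk names, owner and pool numbers are just examples):
NetApp> disk show -n
NetApp> disk assign 0a.23 -p 1
NetApp> disk assign 0a.24 -o node2 -p 0
The first command lists unowned disks, the second assigns a disk to the local node and pool 1, the third assigns one explicitly to the partner node and pool 0.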
So what real-world applications do you have that do not work with the current implementation as-is? Changing something like this would require extensive compatibility testing with operating systems and network stacks going back as far as LAN Manager 3.0 for DOS, so unless you have a very compelling use case (and are a very big customer, I guess) I doubt that anything will be changed there. ESPECIALLY if the actual use case turns out to be "I just want the result of this API call to be what Windows version X returns" instead of, say, "my application, which is a multi-million-dollar SAP or CATIA installation, doesn't work with the current implementation" 😉
If you have any volumes with UNIX security style, then usermapping needs to be configured. You also need to do "vserver cifs create" (do not confuse it with "vserver active-directory create", which is something different!) to create a machine account in AD (it's not enough to just manually add a machine account to your AD domain). You can check secd.log (you can get it via http://<netapp node IP>/spi ) for any errors regarding usermapping and/or security. Of course, if you have users in LDAP/NIS that you want to map to (instead of, say, just mapping all Windows users to one specific UNIX user), then you need to set up LDAP/NIS as a name service. But honestly, your partner (the one who sold you the NetApp) should be able to help you with that. Also, it's often not a good idea to use a single SVM for file and block storage at the same time; it's better to separate these into multiple SVMs.
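A rough sketch of the pieces involved (clustered ONTAP syntax; the SVM name, CIFS server name, domain and user names are all placeholders):
::> vserver cifs create -vserver svm_cifs -cifs-server FILER01 -domain example.com
::> vserver name-mapping create -vserver svm_cifs -direction win-unix -position 1 -pattern EXAMPLE\\(.+) -replacement "\1"
::> vserver cifs options modify -vserver svm_cifs -default-unix-user pcuser
The first command creates the machine account in AD, the second maps EXAMPLE\user to a UNIX user of the same name (which then has to exist in LDAP/NIS or the local unix-users), and the third is the "map everyone to one specific UNIX user" fallback mentioned above.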
If your data is in any way important to you then I would strongly advise against using such large RAID groups, especially with large SATA disks. You only have two disks of redundancy and the reconstruct of a single 4TB disk may take many days (depending on filer load; I have seen 3TB disks with reconstruct times of 4 or 5 days under heavy load), so when another disk fails in that timeframe you're running completely without protection and every single-block error will lead to corrupt data. Since the difference in capacity is only about 10%, I would suggest you use two RAID-DP groups of smaller size (e.g. 11 and 12 disks, with one hot spare). If you still want to do it, make sure you understand the implications. Type the following to enable the override:
NetApp> options raid.raid_dp.raidsize.override on
Regards -Michael
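A sketch of the two-group alternative (7-Mode syntax; the aggregate name and disk count are illustrative):
NetApp> aggr create aggr_sata -t raid_dp -r 12 23
With raidsize set to 12 and 23 disks given, OnTap builds one RAID group of 12 and one of 11 disks, and the remaining disk stays as hot spare.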
If it's only file data that you're interested in, then yes, a simple ufsrestore from Solaris should still work (it definitely did work with NDMP dumps from OnTap 7.3.x). BUT you will most certainly lose all non-UNIX permissions (NFSv4 ACLs, CIFS ACLs, etc.) and, for LUNs, you will lose all metadata and only get the (raw) LUN file. Getting data back from a LUN will be a bit difficult but should be possible if you're not afraid of using hex editors and loop mounts under Linux 😉 But if you were using SnapMirror to Tape (SM2T) to dump whole volumes, then a NetApp filer will be your only hope of ever recovering data from it (if you don't want to pay Kroll OnTrack for a recovery 😉). However, you can restore these volumes onto any model of NetApp filer that runs at least the same OnTap version as the filer from which the dump was taken, and which supports at least the same sizes for volumes etc. (so no restoring a 20TB volume onto an old FAS2020, but that should be obvious... )
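For the LUN case, this is roughly what the loop-mount approach could look like on a Linux box (the file name and partition layout are purely illustrative):
# fdisk -l /restore/mylun.raw
# losetup --find --show --partscan /restore/mylun.raw
# mount -o ro /dev/loop0p1 /mnt/lun_recovery
fdisk shows whether there is a recognizable partition table inside the restored LUN file, losetup maps the image (and its partitions) to loop devices, and the mount is read-only so you can't make things worse.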
Yes, that link seems to be broken or internal (even NetApp partners cannot access it). To answer your questions: there is really only one case where aggregate snapshots can help you, and that is if you need to restore a COMPLETE aggregate (with ALL volumes in it) to a previous version, for example if you deleted a volume by accident. There is no way to restore single volumes from an aggregate snapshot. It might also help with recovery if you should ever get a corrupt WAFL (highly unlikely). The default for NetApp systems has been for quite some time now to disable aggregate snapshots completely. The only reason you might still need them is if you're running MetroCluster (or local SyncMirror), because then the snapshots are needed for mirror reconstruction. But that's something that happens behind the scenes and doesn't help you as an admin with anything. So yes, feel free to disable the aggregate snapshot reserve (if you're not on a MetroCluster).
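In 7-Mode that would look roughly like this (the aggregate name is a placeholder):
NetApp> snap sched -A aggr_data 0 0 0
NetApp> snap reserve -A aggr_data 0
NetApp> snap delete -A -a aggr_data
The first line stops new aggregate snapshots from being scheduled, the second sets the aggregate snapshot reserve to 0%, and the third removes any existing aggregate snapshots.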
In that case I would strongly suggest you upgrade to 7.3.7P3, as there are a lot of bugs that have been fixed since 7.3.5*, especially with regard to SMB2 support. You should ask your partner/reseller if you need help with upgrading; even though it's rather easy (download new software, install on filer, do takeover/giveback 2 times), there might be dependencies on other software (SnapManager or other 3rd party tools) that you need to check, and your partner should be able to help you assess these issues. You can check if SMB2 is enabled by entering
NetApp> options cifs.smb2.enable
cifs.smb2.enable on
If it's on, you can disable it with
NetApp> options cifs.smb2.enable off
and re-enable it with
NetApp> options cifs.smb2.enable on
The settings take effect immediately, and if you have Windows clients that were accessing the filer through SMB2 when you disabled it, these clients will need to reconnect with SMB1, which happens only after some time (depending on the Windows version and settings it can be up to 15 minutes) or after a client reboot (usually faster than just waiting). It should not affect existing sessions, but since Windows continuously tries to disconnect and reconnect, you should be prepared for these long timeouts. Auditing is a bit more involved; you should probably skim over the relevant documentation (File Protocols Access Guide) and over the basic Windows concepts before trying to implement it (misconfiguration can under some circumstances kill your performance or fill up your root volume with tons of logs).
The first event (ID 563) happens when a file is opened with FILE_DELETE_ON_CLOSE, which is usually used for temporary files. NetApp will automatically delete that file when the last open file handle to it has been closed. Note that you (or rather a program) can also use that flag to force deletion of a file that is currently in use by another program (it still needs the delete permission on the file itself of course, you cannot delete random files that way). See for example here or here. The second event was introduced with Windows Server 2003 (I think) and is thus not really a "non-standard" event. See here or here for a few details.
What OnTap version are you running? I remember there once was a strange bug with SMB2 that manifested itself in a similar way (files created in a subdirectory disappeared and were later found a few directories further up), but that was with newly created files, not with pre-existing files. Try disabling SMB2 on the filer to see if that helps. Note that this requires all CIFS clients that are currently connected via SMB2 to reconnect to the filer. You can also enable CIFS auditing on the volume to see who/what does the move operation, if it's indeed client-initiated. Also, you should upgrade to the most recent OnTap version (8.1.4P9 if you're currently at or before 8.1.x, or 8.2.3P5 if you're already on 8.2).
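If you want to try the auditing route, this is roughly what it looks like in 7-Mode (option names from memory, and you still need to set the audit SACLs on the files/folders from a Windows client; check the docs for the details):
NetApp> options cifs.audit.enable on
NetApp> options cifs.audit.file_access_events.enable on
NetApp> cifs audit save
The last command flushes the internal audit log to the .evt file so you can open it with the Windows Event Viewer.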
Hm. Looks like you have a shelf which assumed a soft FC ID. This is not good; it normally means that you have a duplicate shelf ID in the same stack. But in this case it's the internal disks, which is odd. I would suggest you open a support case for this. You might need to re-seat some IO modules or reboot the filer to clear this inconsistent state.
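If you want to take a look yourself before opening the case, something like this (7-Mode) should show how the filer currently sees its loops/shelves and their IDs:
NetApp> fcadmin device_map
NetApp> storage show shelf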
You should definitely let Riverbed do the optimization and disable compression on the filer. We have seen instances of SnapMirror updates being optimized by 85%(!) by using Riverbed between the sites. I doubt NetApp can do that much with compression alone, AND it's easier on the CPUs. As to multi- vs. single-path: this only makes a difference if the two paths are each speed-limited on their own, e.g. if you have 2 WAN providers. Then you can effectively add the capacities together. It makes no difference if you only have one path anyway (e.g. 2x5 Mbit/s multipath vs. 1x10 Mbit/s singlepath makes no difference at all).
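For reference, the SnapMirror side of that lives in /etc/snapmirror.conf on the destination; roughly like this in 7-Mode (hostnames, IPs, volume names and the schedule are placeholders):
wanpair = multi(10.0.1.10,10.0.2.10)(10.0.1.11,10.0.2.11)
wanpair:srcvol dstfiler:dstvol - 0 23 * *
Leaving the options field as "-" (instead of something like "compression=enable") keeps the native SnapMirror network compression off, so the Riverbed still sees compressible traffic; the multi() connection line only helps if the two IP pairs really are independent paths.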
You should remove the "cluster" license on the remaining node (since you won't be using it anymore if you transition to a single system) and reboot the filer. That way it will "forget" that it is an HA pair. You might even get a slight performance boost, since the NVRAM can then be used completely by that one node instead of having to be shared with the other node. Then, pull the second controller out of the chassis by a few centimeters so that it doesn't confuse the surviving node ("I am not a cluster but I keep seeing another controller installed?!?")
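Roughly like this in 7-Mode (the exact license command differs a bit between OnTap 7.x and 8.x, so check the manual for your release):
NetApp> cf disable
NetApp> license delete cluster
NetApp> reboot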
In cDOT it is possible to have volumes in different vServers that have the same name. This must somehow be mapped to the underlying D-Blade (nodeshell). So the first volume you create with a certain name (in any vServer) usually gets this name mapped 1:1 to the D-Blade, but as soon as you create another volume with the same name (in a different vServer) the system has to rename that second volume. But generally you really, really shouldn't look at the volume names on the D-Blade (and NEVER EVER change anything volume- or aggregate-related on the nodeshell!), as the cluster makes certain it does everything correctly. Eventually the possibility to see volumes on the nodeshell will go away anyway, since it is not really needed anymore.
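If you're curious, you can see the effect from the cluster shell without touching anything (vserver, volume and node names are placeholders):
::> volume show -volume vol1 -fields vserver,volume,aggregate
::> run -node node1 vol status
The first command shows the duplicate names across vServers, the second shows the names the nodeshell uses, which is where the renamed copy shows up.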
What the others said: please do an upgrade sooner rather than later. 8.0.2 is rather old and, since it's an unpatched release, contains a few annoying bugs... your NetApp partner should be able to tell you more about it and help you upgrade your system. The hardware in question isn't, by chance, a FAS3140? We had similar problems with a few 3140s once which were only fixed by upgrading everything to a more recent version (OnTap, BIOS, disk/shelf firmware, etc.)
do a "disk show -a" on both nodes, wait a few minutes, then do it again and compare the results. if they don't match, compare the output of "disk_list" on both nodes (you need to be in diag mode for this). This shows the lowest physical level of the disks. If they don't show up there then it's most probably a hardware issue of some sort. If the disks are there then it's simply a mismatch of what the node(s) think the disks are. This can probably be resolved via "disk assign", "disk remove_ownership", etc.
This is not related to reallocation at all. It's simply the deswizzler, which has to run through the full volume and update the PVBN references; that involves a lot of metadata reads, which can result in severe cache thrashing. PAM cards on the SnapMirror destination help A LOT with such a workload. An alternative would be to disable the deswizzler, but then access to the secondary data will potentially be slow(er) (because every read of a block has to go through the VVBN->PVBN mapping, which is one metadata file). That isn't a big deal usually, but in a DR or migration scenario, when the destination becomes active at some point, you probably don't want to have that extra level of indirection. You can later re-start the deswizzler manually, but then again it will take a loooong time to complete.
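To confirm that it really is the deswizzler that's running, you can look at the active WAFL scanners (7-Mode, advanced privilege; the output format varies a bit between releases):
NetApp> priv set advanced
NetApp*> wafl scan status
NetApp*> priv set admin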
Depends on the context. It could mean a disk with a foreign aggregate on it. Or a LUN which can be (or is being) imported via FLI on an 8.3 system. Or, generally, any LUN visible to the system through an FC initiator that it doesn't know what to do with.
Yes, but they are not actively used. And if one fails or is removed, the system should automatically elect another one as replacement (otherwise every failed mailbox disk would crash the filer in the same way), so if you give the system enough time between the "remove ownership" commands it should work without causing a panic.
Does the filer even boot? When you connect to the SP and type "system console", do you get into the bootloader? If so, try "autoboot" and post the results here. If you get no response at all (i.e. only the "press ctrl-d to return to SP" message or whatever it says) and it doesn't react to Ctrl-C, then it's probably busted. But if you get a bootloader prompt or even some boot messages, that could help a lot in diagnosing. You could also try the serial console instead of the SP, in case it's just the SP that is busted. Try connecting via serial and powering up the system; you should see boot messages from the BIOS and then get a bootloader prompt.
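Roughly what a healthy session should look like (prompts and wording vary a bit by platform and SP firmware):
SP> system console
LOADER> autoboot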
You probably need to remove it from WINS manually (if you have WINS configured), which is a bit of a PITA since you have to delete the domain's entry on all WINS servers at the same time and wait for the DCs to repopulate it with the correct list of DCs. On the AD side, your AD admin should have used dcpromo.exe (or the new, fancy GUI variant thereof) to remove the DC from the AD.
It will only work when the domain is right, because the NetApp is joined to exactly one domain and can only validate credentials from that domain. So if you joined DOM1, then the user DOM1\foo would be mapped to the UNIX user foo. If a different user with the same name from a different domain tries to connect, say DOM2\foo, he would get an "access denied", since the filer has no means of checking his credentials (the filer knows nothing about domain DOM2, and even if it did, it could not check the user's credentials because it is not joined to DOM2).
Under what user account are you running the batch file? If it runs as LOCALSYSTEM it has different security credentials and probably no access rights to the filer. Does the MKDIR work if you try it (as a regular logged-in user) from a command prompt? Does CIFS work at all on the filer, or do you always get that error? It could be that the clock is out of sync on the NetApp (Kerberos allows for a maximum of 300 seconds of clock skew). It might also help to reset the password on the filer's domain account using "cifs changefilerpwd", or to reconnect the DCs via "cifs resetdc". If all else fails you can still do "cifs terminate; cifs setup" to re-add the filer to the AD.
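If it does turn out to be clock skew, a quick check/fix in 7-Mode looks roughly like this (the NTP server address is a placeholder):
NetApp> date
NetApp> options timed.servers 10.0.0.10
NetApp> options timed.enable on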
Can't you just "cf disable" (or rather "storage failover modify...") before removing disk ownership from those 2 mailbox disks, and re-enable it again afterwards?
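Something like this, in cDOT syntax (node and disk names are placeholders):
::> storage failover modify -node node1 -enabled false
::> storage disk removeowner -disk 1.0.12
::> storage failover modify -node node1 -enabled true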
I suggest you read TR-4178, "Infinite Volume Deployment and Implementation Guide", which describes what the mirrors are for and how the containing aggregates are selected: http://www.netapp.com/us/media/tr-4178.pdf