Yesterday the AC failed, the HP EVA went offline, and one of the controllers went offline with it.
When I tried to start the controller, I got these errors:
Aug 18 18:50:08 [localhost:raid.cksum.wc.blkErr:EMERGENCY]: Checksum error due to wafl context mismatch on volume vm_win09_eva, Disk /aggr_fc/plex0/rg0/SWFC-A1:3.126L11 Shelf - Bay - [HP HSV300 0953] S/N [600508B40008ED290000D00002E0000
PANIC : raidtop_map: aggr aggr_fc (max vbn 1112666240): vbn 262000935028, no matching range
version: 8.1.4P1: Tue Feb 11 23:23:31 PST 2014
conf : x86_64
cpuid = 0
PANIC: raidtop_map: aggr aggr_fc (max vbn 1112666240): vbn 262000935028, no matching range in SK process wafl_cppri on release 8.1.4P1 on Mon Aug 18 18:50:08 GMT 2014
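As a quick sanity check on the numbers in the panic line, here is a minimal Python sketch that pulls out the aggregate name, the max vbn, and the requested vbn and compares them. The helper name `parse_panic` and the regular expression are made up for illustration and are not part of any NetApp tool. The reading that a vbn should fall at or below the aggregate's max vbn is an assumption based only on the wording of the message.

```python
import re

# Panic line copied verbatim from the console output above.
PANIC = ("PANIC: raidtop_map: aggr aggr_fc (max vbn 1112666240): "
         "vbn 262000935028, no matching range")

# Hypothetical pattern for this one message format (not a NetApp-defined format).
PATTERN = re.compile(
    r"aggr (?P<aggr>\S+) \(max vbn (?P<max_vbn>\d+)\): vbn (?P<vbn>\d+)"
)


def parse_panic(line):
    """Return (aggregate name, max vbn, requested vbn), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match["aggr"], int(match["max_vbn"]), int(match["vbn"])


aggr, max_vbn, vbn = parse_panic(PANIC)

# Assumption: a valid vbn should not exceed the aggregate's max vbn.
out_of_range = vbn > max_vbn

print(f"aggregate:     {aggr}")
print(f"max vbn:       {max_vbn}")
print(f"requested vbn: {vbn}")
print(f"out of range:  {out_of_range}")
```

If that reading is right, the requested block number is far beyond the end of the aggregate, which would be consistent with a corrupted block pointer rather than a missing disk.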
I could only start the controller in maintenance mode, where I offlined the aggregate so that the controller would boot.
Each time I online that aggregate, the controller panics and halts.
NetApp support says that the problem is on the EVA side, but in Command View I can only see green indicators and status OK.
So, can I run a wafl_check or wafliron on the aggregate?
I can't find any information about this message, so I'm a little lost.
Did NetApp offer anything else besides "the problem is on the EVA side"? I can buy that - maybe data got corrupted when the array lost power - but NetApp should still be able to offer some recovery suggestions - like telling you whether or not you can run wafl_check. With that error, they should be able to give you a pretty good idea of what, exactly, the problem on the EVA side is.
Good to hear, Jose. Why migrate the data and recreate it on new vdisks? It sounds like the problem was corruption caused by the power drop, not a problem with the EVA or the current vdisks, right? The wafliron should have sorted everything out.