2014-08-19 01:48 AM
I have a V3210 running Data ONTAP 8.1.4P1.
Yesterday the AC failed, the HP EVA went offline, and one of the controllers went offline as well.
When I tried to start the controller, I got these errors:
Aug 18 18:50:08 [localhost:raid.cksum.wc.blkErr:EMERGENCY]: Checksum error due to wafl context mismatch on volume vm_win09_eva, Disk /aggr_fc/plex0/rg0/SWFC-A1:3.126L11 Shelf - Bay - [HP HSV300 0953] S/N [600508B40008ED290000D00002E0000
PANIC : raidtop_map: aggr aggr_fc (max vbn 1112666240): vbn 262000935028, no matching range
version: 8.1.4P1: Tue Feb 11 23:23:31 PST 2014
conf : x86_64
cpuid = 0
PANIC: raidtop_map: aggr aggr_fc (max vbn 1112666240): vbn 262000935028, no matching range in SK process wafl_cppri on release 8.1.4P1 on Mon Aug 18 18:50:08 GMT 2014
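To make the numbers in the panic line easier to compare, here is a minimal Python sketch that pulls them out of the message. The regular expression and the function name `parse_raidtop_panic` are my own, written only against this one message, not an official log format. It shows that the requested vbn is far above the aggregate's max vbn, which is presumably why no matching range is found.

```python
import re

# Panic text copied from the console output above.
PANIC = ("PANIC: raidtop_map: aggr aggr_fc (max vbn 1112666240): "
         "vbn 262000935028, no matching range")

# Hypothetical pattern fitted to this single message (an assumption,
# not a documented Data ONTAP format).
PATTERN = re.compile(
    r"raidtop_map: aggr (?P<aggr>\S+) \(max vbn (?P<max_vbn>\d+)\): "
    r"vbn (?P<vbn>\d+), no matching range"
)

def parse_raidtop_panic(text):
    """Return the aggregate name and vbn values, or None if no match."""
    m = PATTERN.search(text)
    if m is None:
        return None
    max_vbn = int(m.group("max_vbn"))
    vbn = int(m.group("vbn"))
    return {
        "aggr": m.group("aggr"),
        "max_vbn": max_vbn,
        "vbn": vbn,
        # True when the requested block number lies beyond the aggregate.
        "out_of_range": vbn > max_vbn,
    }

info = parse_raidtop_panic(PANIC)
print(info)
```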
I could only start the controller in maintenance mode, where I offlined the aggregate so that the controller would boot.
Each time I online that aggregate, the controller panics and halts.
NetApp support says that the problem is on the EVA side, but in Command View I can only see green indicators and status OK.
So, can I run a wafl_check or wafliron on the aggregate?
I can't find any information about this message, so I'm a little lost.
Can somebody help me?
Best regards and thanks in advance
2014-08-19 03:42 PM
Did NetApp offer anything else besides "the problem is on the EVA side"? I can buy that - maybe data got corrupted when the array lost power - but NetApp should still be able to offer some recovery suggestions - like telling you whether or not you can run wafl_check. With that error, they should be able to give you a pretty good idea of what, exactly, the problem on the EVA side is.
2014-08-19 11:07 PM
Thanks for the answer
I finally booted into maintenance mode, offlined the aggregate that had the problem, and ran wafliron on it.
It took about 1 hour to reach 99% and about two more hours to reach 100%.
The process finished after deleting some "bad blocks", and I was then able to online the aggregate and start registering the VMs.
So the next step should be to migrate all the data to other aggregates and recreate this one with new vDisks from the EVA.
Thanks for everything.
2014-08-20 06:44 AM
Good to hear, Jose. Why migrate the data and recreate on new vdisks? Sounds like the problem was corruption due to the power drop, not a problem with the EVA or the current vdisks, right? The wafliron should have sorted everything out.