<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Checksum error, bad data, WAFL inconsistent in ONTAP Discussions</title>
    <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32863#M7674</link>
    <description>Checksum error, bad data, WAFL inconsistent</description>
    <pubDate>Thu, 16 May 2013 00:15:48 GMT</pubDate>
    <dc:creator>KemDatacenter</dc:creator>
    <dc:date>2013-05-16T00:15:48Z</dc:date>
    <item>
      <title>Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32863#M7674</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Today I noticed some rather worrying messages on one of our filers, saying that there are four bad blocks on one of the volumes, that WAFL is inconsistent, and that a scrub is starting. Interestingly, I haven't received any messages from Unified Manager, nor do I see any errors on the volume or aggregate in question.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;What concerns me is the absence of any message saying that WAFL has recovered the data from parity. So my questions are:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Is the volume now corrupted or not?&lt;/LI&gt;&lt;LI&gt;Why hasn't the filer marked the disk drive as failed and started rebuilding to a spare drive?&lt;/LI&gt;&lt;LI&gt;What do I need to do to recover from the issue?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;filer_name&amp;gt; Thu May 16 08:54:05 EST [filer_name: raid.cksum.wc.blkErr:EMERGENCY]: Checksum error due to wafl context mismatch on volume volume_name, Disk /aggr0/plex0/rg0/1a.71 Shelf 4 Bay 7 [NETAPP&amp;nbsp;&amp;nbsp; X291_S15K7420F15 NA00] S/N [3SK1Z4PQ00009123NQHF], block 31885141, buftree id 0, inode number 101, snapid 106, file block 45778970, level 0: checksum context has buftree id 137615, file block 76494408.&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:05 EST [filer_name: raid.cksum.wc.blkErr:EMERGENCY]: Checksum error due to wafl context mismatch on volume volume_name, Disk /aggr0/plex0/rg0/1a.71 Shelf 4 Bay 7 [NETAPP&amp;nbsp;&amp;nbsp; X291_S15K7420F15 NA00] S/N [3SK1Z4PQ00009123NQHF], block 31885144, buftree id 0, inode number 101, snapid 106, file block 45778973, level 0: checksum context has buftree id 8351367, file block 213319903.&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;...&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:05 EST [filer_name: raid.data.lw.blkErr:CRITICAL]: Bad data detected on Disk /aggr0/plex0/rg0/1a.71 Shelf 4 Bay 7 [NETAPP&amp;nbsp;&amp;nbsp; X291_S15K7420F15 NA00] S/N [3SK1Z4PQ00009123NQHF], block #31885141&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;...&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:06 EST [filer_name: raid.multierr.bad.block:CRITICAL]: Marking on 'Disk /aggr0/plex0/rg0/1a.71 Shelf 4 Bay 7 [NETAPP&amp;nbsp;&amp;nbsp; X291_S15K7420F15 NA00] S/N [3SK1Z4PQ00009123NQHF]', block number 31885141, as bad block.&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;...&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:06 EST [filer_name: wafl.incons.userdata.vol:error]: WAFL inconsistent: volume volume_name has a corrupted user data block. Note: Any new Snapshot copies might contain this inconsistency.&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:06 EST [filer_name: wafl.raid.incons.userdata:error]: WAFL inconsistent: bad user data block 1208910165 (vvbn:76495103 fbn:45778970 level:0) in inode (fileid:101 snapid:106 file_type:15 disk_flags:0x8402) in volume volume_name.&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;...&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:06 EST [filer_name: coredump.micro.completed:info]: Microcore (/etc/crash/micro-core.151702107.2013-05-15.22_54_06) generation completed&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:15 EST [filer_name: raid.rg.scrub.start:notice]: /aggr0/plex0/rg1: starting scrub&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;...&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:41 EST [filer_name: asup.smtp.sent:notice]: Cluster Notification mail sent: Cluster Notification from filer_name (WAFL INCONSISTENT) ERROR&lt;/P&gt;&lt;P style="padding-left: 30px;"&gt;Thu May 16 08:54:43 EST [filer_name: asup.smtp.sent.minicore:notice]: Core file 'micro-core.151702107.2013-05-15.22_54_06' sent to NetApp&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 16 May 2013 00:15:48 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32863#M7674</guid>
      <dc:creator>KemDatacenter</dc:creator>
      <dc:date>2013-05-16T00:15:48Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32868#M7675</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;1. It is possible that the corruption is confined to some snapshots and the active file system is OK. Only support can tell.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2. The error is a software one, not a hardware one. There is no reason to mark the disk as bad. It may have been caused by hardware; again, support could probably analyze it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;3. The first thing you need to do is open a support case.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 16 May 2013 08:35:32 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32868#M7675</guid>
      <dc:creator>aborzenkov</dc:creator>
      <dc:date>2013-05-16T08:35:32Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32873#M7676</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;How can a bad block be a software issue? Do you mean a possible Data ONTAP bug?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 03 Jun 2013 02:43:56 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32873#M7676</guid>
      <dc:creator>KemDatacenter</dc:creator>
      <dc:date>2013-06-03T02:43:56Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32878#M7677</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Have you resolved this? We are dealing with this right now.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 01 Jul 2013 18:10:02 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32878#M7677</guid>
      <dc:creator>cliffwilliams44</dc:creator>
      <dc:date>2013-07-01T18:10:02Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32884#M7678</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Contact support, as you may need to run wafliron, and you should only do so after consulting NetApp.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 01 Jul 2013 23:01:02 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32884#M7678</guid>
      <dc:creator>gavin_meadows</dc:creator>
      <dc:date>2013-07-01T23:01:02Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32887#M7679</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;We have corrected the problem, then failed the drive and replaced it. All is good now.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 01 Jul 2013 23:05:21 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32887#M7679</guid>
      <dc:creator>cliffwilliams44</dc:creator>
      <dc:date>2013-07-01T23:05:21Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32892#M7681</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;No, this is indeed a hardware problem. It has to do with the drive firmware not reporting media errors in time (note that the drives in this post are on firmware NA00, while NA03 is the latest).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;See &lt;A href="http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&amp;amp;Display=606576" target="_blank"&gt;http://support.netapp.com/NOW/cgi-bin/bol?Type=Detail&amp;amp;Display=606576&lt;/A&gt; for the bug.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Some TSBs sent to partners last year mentioned this problem (I don't remember the number right now).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;-Michae&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 04 Jul 2013 12:35:00 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/32892#M7681</guid>
      <dc:creator>Darkstar</dc:creator>
      <dc:date>2013-07-04T12:35:00Z</dc:date>
    </item>
    <item>
      <title>Re: Checksum error, bad data, WAFL inconsistent</title>
      <link>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/107783#M22309</link>
      <description>&lt;P&gt;This is an old post, but we were just informed of the following BURT which seems to match what you ran into. &amp;nbsp;No idea if you resolved the error and/or the problem went away, but here's the bug:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&amp;amp;Display=724468" target="_blank"&gt;http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&amp;amp;Display=724468&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cheers,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Chris&lt;/P&gt;</description>
      <pubDate>Tue, 28 Jul 2015 16:41:57 GMT</pubDate>
      <guid>https://community.netapp.com/t5/ONTAP-Discussions/Checksum-error-bad-data-WAFL-inconsistent/m-p/107783#M22309</guid>
      <dc:creator>colsen</dc:creator>
      <dc:date>2015-07-28T16:41:57Z</dc:date>
    </item>
  </channel>
</rss>

