ONTAP Discussions

Flash Cache 1TB upgrade CDOT FAS 8020

MK
7,644 Views

We just dropped two 1TB Flash Cache cards into our cDOT 8.2.2 cluster this morning.  The cards show up fine on each controller and caching looks to be enabled, but we are seeing a 0% hit percentage.  What am I missing?  I know the cards need to warm up, but the filer has been running for almost 6 hours and it is still at 0%...

 

ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale

flexscale.enable             on         (same value in local+partner recommended)
flexscale.lopri_blocks       on         (same value in local+partner recommended)
flexscale.normal_data_blocks on         (same value in local+partner recommended)
flexscale.pcs_high_res       off        (same value in local+partner recommended)
flexscale.pcs_size           0GB        (same value in local+partner recommended)
flexscale.readahead_blocks   off        (same value in local+partner recommended)
flexscale.rewarm             on         (same value in local+partner recommended)

 

ntap-cl01::> node run -node ntap-cl01-node01 -command stats show -p flexscale-access

 Cache                                               Reads       Writes      Disk Reads
 Usage    Hit   Meta   Miss Hit Evict Inval Insert Chain Blocks Chain Blocks  Replaced
     %     /s     /s     /s   %    /s    /s     /s    /s     /s    /s     /s        /s
     0      0      0      0 100     0     0      2     0      0     0      0         0
     0      0      0      0  75     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      3     0      0     0     12         0
     0      0      0      0 100     0     0      3     0      0     0      0         0

 

ntap-cl01::> node run -node ntap-cl01-node01 -command sysstat -x 1

 CPU    NFS   CIFS   HTTP   Total     Net   kB/s    Disk   kB/s    Tape   kB/s  Cache  Cache    CP  CP  Disk   OTHER    FCP  iSCSI     FCP   kB/s   iSCSI   kB/s
                                       in    out    read  write    read  write    age    hit  time  ty  util                            in    out      in    out
 23%   2403      0      0    2426    6829 167776  169400     24       0      0    42s    95%    0%  -    81%      23      0      0       0      0       0      0
 19%   3226      0      0    3226   13340 161935  166164      0       0      0     0s    96%    0%  -    83%       0      0      0       0      0       0      0
 36%   2388      0      0    2388    8945 138074  157052  42580       0      0     1     98%   41%  Tf   84%       0      0      0       0      0       0      0
 21%   2046      0      0    2046    5482 131457  167128 121088       0      0     0s    94%  100%  :f   89%       0      0      0       0      0       0      0
 32%   3232      0      0    3668   10094 202315  212984 115824       0      0     0s    94%  100%  :f   82%     436      0      0       0      0       0      0
 27%   3157      0      0    3173    7040 216534  220044 116224       0      0    55s    96%  100%  :f   81%      16      0      0       0      0       0      0
 28%   3746      0      0    3752    5845 245167  252088 125328       0      0     0s    96%  100%  :f   82%       6      0      0       0      0       0      0
 26%   3020      0      0    3020    5884 200317  205844  61052       0      0    57s    96%   66%  :    85%       0      0      0       0      0       0      0
 31%   2980      0      0    3001   13137 183224  186476      0       0      0     0s    95%    0%  -    90%      21      0      0       0      0       0      0
 25%   2905      0      0    3480   14162 163218  163084     24       0      0     0s    96%    0%  -    84%     575      0      0       0      0       0      0
 33%   2123      0      0    2308   13284 124891  156168  78696       0      0    59s    97%   46%  Sf   95%     185      0      0       0      0       0      0
 25%   3029      0      0    3035    5532 200730  203496 113984       0      0     0s    93%  100%  :f   86%       6      0      0       0      0       0      0
 29%   3385      0      0    3386    5786 217468  234453 130938       0      0     1     97%   99%  Zs   83%       1      0      0       0      0       0      0
 19%   2645      0      0    2645    9259 129553  140667  69682       0      0     0s    97%   36%  :    92%       0      0      0       0      0       0      0

 

10 REPLIES

JSHACHER11
7,556 Views

 

 

 - Is that a Flash Cache 2 (PAM II)? Do you know the part number?

 - Why is 'lopri' on?

 - Is it showing in 'sysconfig -a'?

MK
7,549 Views

It does show in sysconfig.  I turned on lopri in an effort to get it to do at least something.  I am banging my head against the wall.  Am I just too impatient?  Does it need a day before it starts to "learn" what to cache?

 

 slot 2: Flash Cache 2 (1.0 TB)
                         State:     Enabled
                    Model Name:     X1974A-R6
                 Serial Number:     9436768886
                   Part Number:     111-00903
                Board Revision:     B0
                  FPGA Release:     3.3
                    FPGA Build:     201312121541
                        memory mapped I/O base 0x0000000044800000, size 0x80000
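
For what it's worth, once this is sorted I plan to set lopri back to off, since I believe off is the default and recommended setting for most workloads. Assuming the same nodeshell 'options' syntax shown above, that would be something like:

ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale.lopri_blocks off
ntap-cl01::> node run -node ntap-cl01-node02 -command options flexscale.lopri_blocks off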

JSHACHER11
7,540 Views

 

 

 - "Am I just to impatient?" - No. it should show inserts

 - so you put one in each node and none works?

 - is slot 1 available?

 - would you be willing to do a takeover/giveback from both sides?

MK
7,385 Views

There is a card in each node, and neither is working.  I just performed a takeover/giveback and it still doesn't seem to help.  Slot 1 is free.  Does it require the card to be in slot 1?

 

ntap-cl01::> node run -node ntap-cl01-node02 -command stats show -p flexscale-access
 Cache                                               Reads       Writes      Disk Reads
 Usage    Hit   Meta   Miss Hit Evict Inval Insert Chain Blocks Chain Blocks  Replaced
     %     /s     /s     /s   %    /s    /s     /s    /s     /s    /s     /s        /s
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0     61   0     0     0      0     0      0     0      0         0
     0      0      0    393   0     0     0      0     0      0     0      0         0
     0      0      0    428   0     0     0      0     0      0     0      0         0
     0      0      0    120   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0
     0      0      0      0   0     0     0      0     0      0     0      0         0

MK
7,496 Views

Takeover/giveback made no change; still 0% usage.  We added the same card to both nodes and both have the same issue.  Slot 1 is empty, but the Hardware Universe says slot 1 or 2 should be fine.

Uddhav
7,477 Views

MK,

 

Besides turning flexscale on, what other steps did you take?  I don't think your process is complete yet.  Can you give a bird's-eye view of the steps you performed?

You have to do something at the aggregate level; the IOPS should skyrocket once you add the cache at the aggregate level.  You can also use the System Manager tool for this.

I'm afraid that simply enabling flexscale won't do anything by itself; you have to bring the cache in at the aggregate level.
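
For example, the aggregate-level step I have in mind is the Flash Pool style of setup, where the aggregate is made hybrid and SSDs are added to it; roughly something like this (aggregate name is just an example from your output):

ntap-cl01::> storage aggregate modify -aggregate aggr_data_node01 -hybrid-enabled true

I am not 100% sure whether Flash Cache needs an equivalent step, but that is what I would check.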

MK
7,451 Views

I figured it out.  Our data aggregates are hybrid with SSD Flash Pools.  Apparently, unbeknownst to me, you cannot use Flash Cache on hybrid Flash Pool aggregates.  If I am wrong, please let me know.  Otherwise it looks like I am returning the Flash Cache cards and adding a few more SSDs to the Flash Pool.
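
In case anyone else hits this, the quickest check I found is whether the data aggregates are hybrid; something along these lines should show it (field name from memory, may differ slightly by release):

ntap-cl01::> storage aggregate show -fields hybrid-enabled

Both of ours come back true, which is why nothing was ever inserted into Flash Cache.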

 

http://www.netapp.com/us/system/pdf-reader.aspx?m=tr-3832.pdf&cc=us

 

5.4 Flash Pool

 

Flash Cache and Flash Pool can be used together on one node (controller), within an HA pair, and within a single clustered Data ONTAP cluster. Data from volumes that are provisioned in a Flash Pool aggregate or an all-SSD aggregate is automatically excluded from being cached in Flash Cache. Data from volumes provisioned in a Flash Pool aggregate uses Flash Pool cache, and data from volumes provisioned in an SSD aggregate would not benefit from using Flash Cache.

Although Flash Cache and Flash Pool can be configured on the same node or within the same HA pair, there are limits on the total cache that can be configured. The maximum cache sizes when using Flash Cache and Flash Pool vary by controller model and Data ONTAP release; detailed information can be found in the NetApp Hardware Universe, for customers who have NetApp Support site access.

JSHACHER11
7,446 Views

 

 

 - All of your aggregates are FPs?

 - FP is aggregate-based cache while Flash Cache is system-wide.

 - While both help with random reads, FP also helps with overwrites.

 - What was the need to provision Flash Cache given that you have FP?

 - Did you ever run PCS to predict the benefit?
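
(For reference, PCS is meant to be run before the cards go in, on a system without Flash Cache installed. If I remember the nodeshell setup correctly, it is roughly the following, with pcs_size set to the cache size you want to model; you then watch 'stats show -p flexscale-access' just like above:)

ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale.enable pcs
ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale.pcs_size 1024GB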

 

 

 

MK
7,427 Views

We ran wafl awa to determine that our FP was undersized.  We had the option of obtaining Flash Cache cards, and I thought, without researching enough, that the two combined would help.  We only have one large data aggregate per controller and they are both FP hybrids.  It sounds like I should return the Flash Cache cards and expand the FP SSDs.

 

ntap-cl01::> system node run -node * "priv set advanced;wafl awa print"

2 entries were acted on.

 

Node: ntap-cl01-node01

Warning: These advanced commands are potentially dangerous; use
         them only when directed to do so by NetApp
         personnel.

### FP AWA Stats ###

                     Host ntap-cl01-node01                    Memory 20166 MB
            ONTAP Version NetApp Release 8.2.2 Cluster-Mode: Fri Aug 22 01:46:52 PDT 2014
              AWA Version 1
           Layout Version 1
               CM Version 1

Basic Information

                Aggregate aggr_data_node01
             Current-time Wed Jan 14 09:08:03 EST 2015
               Start-time Tue Jan 13 13:51:56 EST 2015
      Total runtime (sec) 69368
    Interval length (sec) 600
          Total intervals 116
        In-core Intervals 1024

Summary of the past 116 intervals
                                   max
          Read Throughput      300.948 MB/s
         Write Throughput       28.250 MB/s
       Cacheable Read (%)           65 %
      Cacheable Write (%)           27 %
Max Projected Cache Size         1168 GiB
   Projected Read Offload           16 %
  Projected Write Offload           27 %

Summary Cache Hit Rate vs. Cache Size

       Size        20%        40%        60%        80%       100%
   Read Hit          9         10         12         13         16
  Write Hit         23         24         24         24         27

The entire results and output of Automated Workload Analyzer (AWA) are
estimates. The format, syntax, CLI, results and output of AWA may
change in future Data ONTAP releases. AWA reports the projected cache
size in capacity. It does not make recommendations regarding the
number of data SSDs required. Please follow the guidelines for
configuring and deploying Flash Pool; that are provided in tools and
collateral documents. These include verifying the platform cache size
maximums and minimum number and maximum number of data SSDs.

### FP AWA Stats End ###

 

 

Node: ntap-cl01-node02

Warning: These advanced commands are potentially dangerous; use
         them only when directed to do so by NetApp
         personnel.

### FP AWA Stats ###

                     Host ntap-cl01-node02                    Memory 20166 MB
            ONTAP Version NetApp Release 8.2.2 Cluster-Mode: Fri Aug 22 01:46:52 PDT 2014
              AWA Version 1
           Layout Version 1
               CM Version 1

Basic Information

                Aggregate aggr_data_node02
             Current-time Wed Jan 14 09:08:03 EST 2015
               Start-time Tue Jan 13 13:51:47 EST 2015
      Total runtime (sec) 69376
    Interval length (sec) 600
          Total intervals 116
        In-core Intervals 1024

Summary of the past 116 intervals
                                   max
          Read Throughput      301.032 MB/s
         Write Throughput       38.832 MB/s
       Cacheable Read (%)           83 %
      Cacheable Write (%)           34 %
Max Projected Cache Size         1423 GiB
   Projected Read Offload           39 %
  Projected Write Offload           35 %

Summary Cache Hit Rate vs. Cache Size

       Size        20%        40%        60%        80%       100%
   Read Hit         24         28         31         32         39
  Write Hit         30         31         31         31         35

The entire results and output of Automated Workload Analyzer (AWA) are
estimates. The format, syntax, CLI, results and output of AWA may
change in future Data ONTAP releases. AWA reports the projected cache
size in capacity. It does not make recommendations regarding the
number of data SSDs required. Please follow the guidelines for
configuring and deploying Flash Pool; that are provided in tools and
collateral documents. These include verifying the platform cache size
maximums and minimum number and maximum number of data SSDs.

### FP AWA Stats End ###
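
Assuming the add-disks syntax I have in mind is right, growing each Flash Pool would then be something along these lines, with the SSD counts as placeholders until we size them properly against the AWA numbers above:

ntap-cl01::> storage aggregate add-disks -aggregate aggr_data_node01 -disktype SSD -diskcount 4
ntap-cl01::> storage aggregate add-disks -aggregate aggr_data_node02 -disktype SSD -diskcount 4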

 

 

JSHACHER11
4,844 Views

 

 

 - Max combined cache on an 8020 HA pair (FP and Flash Cache) is 6TB. If exceeded, you should get something like this:

 

"[localhost:raid.hybrid.SSDTotExceed:error]: The sum of sizes of SSD disks of hybrid aggregates <actual capacity> exceeds the <HA pair limit> maximum"

 

There is also this from the guide:

 

 

"While use of both Flash Cache and Flash Pool on the same system is supported, the best practice recommendation is to use only one caching product per controller or system, whenever possible"
