ONTAP Discussions
We just dropped two 1TB Flash Cache cards into our cDOT 8.2.2 cluster this morning. The cards show up fine on each controller and appear to be enabled, but we are seeing a 0% hit percentage. What am I missing? I know the cards need to warm up, but the filer has been running for almost six hours and it is still at 0%...
ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale
flexscale.enable on (same value in local+partner recommended)
flexscale.lopri_blocks on (same value in local+partner recommended)
flexscale.normal_data_blocks on (same value in local+partner recommended)
flexscale.pcs_high_res off (same value in local+partner recommended)
flexscale.pcs_size 0GB (same value in local+partner recommended)
flexscale.readahead_blocks off (same value in local+partner recommended)
flexscale.rewarm on (same value in local+partner recommended)
ntap-cl01::> node run -node ntap-cl01-node01 -command stats show -p flexscale-access
Cache                                              Reads          Writes    Disk Reads
Usage   Hit  Meta  Miss   Hit Evict Inval  Insert   Chain Blocks   Chain Blocks    Replaced
    %    /s    /s    /s     %    /s    /s      /s      /s     /s      /s     /s          /s
    0     0     0     0   100     0     0       2       0      0       0      0           0
    0     0     0     0    75     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       3       0      0       0     12           0
    0     0     0     0   100     0     0       3       0      0       0      0           0
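If the card were warming, the Insert /s and Usage % columns above would be climbing even before any hits appear. To sample a handful of one-second iterations in one shot, something like this should work (assuming the 7-mode stats syntax, where -i is the interval in seconds and -n the iteration count; double-check on your release):
ntap-cl01::> node run -node ntap-cl01-node01 -command stats show -i 1 -n 10 -p flexscale-access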
ntap-cl01::> node run -node ntap-cl01-node01 -command sysstat -x 1
 CPU   NFS  CIFS  HTTP  Total    Net kB/s       Disk kB/s    Tape kB/s   Cache Cache    CP  CP  Disk  OTHER   FCP iSCSI   FCP kB/s   iSCSI kB/s
                                 in    out     read   write  read write    age   hit  time  ty  util                      in   out      in   out
 23%  2403     0     0   2426   6829 167776   169400     24     0     0    42s   95%    0%   -   81%     23     0     0     0     0      0     0
 19%  3226     0     0   3226  13340 161935   166164      0     0     0     0s   96%    0%   -   83%      0     0     0     0     0      0     0
 36%  2388     0     0   2388   8945 138074   157052  42580     0     0      1   98%   41%  Tf   84%      0     0     0     0     0      0     0
 21%  2046     0     0   2046   5482 131457   167128 121088     0     0     0s   94%  100%  :f   89%      0     0     0     0     0      0     0
 32%  3232     0     0   3668  10094 202315   212984 115824     0     0     0s   94%  100%  :f   82%    436     0     0     0     0      0     0
 27%  3157     0     0   3173   7040 216534   220044 116224     0     0    55s   96%  100%  :f   81%     16     0     0     0     0      0     0
 28%  3746     0     0   3752   5845 245167   252088 125328     0     0     0s   96%  100%  :f   82%      6     0     0     0     0      0     0
 26%  3020     0     0   3020   5884 200317   205844  61052     0     0    57s   96%   66%   :   85%      0     0     0     0     0      0     0
 31%  2980     0     0   3001  13137 183224   186476      0     0     0     0s   95%    0%   -   90%     21     0     0     0     0      0     0
 25%  2905     0     0   3480  14162 163218   163084     24     0     0     0s   96%    0%   -   84%    575     0     0     0     0      0     0
 33%  2123     0     0   2308  13284 124891   156168  78696     0     0    59s   97%   46%  Sf   95%    185     0     0     0     0      0     0
 25%  3029     0     0   3035   5532 200730   203496 113984     0     0     0s   93%  100%  :f   86%      6     0     0     0     0      0     0
 29%  3385     0     0   3386   5786 217468   234453 130938     0     0      1   97%   99%  Zs   83%      1     0     0     0     0      0     0
 19%  2645     0     0   2645   9259 129553   140667  69682     0     0     0s   97%   36%   :   92%      0     0     0     0     0      0     0
- is that Flash Cache 2 (PAM2)? do you know the part number?
- why is 'lopri' on?
- is it showing in 'sysconfig -a'?
It does show in sysconfig. I turned on lopri in an effort to get it to do at least something. I am banging my head against the wall. Am I just too impatient? Does it need a day before it starts to "learn" what to cache?
slot 2: Flash Cache 2 (1.0 TB)
State: Enabled
Model Name: X1974A-R6
Serial Number: 9436768886
Part Number: 111-00903
Board Revision: B0
FPGA Release: 3.3
FPGA Build: 201312121541
memory mapped I/O base 0x0000000044800000, size 0x80000
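For what it's worth, the default for flexscale.lopri_blocks is off, and flexscale.normal_data_blocks on is what caches ordinary user data; turning lopri on only adds low-priority blocks (e.g., sequential reads) and would not fix a flat 0%. Reverting it is a one-liner (node name is the poster's):
ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale.lopri_blocks off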
- "Am I just to impatient?" - No. it should show inserts
- so you put one in each node and none works?
- is slot 1 available?
- would you be willing to do a takeover/giveback from both sides?
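In cDOT, the takeover/giveback cycle being asked for would look roughly like this, run for each node in turn (a sketch; wait for storage failover show to report the partner as up and giveback-ready between steps):
ntap-cl01::> storage failover takeover -ofnode ntap-cl01-node01
ntap-cl01::> storage failover show
ntap-cl01::> storage failover giveback -ofnode ntap-cl01-node01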
There is a card in each node, and neither is working. I just performed a takeover/giveback and it still doesn't seem to help. Slot 1 is free. Does it require the card in slot 1?
ntap-cl01::> node run -node ntap-cl01-node02 -command stats show -p flexscale-access
Cache                                              Reads          Writes    Disk Reads
Usage   Hit  Meta  Miss   Hit Evict Inval  Insert   Chain Blocks   Chain Blocks    Replaced
    %    /s    /s    /s     %    /s    /s      /s      /s     /s      /s     /s          /s
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0    61     0     0     0       0       0      0       0      0           0
    0     0     0   393     0     0     0       0       0      0       0      0           0
    0     0     0   428     0     0     0       0       0      0       0      0           0
    0     0     0   120     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
    0     0     0     0     0     0     0       0       0      0       0      0           0
Takeover/giveback made no difference; still 0% usage. We added the same card to both nodes, and both have the same issue. Slot 1 is empty, but Hardware Universe says slot 1 or 2 should be fine.
MK,
Besides setting flexscale to on, what else did you do?
I think your process is not complete yet.
Can you run down a bird's-eye view of the steps you took?
You have to do something at the aggregate level. The IOPS should skyrocket once you add the cache at the aggregate level.
You can also use the System Manager tool.
I am afraid that simply enabling flexscale won't do anything on its own; you have to bring the cache in at your aggregate level.
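A quick way to check whether the data aggregates are already hybrid (Flash Pool) aggregates, which turns out to be the key question here (field names as in cDOT 8.2; verify with storage aggregate show -fields ? if they differ on your release):
ntap-cl01::> storage aggregate show -fields hybrid-enabled,hybrid-cache-size-total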
I figured it out. Our data aggrs are hybrid aggregates with SSD Flash Pools. Apparently, unbeknownst to me, you cannot use Flash Cache on hybrid Flash Pool aggrs. If I am wrong, please let me know. Otherwise, it looks like I am returning the Flash Cache cards and adding a few more SSDs to the Flash Pool.
http://www.netapp.com/us/system/pdf-reader.aspx?m=tr-3832.pdf&cc=us
5.4 Flash Pool
Flash Cache and Flash Pool can be used together on one node (controller), within an HA pair, and within
a single clustered Data ONTAP cluster. Data from volumes that are provisioned in a Flash Pool
aggregate or an all-SSD aggregate is automatically excluded from being cached in Flash Cache. Data
from volumes provisioned in a Flash Pool aggregate uses Flash Pool cache, and data from volumes
provisioned in an SSD aggregate would not benefit from using Flash Cache.
Although Flash Cache and Flash Pool can be configured on the same node or within the same HA pair,
there are limits on the total cache that can be configured. The maximum cache sizes when using Flash
Cache and Flash Pool vary by controller model and Data ONTAP release; detailed information can be
found in the NetApp Hardware Universe, for customers who have NetApp Support site access.
- all of your aggregates are FPs?
- FP is aggregate based cache while Flash Cache is system wide
- while both help with random reads, FP also helps with overwrites
- what was the need to provision Flash Cache given that you have FP?
- did you ever run PCS to predict the benefit?
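PCS (Predictive Cache Statistics) simulates a Flash Cache of a given size on a controller that does not have the card installed yet, using the same flexscale options shown earlier. A minimal sketch (the pcs_size value here is illustrative):
ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale.enable pcs
ntap-cl01::> node run -node ntap-cl01-node01 -command options flexscale.pcs_size 1024GB
ntap-cl01::> node run -node ntap-cl01-node01 -command stats show -p flexscale-access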
We ran wafl awa and determined that our FP was undersized. We had the option of obtaining Flash Cache cards, and I thought, without researching enough, that the two combined would help. We only have one large data aggr per controller, and both are FP hybrids. It sounds like I should return the FC cards and expand the FP SSDs.
ntap-cl01::> system node run -node * "priv set advanced;wafl awa print"
2 entries were acted on.
Node: ntap-cl01-node01
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by NetApp
personnel.
### FP AWA Stats ###
Host ntap-cl01-node01 Memory 20166 MB
ONTAP Version NetApp Release 8.2.2 Cluster-Mode: Fri Aug 22 01:46:52 PDT 2014
AWA Version 1
Layout Version 1
CM Version 1
Basic Information
Aggregate aggr_data_node01
Current-time Wed Jan 14 09:08:03 EST 2015
Start-time Tue Jan 13 13:51:56 EST 2015
Total runtime (sec) 69368
Interval length (sec) 600
Total intervals 116
In-core Intervals 1024
Summary of the past 116 intervals
                                           max
Read Throughput                   300.948 MB/s
Write Throughput                   28.250 MB/s
Cacheable Read (%)                        65 %
Cacheable Write (%)                       27 %
Max Projected Cache Size              1168 GiB
Projected Read Offload                    16 %
Projected Write Offload                   27 %
Summary Cache Hit Rate vs. Cache Size
Size         20%   40%   60%   80%  100%
Read Hit       9    10    12    13    16
Write Hit     23    24    24    24    27
The entire results and output of Automated Workload Analyzer (AWA) are
estimates. The format, syntax, CLI, results and output of AWA may
change in future Data ONTAP releases. AWA reports the projected cache
size in capacity. It does not make recommendations regarding the
number of data SSDs required. Please follow the guidelines for
configuring and deploying Flash Pool; that are provided in tools and
collateral documents. These include verifying the platform cache size
maximums and minimum number and maximum number of data SSDs.
### FP AWA Stats End ###
Node: ntap-cl01-node02
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by NetApp
personnel.
### FP AWA Stats ###
Host ntap-cl01-node02 Memory 20166 MB
ONTAP Version NetApp Release 8.2.2 Cluster-Mode: Fri Aug 22 01:46:52 PDT 2014
AWA Version 1
Layout Version 1
CM Version 1
Basic Information
Aggregate aggr_data_node02
Current-time Wed Jan 14 09:08:03 EST 2015
Start-time Tue Jan 13 13:51:47 EST 2015
Total runtime (sec) 69376
Interval length (sec) 600
Total intervals 116
In-core Intervals 1024
Summary of the past 116 intervals
                                           max
Read Throughput                   301.032 MB/s
Write Throughput                   38.832 MB/s
Cacheable Read (%)                        83 %
Cacheable Write (%)                       34 %
Max Projected Cache Size              1423 GiB
Projected Read Offload                    39 %
Projected Write Offload                   35 %
Summary Cache Hit Rate vs. Cache Size
Size         20%   40%   60%   80%  100%
Read Hit      24    28    31    32    39
Write Hit     30    31    31    31    35
The entire results and output of Automated Workload Analyzer (AWA) are
estimates. The format, syntax, CLI, results and output of AWA may
change in future Data ONTAP releases. AWA reports the projected cache
size in capacity. It does not make recommendations regarding the
number of data SSDs required. Please follow the guidelines for
configuring and deploying Flash Pool; that are provided in tools and
collateral documents. These include verifying the platform cache size
maximums and minimum number and maximum number of data SSDs.
### FP AWA Stats End ###
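For anyone repeating this exercise, the AWA collection itself is driven from the nodeshell at advanced privilege. A minimal sketch of the lifecycle (aggregate name is the poster's; let it observe the workload for at least a day between start and print):
ntap-cl01::> system node run -node ntap-cl01-node01
ntap-cl01-node01> priv set advanced
ntap-cl01-node01*> wafl awa start aggr_data_node01
ntap-cl01-node01*> wafl awa print
ntap-cl01-node01*> wafl awa stop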
- max combined cache on an 8020 HA pair (FP and Flash Cache) is 6TB. If exceeded, you should get something like this:
"[localhost:raid.hybrid.SSDTotExceed:error]: The sum of sizes of SSD disks of hybrid aggregates <actual capacity> exceeds the <HA pair limit> maximum"
There is also this from the guide:
"While use of both Flash Cache and Flash Pool on the same system is supported, the best practice recommendation is to use only one caching product per controller or system, whenever possible"