Tech ONTAP Articles
Automated storage tiering (AST) technologies are primarily intended to help data centers benefit from the improved performance of Flash-based media while minimizing cost and complexity. Flash-based devices, such as solid-state disks (SSDs) and controller-based Flash, can complete 25 to 100 times more random read operations per second than the fastest hard disk drives (HDDs), but that performance comes at a premium of 15 to 20 times higher cost per gigabyte. HDDs continue to improve in capacity, but HDD performance in terms of IOPS per dollar is relatively stagnant. Flash provides far more IOPS per dollar, plus lower latency.

Figure 1) Comparison of the random read efficiency of different types of solid-state and rotational media on a logarithmic scale. Note that in terms of IOPS per dollar there is relatively little difference between HDD types.

Rather than permanently placing an entire dataset on expensive media, automated storage tiering tries to identify and store hot data on higher-performance storage media while storing cold data on slower, lower-cost media. NetApp has put a lot of time and energy into understanding the problems that AST must address in order to architect an optimal solution. This article describes:
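To make the IOPS-per-dollar argument concrete, here is a minimal back-of-the-envelope sketch. The drive characteristics and prices below are illustrative assumptions, not measured figures from this article.

```python
# Back-of-the-envelope IOPS-per-dollar comparison (illustrative numbers only).
drives = {
    # name: (random read IOPS, cost per GB in dollars) -- assumed values
    "15K RPM HDD": (180, 1.00),
    "7.2K RPM SATA HDD": (80, 0.30),
    "Enterprise SSD": (10000, 15.00),
}

capacity_gb = 600  # compare drives at a fixed capacity point

for name, (iops, cost_per_gb) in drives.items():
    cost = capacity_gb * cost_per_gb
    print(f"{name:>18}: {iops:>6} IOPS, ${cost:>8.2f}, "
          f"{iops / cost:.2f} IOPS per dollar")
```

Even with rough numbers like these, the pattern the article describes holds: HDD types differ mostly in cost per gigabyte, while Flash dominates on IOPS per dollar.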
Evaluating AST Technology

From an I/O perspective, the primary goal of AST is to shift as much random I/O as possible to high-performance media (Flash) to minimize the random I/O burden on HDDs and reduce average latency. The distinction between random I/O and sequential I/O is important, because Flash has relatively little price/performance advantage over HDDs for sequential reads and writes (HDDs are good at sequential I/O).

Figure 2) Comparison of the sequential throughput efficiency of different types of solid-state and rotational media.

There are several factors that affect an AST solution's ability to achieve this goal:
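The benefit of shifting random reads onto Flash can be expressed as a simple weighted average of per-tier latency. The latencies and hit rates in this sketch are assumed values, used only to illustrate the effect.

```python
# Effect of serving random reads from Flash on average read latency.
# Latency figures and hit rates are assumed values for illustration only.
hdd_latency_ms = 8.0     # assumed average random read latency for a fast HDD
flash_latency_ms = 0.5   # assumed average random read latency for Flash

for hit_rate in (0.0, 0.5, 0.8, 0.95):
    avg = hit_rate * flash_latency_ms + (1 - hit_rate) * hdd_latency_ms
    print(f"Flash hit rate {hit_rate:.0%}: average read latency {avg:.2f} ms")
```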
From an operational standpoint, there are several additional factors worth considering:
Migration Versus Caching for AST

There are two fundamentally different approaches to AST: migration and caching. A migration-based approach physically moves data between storage tiers, so each block lives on exactly one tier at a time; a caching-based approach copies hot data onto Flash while the original copy remains on HDD.
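A minimal sketch of that structural difference, using toy in-memory dictionaries to stand in for the Flash and HDD tiers (the class names and the read-count threshold are hypothetical, chosen only to illustrate the contrast):

```python
# Toy model of the two AST approaches (hypothetical classes, for illustration only).

class CachingTier:
    """Hot blocks are copied into Flash; the HDD copy remains authoritative."""
    def __init__(self):
        self.hdd = {}    # block id -> data (always stays on HDD)
        self.flash = {}  # block id -> data (cache copy)

    def read(self, block_id):
        if block_id in self.flash:      # cache hit: no HDD I/O at all
            return self.flash[block_id]
        data = self.hdd[block_id]       # cache miss: read once from HDD
        self.flash[block_id] = data     # promote immediately; HDD copy remains
        return data


class MigrationTier:
    """Hot blocks are moved to Flash, typically in scheduled batches."""
    def __init__(self):
        self.hdd = {}
        self.flash = {}
        self.read_counts = {}

    def read(self, block_id):
        self.read_counts[block_id] = self.read_counts.get(block_id, 0) + 1
        return self.flash.get(block_id, self.hdd.get(block_id))

    def scheduled_migration(self, threshold=3):
        # Extra HDD I/O: hot blocks must be read again and removed from HDD.
        for block_id, count in list(self.read_counts.items()):
            if count >= threshold and block_id in self.hdd:
                self.flash[block_id] = self.hdd.pop(block_id)


# Usage: populate the HDD tier, then read the same block twice.
tier = CachingTier()
tier.hdd["blk-1"] = b"hot data"
tier.read("blk-1")   # miss: read from HDD, copied into Flash
tier.read("blk-1")   # hit: served from Flash, HDD copy untouched
```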
Figure 3) Caching-based versus migration-based automated storage tiering.

The NetApp Virtual Storage Tier

NetApp considered these two approaches to AST in light of the evaluation criteria discussed earlier and concluded that a caching-based approach did the better job of addressing those criteria. In addition, NetApp was able to focus on optimizing read performance because the NetApp Write Anywhere File Layout (WAFL®) effectively turns write activity into sequential writes, which, as Figure 2 illustrates, HDDs are good at. This is explained in detail in a recent blog post from Mike Riley and Tech OnTap® contributor John Fullbright. (This is also the reason that NetApp dual-parity RAID, or RAID-DP®, achieves good write performance where other RAID 6 implementations do not.)

Figure 4) The NetApp Virtual Storage Tier is an approach to automated storage tiering based on caching.

The NetApp Virtual Storage Tier promotes hot data to cache while keeping HDD I/O overhead to a minimum. Any time a read request is received for a block on a volume or LUN, that block is automatically subject to promotion. Note that promotion of a data block is not data migration, because the block remains on HDD when it is copied into the Virtual Storage Tier. Promotion happens directly from the system buffer cache, so no extra HDD I/O is needed: because data blocks can be promoted immediately after the first read from disk, no additional disk I/O is required. By comparison, migration-based AST implementations typically don't promote hot data until it has been read from disk several times or until the next scheduled migration, and then additional disk I/O is required to accomplish the migration.

NetApp algorithms distinguish high-value data from low-value data and retain the high-value data in the Virtual Storage Tier. Metadata, for example, is always promoted on first read. In contrast, sequential reads are normally not cached in the Virtual Storage Tier unless specifically enabled, because they tend to crowd out more valuable data and, as we've seen, HDDs handle sequential I/O well. You can change this behavior to meet the requirements of applications with unique data access behaviors or different service-level requirements.

Virtual Storage Tier Advantages

Real-time promotion of hot data with high granularity. A data block typically enters the Virtual Storage Tier the first time it is read from disk, and the performance benefit occurs in real time as subsequent reads are satisfied from the Virtual Storage Tier. Patterns of read behavior are identified, and blocks of data that are likely to be needed are read ahead of time, but the Virtual Storage Tier never does wholesale movement of data from one tier of storage to another. This keeps usage of HDD I/O and other system resources to a minimum. The efficiency of this approach, combined with the ability to operate at the granularity of a single 4KB block, allows real-time promotion of hot data.

With migration-based AST, hot data is migrated from one storage tier to another either as a background task or on a schedule during off-peak hours (to minimize the extra load on the storage system). Because these solutions typically operate at a granularity that is at least 128 times coarser than the Virtual Storage Tier (ranging from 0.5MB up to 1GB, or even an entire volume or LUN), data movement can take considerable time. Such approaches may miss important spikes of activity when those spikes are shorter than the time needed to identify and promote hot data.
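The promotion behavior described above can be sketched roughly as a read-path policy. This is a simplified illustration under stated assumptions: the block classification labels and the cache_sequential_reads option are invented for this sketch and are not actual Data ONTAP interfaces or settings.

```python
# Simplified sketch of a read-path promotion policy like the one described above.
# The classification labels and the cache_sequential_reads option are
# hypothetical, not actual Data ONTAP interfaces or settings.

BLOCK_SIZE = 4096  # the Virtual Storage Tier operates on 4KB blocks

class VirtualStorageTierSketch:
    def __init__(self, cache_sequential_reads=False):
        self.flash_cache = {}  # 4KB block id -> data
        self.cache_sequential_reads = cache_sequential_reads

    def read(self, block_id, data_from_buffer_cache, kind):
        """kind is one of 'metadata', 'random', or 'sequential'."""
        if block_id in self.flash_cache:
            return self.flash_cache[block_id]       # subsequent read served from Flash

        # On a miss the block has just been read from HDD into the system
        # buffer cache, so promotion costs no additional HDD I/O.
        if kind == "metadata":
            promote = True                          # metadata: always promoted on first read
        elif kind == "sequential":
            promote = self.cache_sequential_reads   # sequential reads skipped unless enabled
        else:
            promote = True                          # random user data: promoted on first read

        if promote:
            self.flash_cache[block_id] = data_from_buffer_cache
        return data_from_buffer_cache
```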
The 4KB granularity of the Virtual Storage Tier means that it uses Flash-based media very efficiently. Solutions with coarser granularity are likely to include a lot of "cold" data along with each hot data block and are therefore likely to require a greater amount of expensive Flash media to achieve the same results.

Easy to deploy and simple to manage. The Virtual Storage Tier works with existing data volumes and LUNs and requires no complicated or disruptive changes to your storage environment. There is no need to set policies, thresholds, or time windows for data movement. You simply install Flash technology in your storage systems; once this is accomplished, the Virtual Storage Tier becomes active for all volumes managed by the storage controller. You can then exclude user data for lower-priority volumes from the Virtual Storage Tier if desired. Other AST solutions require incremental policy, data classification, and structural changes to existing storage infrastructure, such as the creation of dedicated storage pools and migration of data.

Fully integrated. The Virtual Storage Tier is fully integrated with the NetApp Unified Storage Architecture, which means that you can use it with any NAS or SAN storage protocol with no changes. In addition, while migration-based AST solutions may not interoperate with storage efficiency features such as deduplication, the NetApp Virtual Storage Tier works in conjunction with all NetApp storage efficiency features, including thin provisioning, FlexClone® technology, deduplication, and compression, and this close integration enhances the functioning of the Virtual Storage Tier. For example, when you deduplicate a volume, the benefits of deduplication persist in the Virtual Storage Tier. A single block in the Virtual Storage Tier could have many metadata pointers to it, increasing the probability that it will be read again and thus increasing the value of promoting that block. With this cache amplification, a single block in the Virtual Storage Tier can serve as several logical blocks. This can yield significant performance benefits for server and desktop virtualization environments (such as shortening the duration of boot storms) while reducing the amount of Flash media needed.

Conclusion

Our caching-based approach to AST gives the NetApp Virtual Storage Tier significant advantages over migration-based AST. The Virtual Storage Tier is able to promote data in real time, so even short spikes of activity benefit from acceleration. Our 4KB granularity means that we exclude cold data from Flash very efficiently, so you need less Flash to achieve a good result. By comparison, migration-based AST is less granular, has a longer delay before data is promoted, requires more HDD I/O, and uses expensive Flash-based media less efficiently.

In effect, the Virtual Storage Tier uses HDDs as a capacity tier and Flash as a performance tier. You probably have a variety of disk drive types, such as FC, SATA, and SAS. Any of these can serve as a capacity tier while the Virtual Storage Tier provides performance. We believe the combination of a high-performance tier (based on the Virtual Storage Tier) and a single disk-drive tier (based on SATA disk) makes the most sense for the majority of applications going forward.
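The granularity and cache-amplification arguments above lend themselves to a quick back-of-the-envelope check. The workload figures in this sketch (hot block count, chunk size, desktop count) are assumptions chosen only to illustrate the scale of the effect.

```python
# Rough arithmetic for the granularity and cache-amplification points above.
# All workload numbers are assumptions chosen purely for illustration.

KB = 1024
hot_blocks = 250_000                 # hot 4KB blocks scattered across the dataset
block = 4 * KB                       # VST promotion unit
chunk = 512 * KB                     # typical migration-based promotion unit

# Worst case for coarse granularity: each hot block sits in a different chunk,
# so a whole chunk of mostly cold data is promoted along with it.
flash_fine = hot_blocks * block
flash_coarse = hot_blocks * chunk
print(f"4KB granularity  : {flash_fine / KB**3:.1f} GB of Flash")
print(f"512KB granularity: {flash_coarse / KB**3:.1f} GB of Flash (worst case)")

# Cache amplification with deduplication: one cached physical block serves many
# logical blocks, e.g. the same OS block shared by many virtual desktops.
desktops = 200
shared_os_blocks = 100_000           # deduplicated blocks common to every desktop
logical = desktops * shared_os_blocks * block
physical = shared_os_blocks * block
print(f"Logical data served: {logical / KB**3:.1f} GB "
      f"from {physical / KB**3:.1f} GB of cached physical blocks")
```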
Got opinions about the NetApp Virtual Storage Tier? Ask questions, exchange ideas, and share your thoughts online in NetApp Communities.

More on VST

Want to learn more about VST? A just-published white paper has all the details, including measured performance in a number of environments.

Virtual Desktops and VST

The NetApp Virtual Storage Tier has a tremendous impact on virtual desktop environments. A recent blog post from Vaughn Stewart describes the extreme loads that these environments can create and explains how VST can reduce spindle count and increase the number of desktops a storage environment can support while boosting overall data throughput.

Intelligent Caching

Flash Cache can cut your storage costs by reducing the number of spindles needed for a given level of performance by as much as 75% and by allowing you to replace high-performance disks with more economical options. Read more about this game-changing technology.
I was very excited when I read the highlight 'Automated Storage Tiering' in the TechOnTap message.
When I actually read the article, I was so disappointed that I had to post this comment.
To give you some background, we are a company that has been using NetApp filers for a long time, and we've been doing that with great joy. But what exactly is NetApp trying to make me believe here?
They do not offer AST, but they are suggesting that their FlexCache is actually meant to be AST, only better.
My advice to NetApp would be not to make us believe in things that are not there. Please start implementing AST and let us customers decide on which storage boxes, workloads, and datasets we can turn it on.
FlexCache and AST can (and should) coexist and complement each other. Only AST, however, will help us automatically move old blocks of data to the cheapest possible storage tier.
I hope that within NetApp you don't start believing yourselves that you are offering AST, because I think you will miss the boat on this one.
Hi Robert,
Sorry for any confusion, but our intent here is to differentiate the caching approach from automated data migration. When the AST topic comes up, we start with a discussion of the specific customer problems that are at stake, and this usually comes down to a combination of cost, efficiency, and performance. We do believe that there is a legitimate "caching tier," but I understand your point about AST terminology. Our intent was not to miscategorize the NetApp approach but to make sure that it gets appropriate visibility in the context of the AST dialog. Regarding the movement of old blocks of data, we believe that this type of migration is best performed with more user control of the process, while our VST approach is intended to enable optimization of a single disk tier with Flash, with Flash absorbing the high-performance IOPS.