Solved: Oracle ASM and NetApp C-Mode Cluster

darraghos · 2020-12-04

It's been suggested to me (by a DB vendor) that we use Oracle ASM to manage the disks presented to Linux on our servers hosting Oracle DBs. The disks are hosted on a C-Mode 9.7 array (all flash). I cannot see the benefit of this with an all-flash array (RAID-DP and WAFL). ASM seems to use striping at the OS layer and manages the disk groupings. Perhaps on a very I/O-intensive workload this would be beneficial, but with our NetApp array being all flash it seems to me to needlessly complicate matters.

Some aspects of ASM that concern me:

Does NetApp SnapCenter 4.4 support it? (It seems to.)
Does this tie the disks to the Oracle layer only, i.e. it seems you cannot write an arbitrary file to these disks without using ASM tooling? What is the impact on other tools (AV, backup tools)?
Other aspects?

I'd be very interested to hear from people who are using ASM with NetApp arrays and what their use cases were.

steiner · 2020-12-04

ASM is definitely still useful. Most of the reason is related to the OS. Even if a single LUN could support 1M IOPS, the OS won't be able to move that much data. We find the sweet spot is usually around 8 to 16 LUNs total. AIX tends to need even more because of limits on the number of in-flight SCSI operations. A second benefit is growing the ASM diskgroup. You can resize ASM LUNs, but most customers prefer to grow in increments of an individual LUN size. The more LUNs you have, the more granular the growth.
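
To put numbers on the growth granularity point, here is a minimal sketch. It assumes equal-sized LUNs and a hypothetical 1024 GiB diskgroup; the function name and sizes are illustrative only.

```python
def growth_step(diskgroup_gib, lun_count):
    """Return (GiB per LUN, percent growth from adding one more equal-sized LUN)."""
    if lun_count < 1:
        raise ValueError("lun_count must be at least 1")
    lun_gib = diskgroup_gib / lun_count
    return lun_gib, 100.0 / lun_count


# Hypothetical 1024 GiB diskgroup built from 1, 8 or 16 equal LUNs.
for luns in (1, 8, 16):
    lun_gib, pct = growth_step(1024, luns)
    print(f"{luns:>2} LUNs: {lun_gib:.0f} GiB each, one more LUN adds {pct:.2f}%")
```

With a single LUN the smallest step doubles the diskgroup; with 16 LUNs each step is a little over 6%.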

Pushing beyond 20 or so LUNs is usually a waste of time and effort. There are exceptions, of course, but unless you need 250K IOPS it's probably not helpful.

You can also use LVM striping and get similar benefits. Most customers seem to prefer ASM, but we've seen a definite increase in xfs on striped LVM. I will be updating TR-3633 with additional details on this in the next few weeks.

darraghos · 2020-12-06

Thanks for the reply. I have some further questions:

So for a given DB server we tend to keep a single LUN-to-volume mapping at the C-Mode, with all those volumes in a single aggregate (for FlexCloning). So striping at the LUN level would still improve performance against an all-flash array? Does the aggregate not take care of abstracting the underlying disks, or are LUNs tied to subsets of disks in the aggregate?
It also seems that LUN sizing with ASM needs close attention, particularly if using FlexClones extensively. If I read you correctly, in an architecture without ASM a single 100GB LUN might be presented to an OS, but with ASM we might present 10 x 10GB LUNs and then layer a disk group on top with ASM.

steiner · 2020-12-07

I've never seen a good reason to use a 1:1 mapping of volumes to LUNs. I know a lot of customers like that idea, but it's not something I've ever done. I bought my first NetApp in 1995, so I've been doing this for a while.

My usual practice is to group related LUNs in a single volume. I'll put ASM diskgroup #1 in a single volume, and ASM diskgroup #2 in another volume. I can now protect/restore/clone/QoS/etc all of the LUNs in the volume with a single operation. It does still help to have multiple LUNs. It's mostly related to the SCSI protocol itself. A LUN still does map to all the drives in the aggregate, but the SCSI protocol won't let you put an unlimited amount of IO on a single LUN device. 8 LUNs will get you close to the maximum possible, although 16 does offer some measurable improvements.
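
One rough way to reason about the per-LUN limit is Little's law: sustained IOPS cannot exceed the number of in-flight IOs divided by the average latency. The sketch below assumes a hypothetical queue depth of 32 per LUN and 0.5 ms average latency. Both numbers are made up for illustration and are not measurements from any host or array.

```python
def max_iops(lun_count, queue_depth_per_lun, latency_ms):
    """Little's law upper bound: in-flight IOs divided by latency (in seconds)."""
    in_flight = lun_count * queue_depth_per_lun
    return in_flight * 1000.0 / latency_ms


# Assumed values: queue depth 32 per LUN, 0.5 ms average latency.
for luns in (1, 8, 16):
    print(f"{luns:>2} LUNs: at most {max_iops(luns, 32, 0.5):,.0f} IOPS")
```

The linear scaling in this model only holds until some other limit (HBA queues, host CPU, the array itself) is reached, which is consistent with the gains flattening out somewhere around 8 to 16 LUNs.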

(side note - yes, there are situations where you want to bring multiple volumes into play for an Oracle database, but unless you need 250K+ IOPS it's not usually necessary.)

The same principles apply outside ASM. We had a customer recently with ext4 filesystems and we spent a lot of time experimenting with LVM striping. Same situation - stripe across 8 LUNs. In addition, tune the striping. If you have 8 LUNs, use a stripe size of 128K per LUN (8 x 128K = 1MB). That way, when Oracle does those big 1MB IOs during full table scans or RMAN backups, it can read 1MB at a time, hitting all 8 LUNs at the same time, in parallel. The performance boost was huge over a single LUN.
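
The arithmetic behind that stripe setting can be sketched as follows. This is a simplified round-robin striping model, not LVM's actual implementation (it ignores metadata offsets and extent boundaries), and the function name is hypothetical.

```python
from collections import Counter

KIB = 1024


def bytes_per_lun(offset, length, lun_count=8, stripe_size=128 * KIB):
    """Split one logical IO into stripe-unit chunks and total the bytes per LUN."""
    per_lun = Counter()
    pos, end = offset, offset + length
    while pos < end:
        stripe_index = pos // stripe_size
        lun = stripe_index % lun_count  # simple round-robin placement
        chunk_end = min(end, (stripe_index + 1) * stripe_size)
        per_lun[lun] += chunk_end - pos
        pos = chunk_end
    return per_lun


# One aligned 1MB read across 8 LUNs with a 128K stripe unit.
io = bytes_per_lun(offset=0, length=1024 * KIB)
for lun in sorted(io):
    print(f"LUN {lun}: {io[lun] // KIB} KiB")
```

In this model an aligned 1MB read lands as one 128K chunk on each of the 8 LUNs, so all of them can be serviced in parallel. With `lun_count=1` the same read goes entirely to a single device.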

In theory, NVMe namespaces would eliminate a lot of this complexity because it removes SCSI from the equation. In practice, nobody is likely to offer a storage array where a single NVMe namespace can consume all the potential performance on the whole array. We'll probably need to keep making ASM diskgroups out of perhaps 4 to 8 NVMe namespaces. I'm not aware of any formal testing of this yet.

darraghos · 2020-12-07

Thanks @steiner for the detailed reply. Much appreciated. I guess my takeaway here is that 'it depends' on our I/O loads. I agree, though, that I don't really see a good use case for 1-to-1 LUN-to-volume mappings.