Tech ONTAP Blogs

User access mapping with Amazon S3 Access Points for Amazon FSx for NetApp ONTAP

ScottMo
NetApp

Amazon Web Services (AWS) and NetApp recently announced integration of Amazon FSx for NetApp ONTAP (FSx for ONTAP) with a wide variety of AI and machine learning (AI/ML) and analytics services using Amazon S3 Access Points. You can now attach S3 access points to your FSx for ONTAP file systems, enabling access to file data as if it were stored in S3. This allows existing datasets to be used with S3-based AI/ML and analytics tools, while the data itself continues to reside on FSx for ONTAP. This new offering opens new ways to innovate and to extract value from data you already have at your fingertips. In this post, I will show you some examples of how to match users across protocols to facilitate access and how to help users get started with Amazon S3 Access Points.

 

Common use case example

Let’s start with a common use case: UNIX workloads that use Network File System (NFS) as the primary way to access the dataset. This configuration uses Microsoft Active Directory (AD) as a Lightweight Directory Access Protocol (LDAP) source for users, groups, and passwords. FSx for ONTAP and the Amazon Linux 2023 (AL2023) clients, using System Security Services Daemon (SSSD), are bound to the LDAP server.

Imagine a repository with more than 20 years of specifications, test reports, technical reports, and setup and configuration guides. This unstructured data is valuable but was largely unusable. Until now.

 

How to bring AI/ML to where your data lives

The first step is to create an S3 access point using the AWS Command Line Interface (AWS CLI) or AWS Management Console:

 

aws --region us-east-1 fsx create-and-attach-s3-access-point --name vol1s3 --type ONTAP --ontap-configuration "FileSystemIdentity={Type=UNIX,UnixUser={Name=s3user}},VolumeId=fsvol-07688d71234asdf1234"

 

Dissecting this command, note that it provides a Name for the access point, a Type, a UnixUser, and a VolumeId. Pay special attention to the Type. In this use case, we’re focusing on UNIX and NFS style access, so UNIX is chosen. UnixUser is the username that all S3 access point operations will use when accessing the files on the FSx for ONTAP volume. More on this later. VolumeId is the AWS identifier of the FSx for ONTAP volume and can be obtained using the command ‘aws fsx describe-volumes’.
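The VolumeId lookup mentioned above can be scripted. The following is a minimal sketch, assuming you know the volume's name as shown in FSx; the function name find_volume_id, the volume name vol1, and the region are placeholders that do not come from the original setup.

```shell
# Hypothetical helper (not part of the AWS CLI): print the VolumeId of the
# FSx for ONTAP volume whose Name matches the first argument.
find_volume_id() {
    # $1 = volume name (placeholder), $2 = AWS region (placeholder)
    aws --region "$2" fsx describe-volumes \
        --query "Volumes[?Name=='$1'].VolumeId" \
        --output text
}

# Example (requires AWS credentials and an existing volume):
#   find_volume_id vol1 us-east-1
```

The --query option filters the JSON response on the client side, so the same call without --query shows every volume and its full configuration.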

This command creates an S3 access point that gives direct access to the dataset on the FSx for ONTAP filesystem. AI/ML services, such as Amazon Kendra, can now analyze, process, and extract value from what was previously an unusable collection of unstructured data.
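S3 clients address the access point through an alias that is used in place of a bucket name. The sketch below looks up that alias after creation. It assumes the describe-s3-access-point-attachments subcommand and the response path S3AccessPointAttachments[].S3AccessPoint.Alias; treat both as assumptions and confirm them against the CLI reference for your installed version. The function name is hypothetical.

```shell
# Hypothetical helper: print the S3 alias of a named access point attachment.
# The --query path is an assumption about the response shape.
access_point_alias() {
    # $1 = access point name (for example vol1s3), $2 = AWS region
    aws --region "$2" fsx describe-s3-access-point-attachments \
        --names "$1" \
        --query 'S3AccessPointAttachments[0].S3AccessPoint.Alias' \
        --output text
}

# Example (requires AWS credentials):
#   access_point_alias vol1s3 us-east-1
```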

 

User mapping and security

As previously mentioned, when you create an S3 access point, you configure a user that maps S3 access to files and directories on your storage filesystem. All access through the access point is controlled by the UNIX user ID (UID) that the FSx for ONTAP file system resolves for that user. If the configured user can access files and directories through the NFS protocol, S3 GetObject and PutObject requests through the access point are allowed on those same files and directories. Files created by the PutObject action through the access point get the UID and group ID (GID) of the UnixUser specified when the access point was created.

The following is an example of an object PUT to the FSx for ONTAP filesystem through the S3 access point using the alias as the bucket target with the AWS CLI:

 

aws s3api put-object --key reports/impact.pdf --body /Users/testuser/impact.pdf --bucket vol1s3-4uycqhxjyf55d8zj66j4siys75eoouse1a-ext-s3alias

 

On the NFS mounted filesystem:

-rw-r--r--. 1 testuser testgroup   3888405 Jan 14 20:34 impact.pdf

In this example, testuser was the UnixUser configured for the S3 access point, so the new file carries that user’s UID and GID.
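To confirm that both protocols see the same bytes, you can read the object back through the access point and compare it with the file on the NFS mount. This is a minimal sketch: the function name verify_object and the mount path in the example are hypothetical, and the alias is whatever your access point reports.

```shell
# Hypothetical helper: download an object through the access point alias and
# compare it byte for byte with the same file on the NFS mount.
# The body runs in a subshell so the temporary variable does not leak.
verify_object() (
    # $1 = access point alias, $2 = object key, $3 = path of the file over NFS
    tmp="$(mktemp)" || exit 1
    if ! aws s3api get-object --bucket "$1" --key "$2" "$tmp" >/dev/null; then
        rm -f "$tmp"
        exit 1
    fi
    if cmp -s "$tmp" "$3"; then
        echo "match"
    else
        echo "mismatch"
    fi
    rm -f "$tmp"
)

# Example (the mount point /mnt/vol1 is a placeholder):
#   verify_object vol1s3-4uycqhxjyf55d8zj66j4siys75eoouse1a-ext-s3alias \
#       reports/impact.pdf /mnt/vol1/reports/impact.pdf
```

If the configured UnixUser lacks read permission on the file, the get-object call is expected to fail and the helper returns a non-zero status without printing anything.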

 

Access control options

There are several options to choose from for controlling access to the dataset through S3 Access Points:

  • A service account user can be created for consolidating access control.
  • If required, access can be limited to specific parts of the dataset with a combination of specific UNIX groups and file permissions using standard UNIX commands such as chown, chgrp, chmod.
  • FSx for ONTAP also supports multiple access points per volume. This grants the ability to present the same volume to different users who might have differing levels of access, read-only or read-write, or different access to specific files and directories within the larger data volume.
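The group and permission approach from the list above can be sketched with the standard commands it names. This example assumes the directory is on the NFS mount and that the UnixUser configured for the access point is a member of the chosen group; the function name and the paths and group in the example are hypothetical.

```shell
# Hypothetical helper: give one UNIX group read-only access to a directory
# tree and remove access for everyone else.
restrict_to_group() (
    # $1 = directory on the NFS mount, $2 = UNIX group name or GID
    chgrp -R "$2" "$1" || exit 1
    # Owner: read and write; group: read-only; others: no access.
    # Capital X adds execute (search) permission only to directories
    # and to files that are already executable.
    chmod -R u=rwX,g=rX,o= "$1"
)

# Example (path and group are placeholders):
#   restrict_to_group /mnt/vol1/reports s3readers
```

With these permissions, an access point whose UnixUser is only a group member can read objects under that directory, while a PutObject request from that user is expected to be denied.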

The following diagram illustrates one method for controlling access. Here, there’s an S3 access point for each AWS service with access to the same data source directory.

blog-diagram-1.png
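One way to script the per-service layout is to wrap the create command shown earlier in a small function and call it once per service account. This is a sketch only: the access point names, the service account names, and the volume ID in the example are placeholders, and the accounts must already exist in your directory.

```shell
# Hypothetical helper: create a UNIX-type access point on a volume for one
# service account, using the same command form as earlier in this post.
create_access_point() {
    # $1 = access point name, $2 = UNIX user, $3 = volume ID, $4 = AWS region
    aws --region "$4" fsx create-and-attach-s3-access-point \
        --name "$1" --type ONTAP \
        --ontap-configuration "FileSystemIdentity={Type=UNIX,UnixUser={Name=$2}},VolumeId=$3"
}

# Example: two access points on the same volume, each with its own user
# (all names and the volume ID are placeholders):
#   create_access_point kendra-ap    svc-kendra    fsvol-0123456789abcdef0 us-east-1
#   create_access_point analytics-ap svc-analytics fsvol-0123456789abcdef0 us-east-1
```

Because each access point maps to a different UNIX user, the file permissions on the volume decide what each service can read or write.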

 

Great, but what about my Windows users?

All these concepts also apply to Common Internet File System (CIFS) or Server Message Block (SMB). In this use case, both the Windows client and AD are Microsoft Windows Server 2022. The FSx for ONTAP filesystem has been joined to the AD domain, and CIFS shares have been created for the volume that contains the working dataset.

The next step is to create an S3 access point for Windows:

 

aws --region us-east-1 fsx create-and-attach-s3-access-point --name vol2s3 --type ONTAP --ontap-configuration "FileSystemIdentity={Type=WINDOWS,WindowsUser={Name=wins3user}},VolumeId=fsvol-0e64ce10123456789"

 

As before, the command provides the VolumeId, the Type, and this time a WindowsUser.

In this case, the underlying volume that corresponds to fsvol-0e64ce10123456789 is an NTFS (New Technology File System) style FSx for ONTAP volume.

The service account wins3user is used for all operations on the data in the configured volume when it is accessed through the S3 access point. FSx for ONTAP looks up that user in AD and uses those credentials to determine whether the configured user has permission to create, read, write, update, or delete files within the configured volume. It does this by comparing the security identifier (SID) against the NTFS access control list (ACL) as the operation traverses the filesystem path.

 

Also great, but what about mixed environments?

FSx for ONTAP supports three volume security styles: UNIX, NTFS, and Mixed. It’s recommended to match the security style to the primary workload access type. For example, if your primary workload mainly uses Linux or other UNIX variants, select a UNIX style volume and a UNIX type access point. Note that even though a UNIX volume is used, Windows users can still access this data when a CIFS share is created for this volume.

For complete details and best practices see the Multiprotocol NAS in NetApp ONTAP and How to Configure LDAP in ONTAP technical reports.

For this case, we will use a UNIX-style volume and a UNIX-type access point. The environment consists of a Windows Server 2022 host acting as the AD and LDAP server for Linux clients. FSx for ONTAP has been joined to the AD domain.

 

aws --region us-east-1 fsx create-and-attach-s3-access-point --name vol1s3 --type ONTAP --ontap-configuration "FileSystemIdentity={Type=UNIX,UnixUser={Name=s3user}},VolumeId=fsvol-0e64ce10123456789"

 

When the access point is created and a UNIX Type is specified, the AWS control plane automatically configures a rule that attempts to map the Amazon S3 user to a UNIX user.

When this access point is accessed from an Amazon S3 client, all operations are mapped to the user specified for the access point, in this case s3user. Name mapping is done only at the user level; group names aren’t mapped. Group membership is gathered after the name mapping is complete.

FSx for ONTAP attempts to match users across protocols using implicit name mapping. Because AD is the identity service, FSx for ONTAP retrieves identity information from AD. When an Amazon S3 operation enters the access point, the provided username, s3user, is used to gather user information, such as the UID, to check access to the filesystem. FSx for ONTAP uses the UID to authorize or prevent access based on the UNIX permissions on the path and file from the requested S3 operation.

A simplified visualization of an operation:

blog-seq-1.png
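You can spot-check the identity that this lookup should produce from a Linux client. The sketch below assumes the client is bound to the same LDAP source as FSx for ONTAP, as in the setup described earlier, so it shows the client's view of the user rather than querying FSx for ONTAP itself. The function name show_identity is hypothetical.

```shell
# Hypothetical helper: print the UID and the group IDs that this client
# resolves for a user, or report that the user cannot be resolved.
show_identity() {
    # $1 = user name, for example s3user
    if ! id "$1" >/dev/null 2>&1; then
        echo "user '$1' not found" >&2
        return 1
    fi
    echo "uid=$(id -u "$1") gids=$(id -G "$1")"
}

# Example:
#   show_identity s3user
```

If the user does not resolve on the client, or the UID and groups differ from the ownership and mode shown by ls -l on the NFS mount, that is a good place to start troubleshooting denied S3 requests.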

 

 

Summary

Now, you’re armed with the basic concepts of access control and user mapping using Amazon S3 Access Points for Amazon FSx for ONTAP. You’re prepared to run a proof of concept or begin designing a robust data pipeline to put your data to work and extract the latent value within.

Visit the documentation to learn more about how to manage your access points’ access.

 
