Accessing StorageGRID Webscale through its S3 API

In this post we'll look at how you can access objects in StorageGRID Webscale through its S3 API with the AWS SDK. StorageGRID Webscale is an enterprise-grade object store that you can run in your own datacenter. Our marketing people tell us the following about it:

 

You can store massive amounts of data in a single, elastic content store, by using NetApp® StorageGRID® Webscale software. StorageGRID Webscale is a purpose-built, software-defined storage solution for large archives, media repositories, and web data stores. It’s designed for the hybrid cloud and supports standard protocols, including Amazon S3 and SNIA’s CDMI—so you can run your apps on premises or in the cloud. If you want to learn more about StorageGRID Webscale, see this link.

 

You may have guessed it by now: StorageGRID Webscale is compatible with the AWS SDK. This is a huge advantage, as the AWS SDK comes with support for many different languages (Java, .NET, PHP, Python, Ruby, JS, ...).

 

Let's see what's necessary to access objects in the grid via S3. First, create an S3 account within StorageGRID. Log into the StorageGRID web interface and go to Grid Management -> S3 Management. Then click the + button. On the following screen, enter your account name and download or copy the access and secret keys:

 

[Screenshot: add_cred.png]

Next, put the access and secret keys into a credentials file. In my case, I've put them in ~/.aws/credentials, which is the default location where the AWS SDK expects the credentials. I've created a new profile called 'webscale':

> cat ~/.aws/credentials
[webscale]
aws_access_key_id = [paste access key here]
aws_secret_access_key = [paste secret here]
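
To illustrate how a named profile such as 'webscale' is picked out of a credentials file, here is a minimal sketch in plain Java. It is not the AWS SDK's own parser (the SDK does this for you through ProfileCredentialsProvider); the class name, method name, and placeholder key values are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProfileSketch {

    /**
     * Returns the key/value pairs of one [profile] section from INI-style text.
     * Illustrative only: this is NOT the AWS SDK's own parser.
     */
    static Map<String, String> readProfile(String iniText, String profile) {
        Map<String, String> values = new LinkedHashMap<>();
        boolean inProfile = false;
        for (String raw : iniText.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                // a section header decides whether the following keys belong to our profile
                inProfile = line.substring(1, line.length() - 1).trim().equals(profile);
                continue;
            }
            int eq = line.indexOf('=');
            if (inProfile && eq > 0) {
                // split at the first '=' only, so values may themselves contain '='
                values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // placeholder values, not real keys
        String sample = String.join("\n",
                "[default]",
                "aws_access_key_id = DEFAULT_KEY",
                "aws_secret_access_key = DEFAULT_SECRET",
                "[webscale]",
                "aws_access_key_id = EXAMPLE_ACCESS_KEY",
                "aws_secret_access_key = EXAMPLE_SECRET_KEY");
        Map<String, String> webscale = readProfile(sample, "webscale");
        System.out.println(webscale.get("aws_access_key_id"));
    }
}
```

The point is simply that several profiles can live side by side in one file, and the profile name passed to the SDK selects which pair of keys is used.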

StorageGRID Webscale exposes its API endpoints (S3, CDMI) through its gateway nodes, so we can either connect directly to a gateway node or put several gateways behind a load balancer. Whichever entry point is used, S3 users see all of their buckets and objects, regardless of physical location (e.g., a different site). By default, the S3 endpoint listens on port 8082. So, let's fire up our favorite Java IDE and run a quick test. In my case, I am using the aws-java-sdk-1.9.8 libraries.

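    import java.util.List;

    import com.amazonaws.auth.profile.ProfileCredentialsProvider;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.S3ClientOptions;
    import com.amazonaws.services.s3.model.Bucket;
    import com.amazonaws.services.s3.model.Owner;

    // the class name is arbitrary
    public class ListBuckets {

        public static void main(String[] args) {
            final String profile = "webscale";
            final String address = "https://sg-gateway1.mycompany.com:8082";

            // use path-style access to avoid issues with DNS-incompatible bucket names
            final S3ClientOptions options = new S3ClientOptions();
            options.setPathStyleAccess(true);

            // will load credentials from ~/.aws/credentials
            final AmazonS3Client s3 = new AmazonS3Client(new ProfileCredentialsProvider(profile));
            s3.setEndpoint(address);
            s3.setS3ClientOptions(options);

            Owner owner = s3.getS3AccountOwner();
            System.out.println("Owner: " + owner.getId());

            // keep the list in a variable so it can be counted after the loop
            final List<Bucket> buckets = s3.listBuckets();
            for (Bucket b : buckets) {
                System.out.println("Bucket: " + b);
            }
            System.out.println("Found " + buckets.size() + " buckets");
        }
    }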

In my case, this gives me the following results: 

Owner: 2ebdc71df29c392e2ae9796b5babe100d8275a83a9ce55a90cd5387a292a8503
Bucket: S3Bucket [name=Test_Bucket, creationDate=Fri Dec 05 15:12:08 CET 2014, owner=S3Owner [name=Blog test,id= 2ebdc71df29c392e2ae9796b5babe100d8275a83a9ce55a90cd5387a292a8503]]
...
Bucket: S3Bucket [name=clemens-test-bucket, creationDate=Mon Dec 08 13:34:20 CET 2014, owner=S3Owner [name=Blog Test,id= 2ebdc71df29c392e2ae9796b5babe100d8275a83a9ce55a90cd5387a292a8503]]
Found 5 buckets
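
A short aside on the path-style option used in the listing: it changes where the bucket name ends up in the request URL. The sketch below uses only java.net.URI to show the two shapes; it is an illustration, not what the SDK does internally, and the class name, method names, and the object key photos/cat.jpg are made up for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class S3UrlStyles {

    /** Path-style: the bucket is the first path segment, so the host name never changes. */
    static URI pathStyle(String host, int port, String bucket, String key) {
        return build(host, port, "/" + bucket + "/" + key);
    }

    /** Virtual-hosted style: the bucket becomes a DNS label in front of the endpoint host. */
    static URI virtualHostedStyle(String host, int port, String bucket, String key) {
        return build(bucket + "." + host, port, "/" + key);
    }

    private static URI build(String host, int port, String path) {
        try {
            // the multi-argument constructor percent-encodes illegal path characters
            return new URI("https", null, host, port, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot build URL for host " + host, e);
        }
    }

    public static void main(String[] args) {
        String gateway = "sg-gateway1.mycompany.com";
        System.out.println(pathStyle(gateway, 8082, "Test_Bucket", "photos/cat.jpg"));
        System.out.println(virtualHostedStyle(gateway, 8082, "clemens-test-bucket", "photos/cat.jpg"));
    }
}
```

With virtual-hosted style, every bucket needs a resolvable host name of its own, and a bucket name such as Test_Bucket contains an underscore, which is not valid in a DNS host name. Path-style access sidesteps both problems.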

From here, it is straightforward to ingest, retrieve, and delete objects from the grid; the AWS SDK documentation covers the relevant calls.

 

One word regarding certificates and HTTPS:

  • The recommended and most secure way is to install your own certificate on the StorageGRID Webscale server. This certificate will be used on all gateway nodes. The out-of-the-box certificate that comes with StorageGRID Webscale most likely won't match the hostnames of the gateway nodes in your setup.
  • If you use the out-of-the-box certificate, you will need to disable hostname verification (due to the different hostnames). However, this is insecure and enables MITM attacks, so don't even think about using it in production.
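
To make the hostname mismatch concrete, here is a simplified sketch of the comparison a TLS client performs between the names listed in a certificate and the host it connects to. Real verification is done by the TLS stack and covers more cases; this version only handles exact names and a single leftmost wildcard. The certificate names below, including storagegrid.example for the default certificate, are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class HostnameCheckSketch {

    /** True if one certificate name (exact, or a "*.domain" wildcard) covers the host. */
    static boolean nameCovers(String certName, String host) {
        String name = certName.toLowerCase(Locale.ROOT);
        String h = host.toLowerCase(Locale.ROOT);
        if (name.startsWith("*.")) {
            // a leftmost wildcard stands for exactly one DNS label
            int firstDot = h.indexOf('.');
            return firstDot > 0 && h.substring(firstDot).equals(name.substring(1));
        }
        return name.equals(h);
    }

    /** True if any of the certificate's names covers the host. */
    static boolean anyNameCovers(List<String> certNames, String host) {
        for (String certName : certNames) {
            if (nameCovers(certName, host)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String gateway = "sg-gateway1.mycompany.com";
        // hypothetical names: what a default certificate might carry vs. your own
        List<String> defaultCert = Arrays.asList("storagegrid.example");
        List<String> ownCert = Arrays.asList("*.mycompany.com");
        System.out.println("default certificate matches: " + anyNameCovers(defaultCert, gateway));
        System.out.println("own certificate matches: " + anyNameCovers(ownCert, gateway));
    }
}
```

Disabling hostname verification amounts to skipping this comparison entirely, which is why any server presenting any trusted certificate could then impersonate the gateway.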

One final note regarding S3 compatibility. Like most (if not all) S3 implementations on the market, StorageGRID Webscale does not support every single method or flag that AWS has introduced over the years, but the most commonly used methods are supported. Since the number of supported methods will most likely increase in the future, please refer to the Simple Storage Service Implementation Guide document for current information (access may require a NetApp NOW account).

 

If you have any questions, let me know!

 

Follow me on Twitter: @clemenssiebler

 

Comments

Hi Clemens,

 

I am a WebScale user and I am now putting images into WebScale through the S3 API. What troubles me is how I can access these images directly through a URL, as with the old CDMI API, without using the S3 get method.

 

Can WebScale generate access URLs automatically, or do you have any suggestions?

 

Thanks

 

 

KR_Wang


Hello KR Wang,

 

yes, this is possible by setting the bucket policy to public access. I've used s3cmd to achieve this:

 

https://gist.github.com/csiebler/d9a9ca654127f3c3f698

 

You can also take the bucket policy (see the json in the link), and apply it directly via a tool like S3Browser under Windows.

 

Best regards,

Clemens

Thank you for the reply.

 

I set the policy in the S3Browser, and got a URL like https://BUCKET_NAME.IP_ADDRESS:8082/, but I could not access it as the browser couldn't resolve the URL.

 

Should I configure DNS for the gateway node?

 

Also, the bucket name is defined by users under their own accounts. If there is a name conflict between different accounts, how can I access the right object through the URL?

 

 

Yours Sincerely,

 

Kairu


Hi Kairu,

 

Try https://IP-ADDRESS:8082/bucketname/objectname. This should work out of the box. Virtual-hosted-style names (as you tried) are supported too, but require a little more setup.

 

 

Regards

Clemens

Hi, Clemens,

 

It works for me!

Thanks a lot!

 

Yours Sincerely,

 

Kairu

Hi,

 

Do you have a test for the .NET SDK?