Object storage has become a regular topic of discussion in IT. Why the sudden interest? There are two separate (but related) reasons that object storage is getting so much attention. I’ll describe both in this post, but first, some basics.
As you probably know, today’s two common methods of data storage access, file and block, have been around for decades. They access data in different ways, and object storage brings a third way to store and retrieve data.
File-level storage is a high-level, networked method of communication between servers and storage devices, in which the storage devices contain an integrated file system to store and retrieve data. To read or write data, the server specifies a pathname. The path points to a file system location in a directory tree hierarchy and is expressed as a string of characters in which the components, separated by a delimiting character, each name one subdirectory. A familiar example:
The advantage of file storage is that the server OS (or application) does not need to map the data to the storage device; that responsibility is handled by the storage controller, so you only need to remember the file path. The disadvantage is that the overhead of the file system usually results in slower access to data.
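To make the pathname idea concrete, here is a minimal Python sketch that splits a path into its directory components. The path itself is hypothetical and chosen only for illustration; it is not taken from the original post.

```python
from pathlib import PurePosixPath

# Hypothetical POSIX-style path, used only as an illustration.
path = PurePosixPath("/home/alice/reports/q1.txt")

# Each element is one level of the directory tree hierarchy.
components = path.parts
print(components)    # ('/', 'home', 'alice', 'reports', 'q1.txt')

# The file system resolves the path one component at a time.
print(path.parent)   # /home/alice/reports
print(path.name)     # q1.txt
```

The application only ever handles the string; where the bytes actually live on the storage device is the file system's concern.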
Block-level storage, by contrast, uses a low-level logical block addressing (LBA) scheme that presents physical storage devices, such as disk drives, as ranges of logical block addresses. This simply means that the sectors of each disk (or tape, or SSD) are numbered sequentially, starting with LBA 0, and every sector is identified by its unique LBA. LBAs are mapped by the server’s operating system, or sometimes by applications running on the server, and are specified in SCSI commands sent from the server to storage devices. For example:
The example above shows the structure of a simple WRITE command sent from the server. It is the responsibility of the server OS (or application) to map each device and LBA, and to keep track of the data written to storage devices. The primary advantage of block storage is the speed at which data can be stored and retrieved using this “raw” device interface. The disadvantage is that the server is required to maintain the entire map of devices and LBAs.
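As a rough illustration of block addressing, the following Python sketch converts an LBA to a byte offset and packs a SCSI WRITE(10) command descriptor block. The choice of WRITE(10) and the 512-byte sector size are assumptions made for this sketch; the post does not say which WRITE variant or sector size it had in mind. The helper names are hypothetical, and nothing is sent to a device.

```python
import struct

SECTOR_SIZE = 512  # assumed bytes per logical block; many devices use 4096


def lba_to_byte_offset(lba, sector_size=SECTOR_SIZE):
    """Return the byte offset on the device at which the given LBA starts."""
    return lba * sector_size


def build_write10_cdb(lba, num_blocks):
    """Pack a 10-byte SCSI WRITE(10) command descriptor block (CDB)."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("WRITE(10) carries a 32-bit LBA")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("WRITE(10) carries a 16-bit transfer length")
    return struct.pack(
        ">BBIBHB",   # big-endian, no padding: 1 + 1 + 4 + 1 + 2 + 1 = 10 bytes
        0x2A,        # operation code for WRITE(10)
        0x00,        # flags byte, left at zero in this sketch
        lba,         # starting logical block address
        0x00,        # group number
        num_blocks,  # number of blocks to write
        0x00,        # control byte
    )


cdb = build_write10_cdb(lba=2048, num_blocks=8)
print(cdb.hex())                 # 2a000000080000000800
print(lba_to_byte_offset(2048))  # 1048576
```

Note that the command names only a starting address and a length. Which file or record those blocks belong to is knowledge held entirely by the server.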
Object-level storage, like file-level storage, utilizes a high-level storage architecture. Unlike file storage, however, object storage does not rely on a file system hierarchy. Instead, object storage uses universally unique identifiers (UUIDs) contained in a flat namespace database that spans all storage devices in the object store, regardless of device type or location. The storage devices can be contained within a single location, but more likely are dispersed across many data centers with geographic separation. Applications communicate directly with the object store through a high-level, HTTP-based interface, which can be exercised with a command-line tool such as curl, as shown in the following example:
When the object storage system receives the above “PUT” command, it stores the object under one or more universally unique identifiers (UUIDs). The UUID is all the application needs to know in order to retrieve the desired data. By replacing LBAs with UUIDs, object storage uses a direct-access scheme similar to block-level storage, but without any mapping overhead imposed on the server.
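The put-then-retrieve-by-ID flow can be sketched with a toy in-memory store in Python. This is only an illustration of a flat, UUID-keyed namespace: the class and method names are hypothetical, no network or real object storage API is involved, and a production system would expose the same operations over HTTP.

```python
import uuid
from typing import Dict, Optional, Tuple


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace keyed by UUID."""

    def __init__(self) -> None:
        # object ID -> (data, metadata); no directories, no hierarchy
        self._objects: Dict[str, Tuple[bytes, dict]] = {}

    def put(self, data: bytes, metadata: Optional[dict] = None) -> str:
        """Store an object and return the ID the caller must keep."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve the object data using nothing but its ID."""
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        """Return a copy of the application-defined metadata."""
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
object_id = store.put(b"scan-0001 image bytes", {"department": "radiology"})
print(object_id)                  # a random UUID, different on every run
print(store.get(object_id))       # b'scan-0001 image bytes'
print(store.metadata(object_id))  # {'department': 'radiology'}
```

The application keeps only the returned ID. It tracks neither a path nor a device and block address.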
The popularity of object storage lies in the fact that it takes the best from file and block storage while enabling capabilities not available in prior storage architectures. These include application-programmable metadata, a namespace that can span multiple instances of physical hardware, and built-in data management functions, such as data replication and data distribution, at object-level granularity.
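One way to picture replication at object-level granularity is a placement function that chooses sites per object, driven by a replica count carried in that object's metadata. The Python sketch below uses rendezvous (highest-random-weight) hashing for this. That technique is a choice made for this illustration, not something the post attributes to any product, and the site names and metadata key are hypothetical.

```python
import hashlib

SITES = ["site-a", "site-b", "site-c", "site-d"]  # hypothetical site names


def place_replicas(object_id, copies, sites=SITES):
    """Pick `copies` distinct sites for one object via rendezvous hashing."""
    if not 1 <= copies <= len(sites):
        raise ValueError("copies must be between 1 and the number of sites")

    def score(site):
        # Every (site, object) pair gets a stable pseudo-random weight.
        digest = hashlib.sha256(f"{site}:{object_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring sites hold the replicas for this object.
    return sorted(sites, key=score, reverse=True)[:copies]


# The replica count comes from per-object metadata (hypothetical key).
metadata = {"copies": 3}
placement = place_replicas("example-object-id", metadata["copies"])
print(placement)  # three distinct site names, stable across runs
```

Because the decision is made per object, two objects in the same store can have different replica counts and land on different sites.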
So, why do people care about the new features of object storage? Two primary reasons:
NetApp StorageGRID Webscale
NetApp has been a leading object storage vendor for many years. NetApp StorageGRID Webscale, now in its 10th release, is an object storage solution for large archives, media repositories, and web data stores. It’s designed for the hybrid cloud and supports standard protocols, including Amazon S3 and SNIA’s CDMI—so that object applications can be run either on premises or in the cloud.
The StorageGRID Webscale policy engine provides automated data placement according to site-based performance and availability requirements, optimized for cost as data ages. Real-time auditing provides continuous and active monitoring for SLA verification and reporting. The StorageGRID Webscale data durability framework ensures data integrity and accessibility.
StorageGRID Webscale is relied upon to store and manage large-scale, distributed repositories of images, video, and records—around the clock and across the globe.