Object Storage - Page 4

Chun_Chiang

In a StorageGRID deployment with only one Primary Admin Node, where is the Grid Manager local user database (including the built-in root account) stored? If the Primary Admin Node is completely lost and only the Recovery Package remains, how is the root account recovered? Thanks. Chun Chiang

Novicer · ‎2026-06-10

This is no longer really a question; it is a write-up of my experience with high metadata consumption on some nodes in a multi-country stretched NetApp StorageGRID installation that uses only SG5812 appliances. These come with 64GB RAM, of which I believe 61GB is actually usable. This means the metadata limit maxes out at 3TB per node in total, and according to NetApp recommendations I am supposed to use only 1.32TB of that capacity.

If there is an imbalance in the total number of nodes between the two countries (for example, one country has 6 nodes across 3 sites and the other has 3 nodes per site in a 3-site solution) and the grid is stretched, then metadata usage is bound to grow high in the sites with just 3 nodes. While digging into how things work at the StorageGRID Cassandra level, I found some good documentation; this area was a gap for me since I wasn't fully versed in it. (https://www.netapp.com/video/z0pro187-d8/best-practices-and-advice-for-designing-a-storagegrid-deployment-1351-2/#:~:text=This means that the grid,storage nodes worth of metadata)

So I added virtual metadata-only nodes, and as part of the grid expansion the metadata was distributed to the new nodes. However, the compaction jobs did not remove the unwanted content from the metadata layer on the StorageGRID appliance nodes in any of the 3-node sites. The compaction jobs kept looping over and over, and I triggered some manually on nodes that had none running. None of this helped. So I kept reading generic Cassandra documentation and the StorageGRID advanced training documentation, which mentioned the "nodetool cleanup" command. In this case, the nodes were unable to clean up what was supposed to be cleaned up, even after the rebalance.

I chose to go down this path after running "nodetool status" and seeing an extreme imbalance in the Load value on the nodes where cleanup was supposed to happen, even though ownership was around 70%, which is what it was supposed to be. So my thinking was that this was definitely data that needed cleaning up. I ran nodetool cleanup on the nodes one by one, and this brought down the metadata consumption on the existing physical appliances in the 3-node site after the metadata-only nodes had been added. According to support this was not a recommended operation, but these appliances seem badly underpowered: they are CPU-busy most of the time and have so little RAM that there is no possibility of increasing the metadata capacity. I have noticed that CPU usage is high most of the time, that it is mostly due to Cassandra read operations, and that these seem to contribute to some network retransmissions too.

Another thing I noticed is that the kind of data stored on these appliances matters. A very large number of small objects creates a large metadata load, which makes you hit this problem earlier than you otherwise would. So the solution is to keep all of your sites at the same or a similar node count so that metadata is distributed evenly, and to catch high metadata consumers and see whether they can be reconfigured to use a larger write size. I am writing this here because I had a lot of trouble navigating the issue and found no article to help me through it.
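The Load-versus-ownership comparison described above can be roughly automated. The following is a minimal sketch, assuming the usual column layout of Cassandra's "nodetool status" output (state, address, load, tokens, owns); the sample rows, addresses, host IDs, and the 1.5x cutoff are all hypothetical and are not taken from the grid in this post. It only reads text that has already been collected and does not run nodetool or any cleanup itself.

```python
import re
from statistics import median

# Size units as nodetool usually prints them (binary multiples assumed).
UNIT_BYTES = {"bytes": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

# One data row: state, address, load value, load unit, tokens, owns%.
ROW = re.compile(
    r"^(?P<state>[UD][NLJM])\s+(?P<addr>\S+)\s+(?P<load>[\d.]+)\s+(?P<unit>\w+)"
    r"\s+(?P<tokens>\d+)\s+(?P<owns>[\d.]+)%"
)

def parse_status(text):
    """Return one dict per node row; header and unparseable lines are skipped."""
    nodes = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m and m["unit"] in UNIT_BYTES:
            nodes.append({
                "addr": m["addr"],
                "load_bytes": float(m["load"]) * UNIT_BYTES[m["unit"]],
                "owns_pct": float(m["owns"]),
            })
    return nodes

def flag_heavy(nodes, factor=1.5):
    """Addresses whose load per ownership percent exceeds factor x the median."""
    ratios = {n["addr"]: n["load_bytes"] / n["owns_pct"]
              for n in nodes if n["owns_pct"] > 0}
    if not ratios:
        return []
    cutoff = factor * median(ratios.values())
    return [addr for addr, ratio in ratios.items() if ratio > cutoff]

# Hypothetical sample: similar ownership everywhere, very different Load.
SAMPLE = """\
Datacenter: dc1
--  Address   Load       Tokens  Owns   Host ID  Rack
UN  10.0.0.1  2.4 TiB    256     70.1%  host-a   rack1
UN  10.0.0.2  2.5 TiB    256     69.8%  host-b   rack1
UN  10.0.0.3  900.5 GiB  256     70.3%  host-c   rack1
UN  10.0.0.4  880 GiB    256     69.9%  host-d   rack1
UN  10.0.0.5  910 GiB    256     70.0%  host-e   rack1
"""

if __name__ == "__main__":
    print(flag_heavy(parse_status(SAMPLE)))
```

With the hypothetical sample, the first two nodes are flagged because they hold far more data per percent of ownership than the rest. A flag like this is only a hint that stale data may be present; whether running cleanup is appropriate is a separate decision, and as noted above, support did not recommend it.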

Gandhi · ‎2026-05-26

I recently installed and configured two StorageGRID grids at two distant sites on the same network. I federated the two grids and the connection works in both directions, but when I put an object in the local bucket from the same tenant, it does not replicate to the bucket of the same name at the remote site. Both sites show the same error: Cross-grid replication requests are pending because a resource is unavailable. Failed to send cross-grid replication request from source bucket 'pilot' to destination bucket 'pilot'. Error code: DestinationRequestError. Detail: InternalError. Can someone comment on this?

FelixZhou · ‎2026-04-29

Hello there, we have production ONTAP clusters configured with StorageGRID for moving old CIFS data and snapshots. StorageGRID is now filling up, at 99% usage. We have also used some StorageGRID space as object storage for backup software (long-term copies), but we have lost connections lately and believe this may also be related to the usage. We are wondering whether we have to take any action in this situation, such as pausing new data to StorageGRID, freeing up some space, or anything else. Please share your experience. Thank you.

bnies · ‎2026-03-24

Hi, we have an old NetApp FAS2650 running ONTAP 9.11.1P20 (yes, it is out of support) with an S3 object server on it for testing. It was running fine for a couple of weeks until I noticed today that the LIF was not on its home node. After I reverted the LIF to its home node, the S3 service stopped working. Nothing is listening on port 443 anymore. I tried to disable/enable the object-store-server, stop/start the vserver, migrate the LIF back to the other node, and adjust the service-policy, all with no luck.

S3 object-store-server

    mycluster::> object-store-server show
      (vserver object-store-server show)

                                  Vserver: mycluster-vs1
                 Object Store Server Name: mycluster-vs1.example.com
                     Administrative State: up
                             HTTP Enabled: false
                   Listener Port For HTTP: 80
                            HTTPS Enabled: true
           Secure Listener Port For HTTPS: 443
        Certificate for HTTPS Connections: mycluster-vs1.example.com
                                  Comment: Netapp S3 playground

Interfaces and service-policy

    mycluster::> network interface show -vserver mycluster-vs1
                Logical    Status     Network            Current       Current Is
    Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
    ----------- ---------- ---------- ------------------ ------------- ------- ----
    mycluster-vs1
                vs1data1   up/up      10.0.77.88/24      mycluster-01  a0a-123 true
                vs1data2   up/up      10.0.88.99/24      mycluster-01  a0a-456 true
    2 entries were displayed.

    mycluster::> network interface show -vserver mycluster-vs1 -lif * -fields service-policy,services
    vserver       lif      service-policy     services
    ------------- -------- ------------------ --------------------------------
    mycluster-vs1 vs1data1 default-data-files data-core,data-nfs,data-cifs,data-flexcache,data-fpolicy-client,management-dns-client,management-ad-client,management-ldap-client,management-nis-client,data-dns-server
    mycluster-vs1 vs1data2 my-s3-service      data-core,data-s3-server
    2 entries were displayed.

No TCP/s3 service is listening on LIF vs1data2 port 443.

    mycluster::> network connections listening show -vserver mycluster-vs1
    Vserver Name     Interface Name:Local Port             Protocol/Service
    ---------------- ------------------------------------- -----------------------
    Node: mycluster-05
    mycluster-vs1    vs1data1:4049                         UDP/rquota
    mycluster-vs1    vs1data1:2050                         TCP/fcache
    mycluster-vs1    vs1data1:111                          TCP/port-map
    mycluster-vs1    vs1data1:111                          UDP/port-map
    mycluster-vs1    vs1data1:4046                         TCP/sm
    mycluster-vs1    vs1data1:4046                         UDP/sm
    mycluster-vs1    vs1data1:4045                         TCP/nlm-v4
    mycluster-vs1    vs1data1:4045                         UDP/nlm-v4
    mycluster-vs1    vs1data1:2049                         TCP/nfs
    mycluster-vs1    vs1data1:2049                         UDP/nfs
    mycluster-vs1    vs1data1:635                          TCP/mount
    mycluster-vs1    vs1data1:635                          UDP/mount
    12 entries were displayed.

I presume it's a bug in this old ONTAP code. Any idea how to revive it without destroying the buckets and users?
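For anyone repeating this check after each attempted fix, the "is anything listening on vs1data2:443" test can be done on pasted CLI output instead of by eye. This is a minimal sketch, assuming the "lif:port PROTO/service" entry format shown in the output above; the function names are hypothetical, and the script only parses saved text and does not talk to the cluster.

```python
import re

# Matches entries such as "vs1data1:4049 UDP/rquota" in pasted CLI output.
ENTRY = re.compile(
    r"(?P<lif>[\w.-]+):(?P<port>\d+)\s+(?P<proto>TCP|UDP)/(?P<service>[\w-]+)"
)

def listeners(text):
    """Return (lif, port, proto, service) tuples found in the pasted output."""
    return [(m["lif"], int(m["port"]), m["proto"], m["service"])
            for m in ENTRY.finditer(text)]

def has_listener(entries, lif, port, proto="TCP"):
    """True if the given LIF has a listener on the given port and protocol."""
    return any(e[0] == lif and e[1] == port and e[2] == proto for e in entries)

# Output of "network connections listening show" as quoted in the post.
SAMPLE = """\
Node: mycluster-05
mycluster-vs1 vs1data1:4049 UDP/rquota
mycluster-vs1 vs1data1:2050 TCP/fcache
mycluster-vs1 vs1data1:111 TCP/port-map
mycluster-vs1 vs1data1:111 UDP/port-map
mycluster-vs1 vs1data1:4046 TCP/sm
mycluster-vs1 vs1data1:4046 UDP/sm
mycluster-vs1 vs1data1:4045 TCP/nlm-v4
mycluster-vs1 vs1data1:4045 UDP/nlm-v4
mycluster-vs1 vs1data1:2049 TCP/nfs
mycluster-vs1 vs1data1:2049 UDP/nfs
mycluster-vs1 vs1data1:635 TCP/mount
mycluster-vs1 vs1data1:635 UDP/mount
12 entries were displayed.
"""

if __name__ == "__main__":
    found = listeners(SAMPLE)
    print(len(found), "listeners parsed")
    print("vs1data2:443 listening:", has_listener(found, "vs1data2", 443))
```

Against the output quoted in the post, this parses 12 listeners and reports that none of them is on vs1data2 port 443, which matches the symptom described.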

Board Activity

StorageGRID single admin node lost. Where is the Grid Manager local user stored, including built-in root access?

Metadata Filling up on SG5812 appliances

No replication between my 2 storageGRID

storageGRID filled up 99%, what actions we should take?

S3 Server stopped working after LIF failover