BlueXP Services
I was moving large folders (several GB) from one top-level folder (served by SteelStore CIFS) to another using Windows File Explorer when the copy failed halfway through. At that point the entire SteelStore appliance stopped responding, including the web console interface (it displayed a basic SteelStore page indicating the web server was up but no console was running behind it, so not quite a 404 error).
I waited 24 hours and the web console eventually returned of its own accord; however, the alarms triggered were:
The first hurdle (after importing the config from the old console and attaching the required volumes to the new instance except /dev/sda1 and /dev/sdk) was when I executed the "megastore guid reset" line in the doco....
I received the warning/error "Deleting megastore.guid in cloud bucket returned 110", about which I can find no information online.
I pushed forward and ran "service enable", and after many hours of waiting for "Starting optimization service..." I got a "Storage Optimization Service: initialization error".
The current console shows the same alarm and service states:
Alarms Triggered:
Hi,
I'm sorry to hear you're having problems. While I don't know the cause of the original problem (the SteelStore going down during the copy operation), I can assist at this time with the proper move of the appliance.
The first question I need to clarify however is whether you're using a virtual SteelStore (AltaVault) appliance, or a cloud-based SteelStore (AltaVault) appliance? The information below suggests that you're using a virtual appliance (i.e. VMware), but the manual you pointed out is the cloud-based appliance manual. The virtual appliance install guide is actually this one: https://library.netapp.com/ecm/ecm_download_file/ECMP12455064. Assuming you're on a virtual appliance and attempting to upgrade from SteelStore 3.x to AltaVault 4.0, then you would follow the directions listed in the appendix A.
Here is some additional clarification on steps 5-8:
5. Deploy a new 4.0.0 AltaVault virtual appliance. Do not go through the step of adding a second disk (you'll reattach the original SteelStore one to this AltaVault instance).
5a. Power on the appliance, and run through the CLI wizard to IP the appliance management interface.
5b. Connect to the appliance GUI.
6. Import a shared-only configuration on a 4.0 VM:
a. Go to the UI and choose Settings > Setup Wizard.
b. Select Import Configuration.
c. Click on the option, Import Shared Data Only, while specifying the configuration file to import.
d. After the import completes, do not restart the service! Connect back to the appliance CLI.
7. Reset the Megastore GUID on the 4.0 AltaVault virtual appliance by using the CLI command:
CLI > megastore guid reset
8. Associate the 3.x datastore disks from vCenter to the 4.0 VM:
a. Navigate to the VM in the vCenter UI.
b. Right-click on the VM and choose Edit Settings.
c. Click on the tab, Virtual Hardware.
d. Select New device > Existing Hard Disk, and click Add.
e. Select the path of the disk file that you noted in the previous Step 4e.
f. Click OK.
If you need more information about doing DR (which would be different from attempting to migrate the datastore disk from the 3.x to AltaVault 4.0), you can actually refer to the Deployment Guide chapter on DR. That guide is available here: https://library.netapp.com/ecm/ecm_download_file/ECMP12434738
Let us know if the above helps.
Regards,
Christopher
Hi Chris,
Thank you for your reply. The original system that failed was actually an Amazon AWS steelstore virtual appliance backed by an S3 bucket. The bucket is still available and I used these instructions - sorry I included the wrong link in my original post!
All other information I've given is accurate. After following all the steps in the chapter "Launching and Configuring the New AltaVault AMI Instance Upgrade" it still shows:
Alarms Triggered:
Hi Jason,
OK, sorry for the confusion. Most of the information in my last update above won't apply, so disregard it.
To reiterate: you've got the cloud-based AMI AltaVault instance, you tried to use the instructions from "Ch 2. Upgrade Process for AltaVault AMI", and after the steps were taken the service failed to start. You noted that the megastore GUID reset didn't work correctly; this could potentially be one of the problems. The megastore.GUID file is used to track which AltaVault owns the cloud bucket. In your case, we issue a reset to say that this new (AltaVault) appliance will now be the owner of the bucket, rather than the previous instance which you powered down during these instructions.
I'm not clear why the megastore GUID reset didn't work, but I think we need to resolve that error first, which will probably help things along. Assuming that the migration steps were handled correctly, let's do the following:
1. Reboot the appliance from the CLI:
enable
conf t
no service enable
reload
2. When the appliance comes back up, reconnect to it via CLI and issue:
enable
conf t
no service enable (it should indicate the service is still disabled)
megastore guid reset
service enable
3. If the service fails to start up at that point, then copy the system log from the GUI (Settings > System Log) and email it to me (christopher.wong@netapp.com) to review. Make sure it captures messages from the megastore guid reset forward. Note that these may span more than one page, depending on where the page breaks occur. Errors are colored red.
Thanks,
Christopher
Hi,
Sorry, I don't know why I called you Jason (must've misread it from somewhere else), just realized that right now! 🙂 Anyways, I wanted to provide you another technical diagnosis step. From the CLI, can you issue:
enable
conf t
cloudctl exec "-a list"
and provide the output that appears? If it is successful it will connect to the cloud provider (I'm assuming Amazon) and provide you a listing of the buckets.
If it doesn't appear and you get an error, this could indicate a problem with the credentials, or with the IAM permissions of the user whose credentials you've applied to the appliance. IAM security requirements are listed in the appendix of the AltaVault/SteelStore user guide. For example: https://library.netapp.com/ecm/ecm_download_file/ECMP12031271
Thanks,
Christopher
Hi,
Checking in to see if you were able to get the AltaVault appliance upgrade done to resolve the error?
Regards,
Christopher
Hi Chris,
Thanks for the follow-up. I ran your suggested command and left it to run overnight (it took a while), but it seems to have timed out.
steelstore01 (config) # cloudctl exec "-a list"
Failed to get bucket list: 7: Couldn't connect to server : Connection timed out
I was very careful to disconnect and connect the appropriate drives as per the upgrade documentation (i.e. moving sdb,sdc,sdd,sde,sdf,sdg,sdh,sdi across from the old steelstore and leaving the existing /dev/sda1 and sdk).
Any ideas? Thank you.
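For anyone collecting these failures from several attempts, the error line above can be split into its parts with a short script. This is only a sketch: the pattern is built from the single line quoted in this post, so the general "code: message : detail" layout is an assumption, and `parse_bucket_list_error` is a hypothetical helper name. The code 7 and the wording "Couldn't connect to server" match libcurl's CURLE_COULDNT_CONNECT, but whether cloudctl actually uses libcurl is also an assumption.

```python
import re

# Pattern modelled on the one line quoted in this thread:
#   Failed to get bucket list: 7: Couldn't connect to server : Connection timed out
# The general layout (numeric code, message, optional detail) is an assumption.
ERROR_LINE = re.compile(r"^Failed to get bucket list:\s*(\d+):\s*(.*)$")


def parse_bucket_list_error(line):
    """Return (code, message, detail) for a matching line, or None otherwise."""
    match = ERROR_LINE.match(line.strip())
    if match is None:
        return None
    code = int(match.group(1))
    # Split the remainder at the first colon into message and detail.
    message, _, detail = match.group(2).partition(":")
    return code, message.strip(), detail.strip()


line = "Failed to get bucket list: 7: Couldn't connect to server : Connection timed out"
print(parse_bucket_list_error(line))
```

A code of 7 with a "timed out" detail points at the network path to the cloud provider rather than at rejected credentials.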
Hi,
Thanks for the information. Did you also try the megastore GUID reset, or just the cloudctl command? The cloudctl command usually takes only a little time to run (around 5-10 seconds at most), so this is unusual. The timeout suggests that the appliance is unable to connect to the cloud storage target, possibly because the cloud configuration is not set. You can double-check this by going back to the GUI, selecting Storage > Cloud Settings, and verifying or re-applying the cloud credentials to see if it can connect. Note that after doing this you will probably also need to repeat the steps from my earlier response about resetting the megastore GUID:
1. Reboot the appliance from the CLI:
enable
conf t
no service enable
reload
2. When the appliance comes back up, reconnect to it via CLI and issue:
enable
conf t
no service enable (it should indicate the service is still disabled)
megastore guid reset
service enable
3. If the service fails to start up at that point, then copy the system log from the GUI (Settings > System Log) and email it to me (christopher.wong@netapp.com) to review. Make sure it captures messages from the megastore guid reset forward. Note that these may span more than one page, depending on where the page breaks occur. Errors are colored red.
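Because the cloudctl timeout points at basic reachability, it can help to confirm separately that the cloud endpoint accepts TCP connections from the same network or security group. The sketch below is not an appliance command; it is a plain Python check meant to be run from another host in the same subnet. The function name `can_connect`, the endpoint `s3.amazonaws.com`, and port 443 are assumptions, so substitute the endpoint your bucket actually uses.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example usage (assumed endpoint; replace with your bucket's regional endpoint):
#   can_connect("s3.amazonaws.com")
```

If this returns False from a host that shares the appliance's network path, the problem is likely in routing, security groups, or DNS rather than in the cloud credentials. It only tests the TCP handshake, so a True result does not prove that the credentials or IAM permissions are valid.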
Regards,
Christopher
I am having the same problem on a 3030 appliance. I am getting a storage optimization service not ready - initialization error, but I am also still working through some IAM list permission issues on the S3 side I have to resolve with a 3rd Party. Will the optimization service remain in the not-ready state until I successfully connect to AWS?
Hi,
Yes that is correct - you will not have any capability to have the service enter the healthy state until you can properly connect and communicate with a cloud storage target. Note that while the service initialization error is the same, the underlying cause is significantly different than that discussed higher up in this thread (which pertains to the cloud-based AltaVault AMI appliance).
Regards,
Christopher
Hi Chris,
The AVA 4.2 for Hyper-V has the initialization error. Please advise.
I have already emailed the log and screen captures to you.
Many thanks.
Best Regards,
Tony