I'm demoing Insight Perform right now and it looks great so far. I'm able to get some really good detailed information about the volumes. What I can't seem to figure out is how to run reports, or what reports are available to me.
I looked at the documentation and found this:
From the OnCommand Insight portal, select Reports > Insight Perform Report.
Set these parameters to generate the report.
Report on: select Last hour, Last day, or Last week.
The number of utilized/underutilized Hosts, Arrays and ISLs: enter a number.
Unutilized Volumes Traffic Threshold (MB): enter a number.
Unbalanced ISLs Index Threshold: enter a number.
Click Generate Report.
Use the report tabs in the spreadsheet to examine this utilization information:
Summary shows the start and end timestamp for the measurements taken, the performance threshold you have set for the host and the ISL (Inter-Switch Link) Balance Index.
Top 10 most and least utilized hosts allows you to assess the effectiveness of load balancing in the storage network. Highly utilized storage devices experience near-threshold data traffic, indicating the possible need to relocate applications to less utilized storage arrays. The Most Utilized list also helps determine whether redirecting data traffic going to ports of specific arrays through under-utilized storage ports will alleviate the potential for bottlenecks. A low distribution percentage in the Least Utilized list indicates that the storage device has been used very little over the designated time period and has neither received nor sent much data, making it a good candidate for new storage allocation.
Unutilized Volumes list summarizes the storage resources in the SAN that are not being utilized at all and which are available.
ISL Balance Index is the Inter-Switch Link (ISL) Balance Index, a calculated Standard Deviation, that measures how balanced the load is across switch ports.
Unbalanced Devices is the balance index, a calculated Standard Deviation. It is a measurement of how balanced the load is across switch ports. The further away from 1 that the ISL Balance Index is, the more likely the host load is going to be imbalanced across switches. A high value (above 50) signifies a ratio problem between the host ports; that is, some ports are experiencing a large amount of traffic and others a very low amount, indicating a potential problem. This demonstrates the need to balance data flow across ISLs.
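To make the "calculated Standard Deviation" idea more concrete, here is a minimal sketch of how a balance index of that kind could be computed from per-port traffic. The quoted documentation does not give the actual formula, so the function name `balance_index` and the normalisation (standard deviation as a percentage of the mean) are assumptions made purely for illustration, not Insight's real calculation.

```python
from statistics import pstdev


def balance_index(port_traffic_mb):
    """Spread of traffic across ports, as a percentage of the mean.

    Hypothetical illustration only: the real Insight formula is not
    documented in this thread.
    """
    if not port_traffic_mb:
        raise ValueError("need at least one port measurement")
    mean = sum(port_traffic_mb) / len(port_traffic_mb)
    if mean == 0:
        return 0.0  # no traffic at all, so nothing to be unbalanced
    # Population standard deviation, scaled by the mean so the result
    # is comparable between busy and quiet links.
    return pstdev(port_traffic_mb) / mean * 100


balanced = balance_index([100, 100, 100, 100])  # identical load on every port
skewed = balance_index([400, 0, 0, 0])          # one port carries everything
```

With this definition, evenly loaded ports give an index of zero, and the index grows as traffic concentrates on fewer ports.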
It seems pretty easy but I don't have a Reports button. What am I missing?
Well, the documentation refers to the "OnCommand Insight portal". More accurately, it should say the "OnCommand Insight reporting portal", which ships with the Data Warehouse (DWH) and needs to be deployed on a dedicated VM. It needs some very basic configuration: specifying a connector and scheduling ETL (a.k.a. builds), after which you should be able to use the reports. Have a look at the Data Warehouse Administration Guide on the NetApp Support site for more detailed information.
Yes, as far as I know it's the data warehouse failing to collect inventory data from the Insight server.
I've uploaded a copy of the error that appears upon expanding the failed task. I don't seem to have a SANscreen/acq/log directory. I do have SANscreen/log, but it doesn't seem to contain any recently modified logs.
The data source that it's trying to collect from is a Windows 2008 R2 box with Insight running on it. Insight itself is connecting to three NetApp systems, two running Cluster mode and one on 7-mode.
Unfortunately I didn't install any of this and am relatively new to OnCommand Insight, so I'm learning much of this as I go.
I'm not entirely sure, but could it be that you first did a "Build now" on the Schedule page, after which you tried to "Build from history"?
That wouldn't work and could produce an error like this. In that case, you should reset the database, rebuild the connector, skip "Build now" and go straight to "Build from history".
Otherwise, if this error occurs when you just try "Build now" or during a regularly scheduled build, my best guess is that it's a bug, and you should contact whoever is responsible for this POC and have them open a case about it.
These are the relevant lines in the error message in any case:
Caused by: com.netapp.sanscreen.dwh.applications.inventory.InventoryException: Failed to execute SQL script com/netapp/sanscreen/dwh/inventory/processors/chassis/ChassisAndMultipleChassisRelationProcessor.load.oneChassisRelation.sql
Caused by: com.netapp.sanscreen.dwh.util.SqlShellException: Failed to execute: UPDATE dwh_inventory.compute_resource public_chassis, dwh_inventory_staging.chassis_global global_chassis, dwh_inventory_staging.chassis_local local_chassis, dwh_inventory_staging.chassis_local local_ref, dwh_inventory_staging.tmp_compute_resource extracted SET public_chassis.vmId = local_ref.global_id WHERE public_chassis.id = local_chassis.global_id AND local_chassis.local_id = extracted.id AND local_chassis.connector_id = 1 AND local_chassis.type = 'COMPUTE_RESOURCE' AND global_chassis.id = local_chassis.global_id AND global_chassis.connector_id IS NOT NULL AND local_ref.local_id = extracted.vmId AND local_ref.connector_id = 1 AND local_ref.type = 'HV_VIRTUAL_MACHINE'
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails (`dwh_inventory`.`compute_resource`, CONSTRAINT `fk_compute_resource_vmId` FOREIGN KEY (`vmId`) REFERENCES `hv_virtual_machine` (`id`) ON DELETE CASCADE)
As you can see, there is a problem with updating foreign keys in the database.
Try resetting the database first and then doing a new "Build from history". If that works, you can configure a schedule to run the builds daily.
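For anyone unfamiliar with this class of error, the sketch below reproduces the same kind of foreign key failure in miniature. It uses Python's built-in `sqlite3` as a stand-in for the DWH's MySQL database, and only the table and column names `compute_resource`, `hv_virtual_machine` and `vmId` come from the error message. The rest of the schema and the sample IDs are assumptions for illustration.

```python
import sqlite3


def demo_fk_violation():
    """Return the error text raised when vmId points at a missing VM row."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE hv_virtual_machine (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE compute_resource ("
        " id INTEGER PRIMARY KEY,"
        " vmId INTEGER REFERENCES hv_virtual_machine(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO hv_virtual_machine (id) VALUES (1)")
    conn.execute("INSERT INTO compute_resource (id, vmId) VALUES (10, NULL)")
    # Pointing at an existing VM row is accepted.
    conn.execute("UPDATE compute_resource SET vmId = 1 WHERE id = 10")
    try:
        # VM 99 does not exist, so the constraint rejects the update.
        conn.execute("UPDATE compute_resource SET vmId = 99 WHERE id = 10")
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


message = demo_fk_violation()
```

The failing UPDATE in the stack trace is the same situation: it tries to set `vmId` to an ID that has no matching row in `hv_virtual_machine`, which suggests the staging data and the already loaded inventory disagree.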
I reset the database, rebuilt the connector and set a build from history. I made it past inventory and dimensions but failed at datamart preparations. I've included the error below if you're curious. Otherwise I'm going to open a NetApp support request.
Did you adjust the date range for the "Build from history"? If you haven't, please try again and choose a date range for which the OCI database actually contains data. Remember, the retention time of the OCI database is seven days.
Does this occur in the customer's environment, where data acquisition is ongoing, or with a backup in a lab environment?
If this error still persists, I think the best course of action would be to open a case.
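As a quick sanity check on the date range, a sketch like the following shows the arithmetic. The seven-day retention figure is the one mentioned above. The function name `range_within_retention` and the example dates are made up for illustration and are not part of any Insight tooling.

```python
from datetime import date, timedelta

RETENTION_DAYS = 7  # retention time of the OCI database, per this thread


def range_within_retention(start, end, today, retention_days=RETENTION_DAYS):
    """True if [start, end] lies entirely inside the retention window."""
    oldest = today - timedelta(days=retention_days)
    return oldest <= start <= end <= today


today = date(2024, 1, 10)  # hypothetical "current" date
four_days_ok = range_within_retention(date(2024, 1, 6), today, today)
too_old = range_within_retention(date(2024, 1, 1), today, today)
```

A four-day range ending today falls inside the window, while a range starting nine days back reaches past what the database still holds.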
Yup, I chose four days, hoping that would be enough time for it to build correctly.
Unfortunately I don't know another customer who is running this, but I did open a case with NetApp. We did a WebEx to collect some data and it was escalated to another engineer. Hopefully they'll be able to figure something out.
I do appreciate all of the help you've been able to provide.