System Health is a NetMRI feature that provides a view of the system health of the NetMRI appliance. NetMRI provides two visual indicators that notify the administrator of issues in the NetMRI appliance and assist in responding to them:
...
System Health features for the NetMRI Operations Center environment list all issues associated with the Operations Center appliance and all of its associated Collectors.
All reported issues are the same for all alerts described in the previous topics; the main difference is that the System Health feature applies globally to all appliances and virtual appliances within the distributed Operations Center environment.
Note:
...
Every message listed on the System Health page provides an Alert Code, similar to the following:
If you need to communicate with Customer Support for an issue, ensure that you provide this code to the support representative.
In this section, you will find descriptions for all alerts in the System Health category, descriptions of possible causes for the issue, and potential fixes for each alert.
System health alerts use the following standard color-coding on the System Health page under NetMRI Settings:
- Green: indicates no issues currently present in the category.
- Yellow: Warning. Warning health alerts appear when an issue poses potential for more severe problems in the future, or when a configuration issue should be addressed. For example, a disk utilization level of 70% in a NetMRI appliance, Operations Center, or a Collector in an Operations Center network raises a Warning alert, as does a detected VRF network that is not yet mapped to a network view.
- Red: Critical. An issue that needs to be addressed as soon as possible. Critical alerts occur in cases where, for example, storage utilization is at 90% or higher, or a system fan fails or is removed from the appliance.
- Grey: Offline. Alerts colored Grey appear only for Operations Center Collectors that are offline due to expected causes, such as a Collector being taken offline for replacement or changes to the configuration.
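The color-coding above can be sketched as a small classification function. This is only an illustration, not NetMRI's implementation: the function name, the enum, and the treatment of 70% and 90% as hard thresholds for disk utilization are assumptions drawn from the examples in the list.

```python
from enum import Enum


class Severity(Enum):
    GREEN = "OK"
    YELLOW = "Warning"
    RED = "Critical"
    GREY = "Offline"


# Assumed thresholds, taken from the examples in the list above.
WARNING_PCT = 70.0
CRITICAL_PCT = 90.0


def classify_disk_utilization(percent_used, collector_offline=False):
    """Map a disk utilization percentage to a System Health color.

    collector_offline models an Operations Center Collector that is
    offline for an expected reason (hypothetical parameter).
    """
    if collector_offline:
        return Severity.GREY
    if percent_used >= CRITICAL_PCT:
        return Severity.RED
    if percent_used >= WARNING_PCT:
        return Severity.YELLOW
    return Severity.GREEN
```

For example, `classify_disk_utilization(75)` yields `Severity.YELLOW` under these assumed thresholds, while an offline Collector is reported as `Severity.GREY` regardless of utilization.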
Banner system health messages appear only in yellow (warning) and red (critical). To display the System Health page with its alert listings, click the banner text.
To hide system health banners for NetMRI users:
- In the NetMRI UI, go to the Settings icon > User Admin > Roles.
- Click the Action icon for the role you want to edit and then select Edit.
- Click Privileges.
- In the View: System Health Banner row, click the Delete icon.
- Click Yes.
...
...
Platform Capacity alerts do not necessarily reflect a problem in the NetMRI system. Each NetMRI appliance has an advisory limit on the number of discovered interfaces, discovered devices, and discovered end host devices that it is expected to support, based on the disk space and system processing capabilities inherent in the appliance model. These values are called the Platform Capacity and are also reflected in the NetMRI Configuration values shown on the Settings icon > Setup > Settings Summary page.
Unlike other System Alert categories, Platform Capacity warnings always appear when all three of the advisory system limits (number of managed interfaces, number of end host devices, number of discovered devices) are exceeded by the appliance. Note that the Processing category (also see the Details on Processing Alerts topic) provides the same three warnings, along with others, in its alerts category. When any of these three limits is violated as the result of a processing issue, one of the Platform Capacity warnings also appears in the notification. These limits are not enforced and the NetMRI appliance operates normally; excess devices continue to appear in the Discovered Devices table. (For related information, see Understanding Platform Limits, Licensing Limits and Effective Limits.)
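As a rough illustration of advisory (non-enforced) limits, the following sketch compares observed counts against a set of limits and returns one warning per exceeded limit. The class, the function, and the numbers used with them are hypothetical. Reporting one warning per limit is an assumed reading of the text above, and real Platform Capacity values depend on the appliance model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformCounts:
    """Counts for the three advisory categories named in the text."""
    managed_interfaces: int
    end_host_devices: int
    discovered_devices: int


LABELS = (
    ("managed_interfaces", "Number of managed interfaces"),
    ("end_host_devices", "Number of end host devices"),
    ("discovered_devices", "Number of discovered devices"),
)


def capacity_warnings(limits, observed):
    """Return advisory warnings; nothing is blocked or truncated."""
    warnings = []
    for field, label in LABELS:
        limit = getattr(limits, field)
        count = getattr(observed, field)
        if count > limit:
            warnings.append(f"{label}: {count} exceeds advisory limit {limit}")
    return warnings
```

Because the limits are advisory, the function only reports. It never rejects or drops the excess devices, mirroring the statement that the appliance continues to operate normally.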
...
Double-clicking any hardware alert opens the alert on the Settings icon > Notifications > Hardware Status page.
Alert Message | User Action |
---|---|
RAID Drive <X> Failed | Replace the hard disk with a replacement drive authorized by Infoblox. |
RAID Array Failed | Contact Customer Support. |
Fan <X> Failed | Replace the system fan. Appears only in systems where system fans are user-replaceable, as with the NetMRI NT-2200 and NT-4000 devices. Fan assemblies must be replaced with authorized Infoblox parts. Contact Customer Support if this message appears in systems where fans are not user-replaceable. |
Power Supply <X> Failed | Check Power Supply operation. Message appears only for systems in which a redundant 1+1 power supply configuration is available and running in the device in question. (For a single-power-supply system, the appliance simply shuts down.) The alerts also allow for the possibility that a power supply is unplugged. |
Ambient temperature is high. Internal temperature is high. | Both messages may appear for the same system, with the internal temperature being affected by the ambient temperature. Reduce the ambient temperature where possible; if the Internal temperature remains high, look for a Fan Failed error message along with the Internal Temperature message. Contact Customer Support if an Internal Temperature is High issue persists when conditions are otherwise optimal. |
Critical — RAID Battery failed. | Contact Customer Support. |
RAID Array Degraded. | The RAID array is not fully operational due to a disk in the process of rebuilding or a disk being removed. If a disk has been removed in preparation for replacement, this issue will also appear, and will clear when the replacement is finished rebuilding. If you know that no disk replacement operation has been started with the appliance and this issue appears, contact Customer Support. |
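For scripted triage, the table above could be condensed into a lookup like the following sketch. It is not part of NetMRI. The patterns are assumptions based on the message texts in the table (with `<X>` matched as any non-space token), and the action strings are abbreviated summaries of the User Action column.

```python
import re

# (pattern, summarized action) pairs; patterns are assumed from the table.
HARDWARE_ACTIONS = [
    (re.compile(r"RAID Drive \S+ Failed"),
     "Replace the hard disk with an Infoblox-authorized drive."),
    (re.compile(r"RAID Array Failed"),
     "Contact Customer Support."),
    (re.compile(r"Fan \S+ Failed"),
     "Replace the fan if user-replaceable; otherwise contact Customer Support."),
    (re.compile(r"Power Supply \S+ Failed"),
     "Check power supply operation."),
    (re.compile(r"RAID Battery failed", re.IGNORECASE),
     "Contact Customer Support."),
    (re.compile(r"RAID Array Degraded"),
     "Wait for the rebuild; contact Customer Support if no replacement is in progress."),
    (re.compile(r"temperature is high", re.IGNORECASE),
     "Reduce the ambient temperature and check for a Fan Failed alert."),
]


def suggested_action(message):
    """Return the summarized action for an alert message, or None."""
    for pattern, action in HARDWARE_ACTIONS:
        if pattern.search(message):
            return action
    return None
```

Messages that match no pattern return `None`, so a caller can fall back to the full documentation or to Customer Support.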
...
Alert Message | User Action |
---|---|
System Processing capacity is being exceeded. | A number of causes may contribute to processing slowdowns on the appliance.
|
...