Node reports CriticalPSUFruOffAlert in system health alerts due to a faulty PSU
Applies to
- ONTAP 9
- FAS/AFA systems
Issue
- The node reports CriticalPSUFruOffAlert in the system health alerts:
ClusterA::> system health alert show -instance
Node: Node-01
Monitor: chassis
Alert ID: CriticalPSUFruOffAlert
Alerting Resource: xxxxxxx
Subsystem: Environment
Indication Time: Fri May 20 02:08:21 2022
Severity: Critical
Probable Cause: Loss_of_redundancy
Description: PSU1 is off. The nodes in this chassis are Node-01, Node-02.
Corrective Actions: 1. Check PSU1 and switch it on.
2. Refer to the Hardware specification guide for more information on the position of the power supply unit (PSU) and ways to check or replace it.
3. Contact support personnel if the alert persists.
- The following events are reported in the event log:
[Node-01: env_mgr: monitor.chassisPowerSupply.off:notice]: Chassis power supply 1 off.
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Temperature is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Current is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan1 Speed is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan1 Fault is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan2 Speed is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan2 Fault is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Pwr Out OK is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fault is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Over Temp is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Over Volt is Unreadable
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Over Curr is Unreadable
[Node-01: power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU1.
[Node-01: power_low_monitor: callhome.chassis.power:error]: Call home for CHASSIS POWER DEGRADED: Power Supply Status Critical: PSU1.
- The SP reports the PSU sensors as na in the system sensors output:
SP Node-01> system sensors
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
PSU1_Present | 0x0 | discrete | Present | na | na | na | na
PSU1_Temp | na | degrees C | na | 0.000 | 5.000 | 50.000 | 60.000
PSU1_Curr | na | Amps | na | na | na | na | na
PSU1_Fan1_Speed | na | RPM | na | 4500.000 | 4600.000 | na | na
PSU1_Fan1_Fault | na | discrete | na | na | na | na | na
PSU1_Fan2_Speed | na | RPM | na | 4500.000 | 4600.000 | na | na
PSU1_Fan2_Fault | na | discrete | na | na | na | na | na
PSU1_Status_OK | na | discrete | na | na | na | na | na
PSU1_Pwr_In_OK | 0x0 | discrete | Deasserted | na | na | na | na
PSU1_Pwr_Out_OK | na | discrete | na | na | na | na | na
PSU1_Fault | na | discrete | na | na | na | na | na
PSU1_Input_Type | na | discrete | na | na | na | na | na
PSU1_Over_Temp | na | discrete | na | na | na | na | na
PSU1_Over_Volt | na | discrete | na | na | na | na | na
PSU1_Over_Curr | na | discrete | na | na | na | na | na
- The issue persists even after the affected PSU is reseated.
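
The na pattern in the system sensors output can also be checked from a saved capture rather than by eye. The following is a minimal sketch, not a NetApp tool: it assumes the output was saved as plain text with pipe-separated columns laid out as in the example above, and the names parse_sensor_rows, unreadable_psu_sensors, and SAMPLE are hypothetical helpers introduced only for illustration.

```python
# Sketch only: parses a saved text capture of the SP "system sensors" table.
# Assumes pipe-separated columns exactly as in the example output above.

SAMPLE = """\
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
PSU1_Present | 0x0 | discrete | Present | na | na | na | na
PSU1_Temp | na | degrees C | na | 0.000 | 5.000 | 50.000 | 60.000
PSU1_Pwr_In_OK | 0x0 | discrete | Deasserted | na | na | na | na
PSU1_Fault | na | discrete | na | na | na | na | na
"""


def parse_sensor_rows(text):
    """Return one dict per sensor row, skipping the header and separator."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Data rows have exactly 8 columns; the dashed separator has no pipes.
        if len(cells) != 8 or cells[0] == "Sensor Name":
            continue
        rows.append(
            {
                "name": cells[0],
                "current": cells[1],
                "unit": cells[2],
                "status": cells[3],
            }
        )
    return rows


def unreadable_psu_sensors(text, psu="PSU1"):
    """List sensors of the given PSU whose current reading is 'na'."""
    prefix = psu + "_"
    return [
        row["name"]
        for row in parse_sensor_rows(text)
        if row["name"].startswith(prefix) and row["current"] == "na"
    ]


if __name__ == "__main__":
    for name in unreadable_psu_sensors(SAMPLE):
        print(name)
```

A PSU that shows as Present while PSU1_Pwr_In_OK is Deasserted and most of its other sensors read na matches the symptom described in this article. The script only summarizes the capture and does not diagnose the fault.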