PSUFruFanBadAlert reported in the system
Applies to
- ONTAP 9
- AFF A700
- Service Processor (SP)
Issue
- The PSU is reported as degraded, and the following events are logged on both nodes:
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Fan is Critical Low (0 RPM)
[Node-01: env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 InPower is Warning High (968 W)
[Node-01: cphmd: hm.alert.raised:alert]: Alert Id = PSUFruFanBadAlert , Alerting Resource = XXXXXXXXXXXXXX raised by monitor chassis
[Node-01: env_mgr: callhome.chassis.ps.degraded:error]: Call home for CHASSIS POWER SUPPLY DEGRADED: PS 1
[Node-01: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process cphm: PSUFruFanBadAlert[XXXXXXXXXXXXXX].
- The output of the system health alert show command displays the following:
Cluster1::> system health alert show
Node: Node-01
Alert ID: PSUFruFanBadAlert
Resource: XXXXXXXXXXXXXX
Severity: Major
Indication Time: Tue Feb 28 20:16:33 2023
Suppress: false
Acknowledge: false
Probable Cause: Power Supply Unit PSU1 FRU has a major fan problem.
The nodes in this chassis are Node-01.
Possible Effect: The power supply unit (PSU) might stop functioning if
the temperature increases.
Corrective Actions: 1. Check PSU1 FRU and the fans associated with it.
2. Refer to the Hardware specification guide for more information on the
position of the power supply unit (PSU)
and ways to check or replace it.
3. Contact support personnel if the alert persists.
- The sensors of the affected PSU show a critical status in the SP-LATEST-IPMI section of AutoSupport:
Sensor Reading:
PSU1_VIN | 0.000 | Volts | cr | na | 90.000 | 94.000 | 260.000 | 264.000 | na
PSU1_IIN | 0.000 | Amps | cr | na | 0.000 | 0.000 | 14.960 | 16.000 | na
PSU1_PIN | 0.000 | Watts | cr | na | 4.000 | 4.000 | 960.000 | 1020.000 | na
PSU1_FAN | 0.000 | RPM | cr | na | 768.000 | 1248.000 | na | na | na
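- The PSU reported in the alert above is flagged as faulty in the PLATFORM-SENSORS.XML section of the AutoSupport logs.
- The alert continues to be displayed even after the affected PSU is replaced.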
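
The sensor readings listed above can be scanned for critical entries with a short script. The following is a minimal Python sketch, not a NetApp tool. It assumes the pipe-separated columns follow the common ipmitool-style layout (name, reading, unit, status, then the lower and upper thresholds), which the article itself does not label. The names FIELDS, parse_sensor_line, and critical_sensors are hypothetical and used only for illustration.

```python
# Assumed column order (ipmitool-style listing; not stated in the article):
# name | reading | unit | status | LNR | LCR | LNC | UNC | UCR | UNR
FIELDS = ["name", "reading", "unit", "status",
          "lnr", "lcr", "lnc", "unc", "ucr", "unr"]

# Sample text copied from the SP-LATEST-IPMI excerpt in this article.
SAMPLE = """\
Sensor Reading:
PSU1_VIN | 0.000 | Volts | cr | na | 90.000 | 94.000 | 260.000 | 264.000 | na
PSU1_IIN | 0.000 | Amps | cr | na | 0.000 | 0.000 | 14.960 | 16.000 | na
PSU1_PIN | 0.000 | Watts | cr | na | 4.000 | 4.000 | 960.000 | 1020.000 | na
PSU1_FAN | 0.000 | RPM | cr | na | 768.000 | 1248.000 | na | na | na
"""


def parse_sensor_line(line):
    """Split one pipe-separated sensor line into a dict, or return None
    if the line does not have the expected number of columns."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def critical_sensors(text):
    """Return the parsed rows whose status column is 'cr' (critical)."""
    rows = (parse_sensor_line(line) for line in text.splitlines())
    return [row for row in rows if row and row["status"] == "cr"]


if __name__ == "__main__":
    for row in critical_sensors(SAMPLE):
        print(f'{row["name"]}: {row["reading"]} {row["unit"]} '
              f'(assumed lower critical threshold: {row["lcr"]})')
```

Run against the excerpt above, the script lists all four PSU1 sensors, since each one carries the cr status. Lines that do not match the expected column count, such as the "Sensor Reading:" header, are skipped.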