Chassis power shows a critical status because of an SP hang
Applies to
- AFF A700
- FAS8200
- chassisPower.degraded:alert
Issue
EMS shows that a PSU inside the chassis cannot be read and that, at the same time, the SP restarts because its heartbeat was lost.
[env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Temperature is Unreadable
[power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU1.
[spsm_listener: sp.heartbeat.stopped:error]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 20 seconds.
[spsm_listener: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED
[spsm_listener: sp.heartbeat.resumed:info]: Received IPMI heartbeat from the Service Processor (SP).
- The system might return to normal after a short time.
spmgmt.driver.timeout: The software driver for the Service Processor (SP) detected a problem: Unable to update SP network information at this time.
sp.heartbeat.resumed: Received IPMI heartbeat from the Service Processor (SP).
splog.running.normally: Process splogd is operating normally.
INFORMATIONAL monitor.chassisPowerSupplies.ok: Chassis power supplies OK.
NOTICE monitor.globalStatus.ok: The system's global status is normal.
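The correlation described above (a chassis power alert that coincides with a lost SP heartbeat, followed by recovery) can be checked against collected EMS lines with a short script. This is only a minimal sketch: the function names `parse_ems`, `sp_hang_suspected`, and `heartbeat_recovered` are hypothetical helpers, not part of ONTAP, and the regular expression assumes the bracketed `[source: event:severity]: message` layout seen in the log excerpts above.

```python
import re

# Assumed layout, taken from the excerpts above: "[source: event.name:severity]: message"
EMS_RE = re.compile(
    r"\[(?P<source>[\w.]+): (?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)"
)

# Sample lines copied from the EMS excerpt in this article.
SAMPLE = [
    "[env_mgr: monitor.chassisPowerSupply.degraded:notice]: Chassis power supply 1 is degraded: PSU1 Temperature is Unreadable",
    "[power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU1.",
    "[spsm_listener: sp.heartbeat.stopped:error]: Have not received a IPMI heartbeat from the Service Processor (SP) in last 20 seconds.",
    "[spsm_listener: callhome.sp.hbt.missed:notice]: Call home for SP HBT MISSED",
    "[spsm_listener: sp.heartbeat.resumed:info]: Received IPMI heartbeat from the Service Processor (SP).",
]


def parse_ems(lines):
    """Return (event, severity, message) tuples for lines in the bracketed format."""
    events = []
    for line in lines:
        match = EMS_RE.search(line)
        if match:
            events.append(
                (match.group("event"), match.group("severity"), match.group("message"))
            )
    return events


def sp_hang_suspected(events):
    """True when a chassis power alert and a stopped SP heartbeat both appear."""
    names = [event for event, _, _ in events]
    return "monitor.chassisPower.degraded" in names and "sp.heartbeat.stopped" in names


def heartbeat_recovered(events):
    """True when sp.heartbeat.resumed is logged after the first sp.heartbeat.stopped."""
    names = [event for event, _, _ in events]
    if "sp.heartbeat.stopped" not in names:
        return False
    stopped_at = names.index("sp.heartbeat.stopped")
    return "sp.heartbeat.resumed" in names[stopped_at + 1:]


if __name__ == "__main__":
    events = parse_ems(SAMPLE)
    print("SP hang suspected:", sp_hang_suspected(events))
    print("Heartbeat recovered:", heartbeat_recovered(events))
```

The script does not compare timestamps, because the excerpts above do not include them; it relies only on the order of the lines it is given.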
- After the SP restarts, SP-LATEST-IPMI shows that the PSU status has recovered, but multiple sensors report an abnormal status.
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | na | degrees C | na | na | na | -5.000 | 0.000
In_Flow_Temp | -55.000 | degrees C | cr | 0.000 | 10.000 | 75.000 | 80.000
CPU_VCC | 0.010 | Volts | cr | 0.708 | 0.747 | 1.348 | 1.426
CPU_1.05V | 0.010 | Volts | cr | 0.892 | 0.941 | 1.154 | 1.203
CPU_VTT | 0.010 | Volts | cr | 0.931 | 0.989 | 1.213 | 1.261
LM56_Temp | na | degrees C | na | 0.000 | 10.000 | 72.000 | 77.000
CPU_1.5V | 0.010 | Volts | cr | 1.271 | 1.348 | 1.649 | 1.727
Bat_1.5V | 1.756 | Volts | cr | 1.280 | 1.348 | 1.649 | 1.727
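To pick the abnormal sensors out of a longer SP-LATEST-IPMI listing, the table can be parsed by its pipe-delimited columns. The following is a rough sketch under the assumption that the listing keeps the column layout shown above; `parse_sensor_table` and `sensors_with_status` are hypothetical helper names, and the sample is a shortened copy of the table in this article.

```python
# A shortened copy of the sensor table above, used as sample input.
SAMPLE_TABLE = """\
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | na | degrees C | na | na | na | -5.000 | 0.000
In_Flow_Temp | -55.000 | degrees C | cr | 0.000 | 10.000 | 75.000 | 80.000
CPU_VCC | 0.010 | Volts | cr | 0.708 | 0.747 | 1.348 | 1.426
"""


def parse_sensor_table(text):
    """Parse the pipe-delimited listing into one dict per sensor row."""
    header = None
    rows = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # skips blank lines and the dashed separator row
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells  # the first pipe-delimited line holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def sensors_with_status(rows, status):
    """Return the names of sensors whose Status column equals the given value."""
    return [row["Sensor Name"] for row in rows if row["Status"] == status]


if __name__ == "__main__":
    rows = parse_sensor_table(SAMPLE_TABLE)
    print("Critical (cr):", sensors_with_status(rows, "cr"))
    print("Unreadable (na):", sensors_with_status(rows, "na"))
```

Sensors marked `cr` and `na` are reported separately because only the statuses that appear in the table above are known here; how other status codes should be treated is left open.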