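Multiple environmental sensor read errors reported, node does not boot
Applies to
Issue
- A newly deployed controller powers on and reports multiple errors, specifically fan sensor readings: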
callhome.c.fan.fru.fault: Call home for CHASSIS FAN FRU FAILED: Fan2_1
monitor.globalStatus.critical: Multiple fans has failed: SysFan4 F2, SysFan3 F1, SysFan2 F2, SysFan2 F1, SysFan1 F2, SysFan1 F1.
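- Rebooting the node and reseating the motherboard does not resolve the issue.
- After the motherboard is replaced, other environmental sensor errors appear:
- An NVBattery error is observed: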
WARNING: The battery is experiencing a critical failure:
- Internal error. Failed to communicate with the Environment Manager
Without a working battery, the system cannot retain data
during a power outage, which can result in data loss.
Power down the system and verify that the battery is
properly installed.
- Sensor errors and I2C bus stuck errors are observed during the boot sequence:
[Node01:netif.sfpEventErrorCode:error]: Unsupported or faulty transceiver or cable in port e0h. Error :Bus stuck(I2C or data shorted).
[Node01:netif.sfpEventErrorCode:error]: Unsupported or faulty transceiver or cable in port e0h. Error :Bus stuck(I2C or data shorted).
[Node01:monitor.power.unreadable:error]: A power sensor PVCCIN CPU0 in the controller module is not readable.
[Node01:monitor.power.unreadable:error]: A power sensor PVCCIN CPU1 in the controller module is not readable.
[Node01:monitor.power.unreadable:error]: A power sensor PVDDQ ABC in the controller module is not readable.
[Node01:monitor.power.unreadable:error]: A power sensor PVDDQ DEF in the controller module is not readable.
[Node01:monitor.power.unreadable:error]: A power sensor PVDDQ GHI in the controller module is not readable.
[Node01:monitor.power.unreadable:error]: A power sensor PVDDQ KLM in the controller module is not readable.
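- After the BMC is upgraded to the latest version (13.10P1 at the time this KB was created), the node is able to boot but fails again shortly afterward.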
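
When many of these messages scroll past on the console, it can help to group them by event name to see which sensors are affected. The following is a minimal sketch, not a vendor tool: it assumes console lines follow the `[node:event:severity]: message` shape shown above, and the names `EMS_LINE`, `summarize_ems`, and `SAMPLE` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the excerpts above:
#   [Node01:monitor.power.unreadable:error]: <message>
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:\]]+):(?P<event>[^:\]]+):(?P<severity>[^:\]]+)\]:\s*(?P<message>.*)$"
)


def summarize_ems(lines):
    """Group messages by (node, event, severity), skipping exact repeats."""
    summary = defaultdict(list)
    for line in lines:
        match = EMS_LINE.match(line.strip())
        if match is None:
            continue  # not an EMS-style line (for example, free-form console text)
        key = (match["node"], match["event"], match["severity"])
        if match["message"] not in summary[key]:
            summary[key].append(match["message"])
    return dict(summary)


# Sample input copied from the log excerpts in this article.
SAMPLE = [
    "[Node01:netif.sfpEventErrorCode:error]: Unsupported or faulty transceiver or cable in port e0h. Error :Bus stuck(I2C or data shorted).",
    "[Node01:netif.sfpEventErrorCode:error]: Unsupported or faulty transceiver or cable in port e0h. Error :Bus stuck(I2C or data shorted).",
    "[Node01:monitor.power.unreadable:error]: A power sensor PVCCIN CPU0 in the controller module is not readable.",
    "[Node01:monitor.power.unreadable:error]: A power sensor PVCCIN CPU1 in the controller module is not readable.",
    "WARNING: The battery is experiencing a critical failure:",
]

if __name__ == "__main__":
    for (node, event, severity), messages in summarize_ems(SAMPLE).items():
        print(f"{node} {event} [{severity}]: {len(messages)} distinct message(s)")
        for message in messages:
            print(f"  {message}")
```

Grouping this way makes it easier to see that the failures span several unrelated sensors (fans, battery, SFP, power rails) rather than a single component.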