Repeated reports of CPU0_Error "State Asserted" and "State Deasserted"
Applies to
- FAS8200
- AFF A300
Issue
- CPU0_Error "State Asserted" and "State Deasserted" events are reported repeatedly in the Service Processor (SP) events all output.
  The attention LED on the affected node may be lit.
Record 1532: Fri Mar 26 03:45:32 2021 [IPMI.notice]: a403 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
Record 1533: Fri Mar 26 03:45:39 2021 [IPMI.notice]: a503 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
Record 1534: Fri Mar 26 03:47:32 2021 [IPMI.notice]: a603 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
Record 1535: Fri Mar 26 03:47:39 2021 [IPMI.notice]: a703 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
Record 1536: Fri Mar 26 03:47:46 2021 [IPMI.notice]: a803 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
Record 1537: Fri Mar 26 03:47:53 2021 [IPMI.notice]: a903 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
Record 1538: Fri Mar 26 03:48:07 2021 [IPMI.notice]: aa03 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
Record 1539: Fri Mar 26 03:48:14 2021 [IPMI.notice]: ab03 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
Record 1540: Fri Mar 26 03:48:28 2021 [IPMI.notice]: ac03 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"
Record 1541: Fri Mar 26 03:48:42 2021 [IPMI.notice]: ad03 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
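
To confirm that the sensor is toggling rather than logging a single event, the records above can be tallied per state. The following is a minimal sketch, assuming the SP events all output has been saved as text. The regular expression is inferred only from the record lines shown here and is not a documented format, and the names RECORD_RE, SAMPLE_EVENTS, and count_sensor_states are hypothetical.

```python
import re
from collections import Counter

# Field layout inferred from the excerpt above (an assumption, not a spec):
# Record <n>: <timestamp> [IPMI.<level>]: <id> | <type> | EVT: <hex> | <sensor> | Assertion Event, "<state>"
RECORD_RE = re.compile(
    r'^Record\s+(?P<id>\d+):\s+(?P<ts>.+?)\s+\[IPMI\.\w+\]:.*\|\s*'
    r'(?P<sensor>\S+)\s*\|\s*Assertion Event,\s*"(?P<state>[^"]+)"\s*$'
)

# First four records copied from the log excerpt above.
SAMPLE_EVENTS = [
    'Record 1532: Fri Mar 26 03:45:32 2021 [IPMI.notice]: a403 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"',
    'Record 1533: Fri Mar 26 03:45:39 2021 [IPMI.notice]: a503 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"',
    'Record 1534: Fri Mar 26 03:47:32 2021 [IPMI.notice]: a603 | 02 | EVT: 0300ffff | CPU0_Error | Assertion Event, "State Deasserted"',
    'Record 1535: Fri Mar 26 03:47:39 2021 [IPMI.notice]: a703 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"',
]


def count_sensor_states(lines, sensor="CPU0_Error"):
    """Count how often each state string appears for one sensor."""
    counts = Counter()
    for line in lines:
        match = RECORD_RE.match(line.strip())
        # Lines that do not follow the assumed layout (for example OEM
        # records) simply do not match and are skipped.
        if match and match.group("sensor") == sensor:
            counts[match.group("state")] += 1
    return counts


if __name__ == "__main__":
    for state, total in count_sensor_states(SAMPLE_EVENTS).items():
        print(f"{state}: {total}")
```

Roughly equal counts for both states within a short time window match the repeated reporting described in this article.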
- The issue persists after an SP reboot.
Record 1695: Thu Jan 1 00:00:41 1970 [IPMI.notice]: bf03 | c0 | OEM: ffff70005000 | ManufId: 150300 | SP Reset Externally
Record 1696: Thu Jan 1 00:00:41 1970 [IPMI.notice]: c003 | c0 | OEM: fcff70000000 | ManufId: 150300 | POS Register: Unexpected Reset
Record 1697: Thu Jan 1 00:00:50 1970 [IPMI.notice]: c103 | 02 | EVT: 0301ffff | CPU0_Error | Assertion Event, "State Asserted"
Record 1698: Thu Jan 1 00:00:55 1970 [IPMI.notice]: c203 | 02 | EVT: 0300ffff | Fan_Override | Assertion Event, "State Deasserted"
- In the system node environment sensors show -state !normal output, CPU0 Error is in the fault state.
Cluster::> sensors show -state !normal
(system node environment sensors show)
Node Sensor                 State  Value/Units Crit-Low Warn-Low Warn-Hi Crit-Hi
---- --------------------- ------ ----------- -------- -------- ------- -------
node-1
     CPU0 Error            fault
                                  ERROR
- From the SP, the system sensors output shows CPU0_Error in the Asserted state.
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | -60.000 | degrees C | ok | na | na | -10.000 | 0.000
In_Flow_Temp | 21.000 | degrees C | ok | 0.000 | 5.000 | 50.000 | 55.000
Out_Flow_Temp | 34.000 | degrees C | ok | 0.000 | 5.000 | 65.000 | 75.000
PCI_Slot_Temp | 30.000 | degrees C | ok | 0.000 | 5.000 | 60.000 | 70.000
Smart_Bat_Temp | 28.000 | degrees C | ok | 0.000 | 5.000 | 60.000 | 70.000
CPU0_Error | 0x0 | discrete | Asserted | na | na | na | na
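
The check on the SP sensor table can also be scripted. The following is a minimal sketch, assuming the system sensors output has been captured as text with the pipe-separated, eight-column layout shown above. The names SAMPLE_TABLE, parse_sensor_table, and asserted_sensors are hypothetical, and the sample holds only three rows copied from the table.

```python
# Header, separator, and three data rows copied from the output above.
SAMPLE_TABLE = """\
Sensor Name | Current | Unit | Status | LCR | LNC | UNC | UCR
-----------------+------------+------------+------------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | -60.000 | degrees C | ok | na | na | -10.000 | 0.000
In_Flow_Temp | 21.000 | degrees C | ok | 0.000 | 5.000 | 50.000 | 55.000
CPU0_Error | 0x0 | discrete | Asserted | na | na | na | na
"""


def parse_sensor_table(text):
    """Turn pipe-separated sensor rows into dictionaries.

    The header row and the dashed separator row are skipped: the separator
    contains no '|' characters, so it never splits into eight cells.
    """
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != 8 or cells[0] in ("", "Sensor Name"):
            continue
        rows.append(
            {
                "name": cells[0],
                "current": cells[1],
                "unit": cells[2],
                "status": cells[3],
            }
        )
    return rows


def asserted_sensors(rows):
    """Return the names of sensors whose status column reads Asserted."""
    return [row["name"] for row in rows if row["status"].lower() == "asserted"]


if __name__ == "__main__":
    print(asserted_sensors(parse_sensor_table(SAMPLE_TABLE)))
```

Matching on the status column rather than on the Current value is a deliberate choice here, because the CPU0_Error row above reports Current as 0x0 while its status is Asserted.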