L2 watchdog NMI causes a system reboot on A900
Applies to
A900
Issue
- An L2 watchdog NMI causes the system to reboot. The event log records the following:
Record 1068: Fri Jan 27 17:32:12.049970 2023 [IPMI.notice]: 0353 | 02 | EVT: ef02ffff | PCM_Status | Deassertion Event, "Fault"
Record 1069: Mon Jan 30 09:29:13.661977 2023 [IPMI.notice]: 0354 | 02 | EVT: 6fc824ff | System_Watchdog | Assertion Event, "Timer interrupt"
Record 1070: Mon Jan 30 09:29:14.091221 2023 [IPMI Event.critical]: NMI
Record 1071: Mon Jan 30 09:29:14.092453 2023 [IPMI.notice]: 0355 | 02 | EVT: 6f00ffff | CriticalInt | Assertion Event, "NMI/Diag Interrupt"
Record 1072: Mon Jan 30 09:29:14.853157 2023 [IPMI.notice]: 0356 | 02 | EVT: 6fc124ff | System_Watchdog | Assertion Event, "Hard reset"
Record 1073: Mon Jan 30 09:29:15.064376 2023 [IPMI Event.critical]: L2 watchdog timeout hard reset
Record 1074: Mon Jan 30 09:29:15.093415 2023 [IPMI Event.critical]: System reset
Record 1075: Mon Jan 30 09:29:15.095110 2023 [IPMI Event.critical]: L2 watchdog action completed
Record 1076: Mon Jan 30 09:29:15.094438 2023 [IPMI.notice]: 0357 | 02 | EVT: 0301ffff | SysReset | Assertion Event, "State Asserted"
Record 1077: Mon Jan 30 09:29:15.094843 2023 [IPMI.notice]: L2 to L1 is 1(s) 488(us)
- The system fails to boot. The following errors are displayed during boot:
BIOS Version: 18.6
Portions Copyright (C) 2014-2022 NetApp, Inc. All Rights Reserved.
Initializing System Memory ...
BIOS Version: 18.5
Portions Copyright (C) 2014-2022 NetApp, Inc. All Rights Reserved.
Note that the BIOS version switches from 18.6 to 18.5. Then:
*****************************************
*** Booted from Backup Firmware Image ***
*****************************************
ACPI RSDP Found at 0x6f7fe014
Abort AUTOBOOT
LOADER-B>
CPU Exception:Machine-Check Exception (#MC)
(DXE) CPU Context (X64):
RIP - 000000004FCAA735, CS - 0000000000000038, RFLAGS - 0000000000000206
RAX - 000000004FCAA760, RCX - 00000000000003FD, RDX - 00000000000003FD
RBX - 0000000045709018, RSP - 0000000048722DE8, RBP - 0000000000000000
RSI - 0000000000000046, RDI - 00000000000003FD
R8 - 0000000005B6F32C, R9 - 00000000435BE018, R10 - 0000000000000020
R11 - 00000000440D7018, R12 - 000000004FCAF85E, R13 - 000000004FCAF85E
R14 - 0000000048723040, R15 - 000000004FC4D697
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
GDTR - 000000005EF25C98 0000000000000047, LDTR - 0000000000000000
IDTR - 0000000046144018 0000000000000FFF, TR - 0000000000000000
CR0 - 0000000080000013, CR2 - 0000000000000000, CR3 - 0000000048401000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
Dumping Machine Check Registers:
CPU APIC ID 0x0
IA32_MC9_CTL 0x0000000FFFFFFFFF
IA32_MC9_STATUS 0xBE200000000C110A
IA32_MC9_ADDR 0x00000000000003C0
IA32_MC9_MISC 0x003B8F0000480086
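
As a reading aid for the machine-check register dump above, the following minimal sketch splits the IA32_MC9_STATUS value into the architectural flag bits that the Intel SDM defines for IA32_MCi_STATUS registers. The function name decode_mci_status is hypothetical and is not part of any NetApp tool. It decodes only bits 63 to 57 and the two 16-bit error-code fields, and it makes no attempt to interpret what the error codes mean on this platform.

```python
# Architectural flag bits of an IA32_MCi_STATUS register (Intel SDM layout).
MCI_STATUS_FLAGS = {
    "VAL": 63,    # register contains valid error information
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # error reporting was enabled for this bank
    "MISCV": 59,  # IA32_MCi_MISC holds additional information
    "ADDRV": 58,  # IA32_MCi_ADDR holds a valid address
    "PCC": 57,    # processor context may be corrupted
}


def decode_mci_status(value: int) -> dict:
    """Hypothetical helper: split a raw MCi_STATUS value into fields."""
    decoded = {name: bool((value >> bit) & 1)
               for name, bit in MCI_STATUS_FLAGS.items()}
    # Bits 31:16 carry the model-specific error code, bits 15:0 the MCA error code.
    decoded["model_specific_code"] = (value >> 16) & 0xFFFF
    decoded["mca_error_code"] = value & 0xFFFF
    return decoded


if __name__ == "__main__":
    # Value copied from the IA32_MC9_STATUS line of the console output.
    status = decode_mci_status(0xBE200000000C110A)
    for field, val in status.items():
        print(field, hex(val) if isinstance(val, int) and not isinstance(val, bool) else val)
```

For the value shown in the dump, VAL, UC, EN, MISCV, ADDRV, and PCC are set and OVER is clear, which is consistent with an uncorrected machine-check error being reported at boot.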