Takeover and panic with the message "PANIC: Uncorrectable Machine Check Error at CPU6"
Applies to
- ONTAP 9
Issue
The node panics and displays the following message.
Example:
PANIC : Uncorrectable Machine Check Error at CPU6. BDWL_DCU Error: STATUS<0xbf80000000000114>(Val,UnCor,Enable,MiscV,AddrV,PCC,ExtErr(0),ErrCode(Read,Data,L0))ADDR<0x000000006a5c09c0>((0x6a5c09c0))MISC<0x0000000000000086>((0x86)).
version: 9.7P6: Tue Jul 28 00:08:29 EDT 2020
conf : x86_64.optimize
cpuid = 6
KDB: stack backtrace:
vpanic() at vpanic+0x70d/frame 0xfffffe02590da790
panic() at panic+0x43/frame 0xfffffe02590da7f0
ntap_gdb_on_module() at ntap_gdb_on_module+0x870/frame 0xfffffe02590daa80
mca_trap_handler() at mca_trap_handler+0x612/frame 0xfffffe02590daf00
mca_intr() at mca_intr+0x34/frame 0xfffffe02590daf20
Xmchk() at Xmchk+0x11f/frame 0xfffffe0658cffa20
recursive PANIC: page fault (supervisor read data, page not present) on VA 0xfffffe0658cffa28 cs:rip 0x20:0xffffffff809f8600 rflags 0x10046
KDB: stack backtrace:
vpanic() at vpanic+0x6ce/frame 0xfffffe02590da270
panic() at panic+0x43/frame 0xfffffe02590da2d0
amd64_syscall() at amd64_syscall+0x112e/frame 0xfffffe02590da340
trap() at trap+0xb03/frame 0xfffffe02590da3b0
trap() at trap+0x1b5/frame 0xfffffe02590da4c0
calltrap() at calltrap+0x8/frame 0xfffffe02590da4c0
--- trap 0xc, rip = 0xffffffff809f8600, rsp = 0xfffffe02590da590, rbp = 0xfffffe02590da620 ---
db_read_bytes() at db_read_bytes+0x80/frame 0xfffffe02590da620
db_get_value() at db_get_value+0x33/frame 0xfffffe02590da660
db_backtrace() at db_backtrace+0x5e8/frame 0xfffffe02590da710
vpanic() at vpanic+0x70d/frame 0xfffffe02590da790
panic() at panic+0x43/frame 0xfffffe02590da7f0
ntap_gdb_on_module() at ntap_gdb_on_module+0x870/frame 0xfffffe02590daa80
mca_trap_handler() at mca_trap_handler+0x612/frame 0xfffffe02590daf00
mca_intr() at mca_intr+0x34/frame 0xfffffe02590daf20
Xmchk() at Xmchk+0x11f/frame 0xfffffe0658cffa20
multiple recursive panics - rebooting.
cpuid = 6
Uptime: 162d4h55m30s
ahcich0: AHCI reset done: devices=00000001
System rebooting...
Requesting SP to power cycle the filer to attempt to clear DRAM UECC
================ Log #2 end time Thu Feb 4 12:35:40 2021
================ Log #3 start time Thu Feb 4 12:35:42 2021
****************************
* Power Fail - Please Wait *
****************************
CPU cores enabled = 1
DIMM speed = 2133.33 MHz
SATA (AHCI) Device: ATP SATA III mSATA AF120GSMHI-NT3
Boot Loader version 6.0.8
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2018 NetApp, Inc. All Rights Reserved.
***********************************
* Pending cached data is detected *
* Starting the process to save to *
* media... *
***********************************
Destage is in progress, please wait
..........
================ Log #3 end time Thu Feb 4 12:35:57 2021
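The STATUS value in the panic string above can be decoded by hand. The following is a minimal sketch, not a NetApp tool. It assumes the value follows the Intel IA32_MCi_STATUS layout (flag bits 63 to 57, and the compound "cache hierarchy" error code format 000F 0001 RRRR TTLL in the low 16 bits). The function name `decode_mci_status` and the table names are hypothetical, chosen only for illustration, and only the flags that the console message itself prints are decoded.

```python
# Sketch: decode the flag bits and cache-hierarchy error code of a
# machine-check STATUS value. Layout assumed from the Intel SDM; names
# below are illustrative, not part of ONTAP.

# (bit position, label as printed in the console message)
FLAG_BITS = [
    (63, "Val"),     # register contents are valid
    (62, "Over"),    # an earlier error was overwritten
    (61, "UnCor"),   # error was not corrected by hardware
    (60, "Enable"),  # error reporting was enabled
    (59, "MiscV"),   # MISC register is valid
    (58, "AddrV"),   # ADDR register is valid
    (57, "PCC"),     # processor context corrupt
]

REQUEST = {0: "Err", 1: "Read", 2: "Write", 3: "DRead", 4: "DWrite",
           5: "IRead", 6: "Prefetch", 7: "Evict", 8: "Snoop"}
TRANSACTION = {0: "Instruction", 1: "Data", 2: "Generic"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_mci_status(status):
    """Return the set flags and, if applicable, the cache error triple."""
    flags = [name for bit, name in FLAG_BITS if (status >> bit) & 1]

    code = status & 0xFFFF
    cache = None
    # Cache hierarchy errors match 000F 0001 RRRR TTLL; bit 12 (F) is ignored.
    if (code & 0xEF00) == 0x0100:
        rrrr = (code >> 4) & 0xF
        tt = (code >> 2) & 0x3
        ll = code & 0x3
        cache = (REQUEST.get(rrrr, "Unknown"),
                 TRANSACTION.get(tt, "Unknown"),
                 LEVEL[ll])
    return {"flags": flags, "cache_error": cache}


if __name__ == "__main__":
    # STATUS value taken from the panic message in this article.
    print(decode_mci_status(0xBF80000000000114))
```

For the value in this article, the sketch reports the flags Val, UnCor, Enable, MiscV, AddrV and PCC, and the error triple (Read, Data, L0), which agrees with the decoded text inside the panic string.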