
Node goes offline with "hung_task" and "hung_task_timeout_secs" messages due to a failed NVRAM card

Category:
element-software
Specialty:
solidfire
Applies to

SolidFire AFA:SF19210

Issue

  • Before the node went offline, sf-master.info showed the following:

2023-04-29T18:44:47.632229Z SFALPSF08 master-1[26751]: [APP-5] [Leader] 28567 CMIscsiConnectMo serviceshared/LeaderCoordinator.cpp:618:OnClusterMasterConnectCallback|Full vote, based on connection states shouldVote=1 stateVote=1 sequenceNumber=143 nodesWithWorkingEAContainers={57,72,86,126,154,155,185,199}
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@

  • dmesg -T shows the "hung_task_timeout_secs" message for a task on nvme0n1:

crash> dmesg -T
[Sat Apr 29 18:49:04 UTC 2023] INFO: task jbd2/nvme0n1-8:26613 blocked for more than 120 seconds.
[Sat Apr 29 18:49:04 UTC 2023] Tainted: G O 4.19.37-solidfire8 #1
[Sat Apr 29 18:49:04 UTC 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

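When a dmesg capture is long, the hung-task reports can be picked out with a short script. The following is a minimal sketch, not a NetApp tool: the function name `find_hung_tasks` and the `sample` string are illustrative assumptions, and the pattern is modeled only on the "blocked for more than N seconds" line shown above.

```python
import re

# Pattern modeled on the kernel's hung-task report line shown in the dmesg output.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(dmesg_text):
    """Return a (task, pid, seconds) tuple for each hung-task report found."""
    return [
        (m.group("task"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(dmesg_text)
    ]


# Hypothetical input: one line copied from the dmesg output in this article.
sample = (
    "[Sat Apr 29 18:49:04 UTC 2023] INFO: task jbd2/nvme0n1-8:26613 "
    "blocked for more than 120 seconds.\n"
)

hits = find_hung_tasks(sample)
print(hits)  # [('jbd2/nvme0n1-8', 26613, 120)]
```

The task name identifies the blocked kernel thread; here it is the jbd2 journal thread for the file system on nvme0n1.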
  • Multiple core dumps were generated after the crash:

-rw-rw-rw- 1 dexterap engr 76763717096 Apr 29 12:20 dump.202304291845
-rw-rw-rw- 1 dexterap engr 776107259 Apr 29 12:32 dump.202304291928

  • The core files show multiple kernel panics involving the NVRAM card "nvme0n1":

KERNEL: /sf_debug/12.3.2.3/lib64/modules/4.19.37-solidfire8/vmlinux-ember-x86_64-4.19.37-solidfire8
DUMPFILE: dump.202304291845 [PARTIAL DUMP]
CPUS: 56
DATE: Sat Apr 29 18:45:09 UTC 2023
UPTIME: 380 days, 21:16:56
LOAD AVERAGE: 3.68, 3.95, 4.22
TASKS: 3273
NODENAME: QALPOGSF08
RELEASE: 4.19.37-solidfire8
VERSION: #1 SMP Mon Aug 17 14:34:57 UTC 2020
MACHINE: x86_64 (2600 Mhz)
MEMORY: 383.9 GB
PANIC: "Kernel panic - not syncing: hung_task: blocked tasks"
PID: 299
COMMAND: "khungtaskd"
TASK: ffff8f9c77b71d80 [THREAD_INFO: ffff8f9c77b71d80]
CPU: 22
STATE: TASK_RUNNING (PANIC)

[32908851.679379] INFO: task jbd2/nvme0n1-8:26613 blocked for more than 120 seconds.
[32908852.259911] Kernel panic - not syncing: hung_task: blocked tasks
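The two bracketed kernel timestamps in the crash output are seconds since boot, so they can be cross-checked against the UPTIME field and against each other. The sketch below is illustrative arithmetic only; the variable names are assumptions, and the values are copied from the output above.

```python
from datetime import timedelta

# Seconds-since-boot timestamps copied from the crash output in this article.
hung_report = 32908851.679379  # "[32908851.679379] INFO: task ... blocked for more than 120 seconds."
panic = 32908852.259911        # "[32908852.259911] Kernel panic - not syncing: hung_task: blocked tasks"

# Convert the hung-task timestamp to days and hh:mm:ss for comparison with UPTIME.
uptime_at_report = timedelta(seconds=int(hung_report))
print(uptime_at_report)  # 380 days, 21:20:51

# Delay between the hung-task report and the panic, in seconds.
delay = round(panic - hung_report, 6)
print(delay)  # 0.580532
```

The converted value is within a few minutes of the reported UPTIME of 380 days, 21:16:56, and the panic follows the hung-task report by about 0.58 seconds. That short gap is what one would expect if the kernel.hung_task_panic sysctl is enabled on the node; the article does not state that setting, so treat it as an assumption.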

 

