System disruption after the boot device temporarily cannot be read or written on multiple AFF and FAS platforms

Applies to

  • AFF A200 | FAS2620 | FAS2650
  • AFF A220 | AFF C190 | FAS2720 | FAS2750
  • AFF A300 | FAS8200
  • FAS8300
  • AFF A700 | FAS9000
  • AFF A900 | FAS9500
  • mSATA boot device (P/N X3320A)
  • Unable to boot the ONTAP operating system

Issue

A node whose boot device encounters this problem will panic or fail to boot ONTAP.

Panic strings

PANIC: thread (xxx) on cpu hung for 4001 milliseconds in process xxx

PANIC: thread (swi4: clock) on cpu 0 hung for 4001 milliseconds in process swi4: clock on release 9.5P1 (C) on Thu Jun 11 03:41:13 EDT 2020
 
PANIC: thread (md0.uzip) on cpu 0 hung for 4001 milliseconds in process md0.uzip on release 9.3P20 (C) on Tue Feb  2 22:28:58 JST 2021

Panic string: Process mgwd unresponsive for 620 seconds in process

PANIC: g_vfs_done(): rootfs.uzip read error - suspect boot device in process g_up on release 9.7P4 (C) on Tue Aug 11 02:55:33 EDT 2020
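
As an illustration only, the panic strings above could be matched with a few regular expressions when triaging collected panic messages. This is a minimal sketch, not a NetApp tool: the names `PANIC_PATTERNS` and `classify_panic` are hypothetical, and the patterns are derived solely from the examples listed above, so other variants may exist. A thread-hung panic can have other causes, so a match is a hint rather than a diagnosis.

```python
import re

# Hypothetical patterns built only from the example panic strings above.
# Thread name, CPU number, duration, and release are treated as variable.
PANIC_PATTERNS = {
    "thread_hung": re.compile(
        r"PANIC: thread \((?P<thread>[^)]+)\) on cpu (?P<cpu>\d+) hung for "
        r"(?P<ms>\d+) milliseconds in process"
    ),
    "mgwd_unresponsive": re.compile(
        r"Process mgwd unresponsive for (?P<seconds>\d+) seconds"
    ),
    "rootfs_read_error": re.compile(
        r"PANIC: g_vfs_done\(\): rootfs\.uzip read error - suspect boot device"
    ),
}


def classify_panic(line):
    """Return (label, captured fields) for the first matching pattern, else None."""
    for label, pattern in PANIC_PATTERNS.items():
        match = pattern.search(line)
        if match:
            return label, match.groupdict()
    return None


if __name__ == "__main__":
    sample = (
        "PANIC: thread (md0.uzip) on cpu 0 hung for 4001 milliseconds "
        "in process md0.uzip on release 9.3P20 (C)"
    )
    print(classify_panic(sample))
```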

EMS logs

Before the panic occurs, you might also see the following in the EMS logs:

 cam: cam.timeout.retry:notice]: CAM device driver I/O timeout. Details: CAM command timeout (retrying), Device ada0 (4 retries left). Device ID: SPG2153013S. Command: WRITE_DMA. ACB: ca 00 bb 88 88 40 00 00 00 00 70 00.
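
The EMS message above carries the device name, the remaining retry count, the device ID, and the failed command. A rough sketch of pulling those fields out of such a line follows; the function name `parse_cam_timeout` is hypothetical, and the pattern assumes the field order shown in the single example above.

```python
import re

# Assumed layout, taken from the one example message above:
# "...cam.timeout.retry:<severity>]: ... Device <dev> (<n> retries left).
#  Device ID: <id>. Command: <cmd>. ..."
CAM_TIMEOUT = re.compile(
    r"cam\.timeout\.retry:(?P<severity>\w+)\]: .*?"
    r"Device (?P<device>\w+) \((?P<retries>\d+) retr(?:y|ies) left\)\. "
    r"Device ID: (?P<device_id>\S+)\. "
    r"Command: (?P<command>\w+)\."
)


def parse_cam_timeout(line):
    """Return a dict of fields from a cam.timeout.retry message, or None."""
    match = CAM_TIMEOUT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["retries"] = int(fields["retries"])  # retry count as a number
    return fields


if __name__ == "__main__":
    sample = (
        " cam: cam.timeout.retry:notice]: CAM device driver I/O timeout. "
        "Details: CAM command timeout (retrying), Device ada0 (4 retries left). "
        "Device ID: SPG2153013S. Command: WRITE_DMA. "
        "ACB: ca 00 bb 88 88 40 00 00 00 00 70 00."
    )
    print(parse_cam_timeout(sample))
```

A retry count that keeps dropping for the same device across consecutive messages corresponds to the countdown seen in the console output during a failed boot.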
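
Console boot messages during boot failure

The following console messages might appear while ONTAP attempts to boot.

  • The node fails to boot, and the console log shows boot device errors: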

g_vfs_done():ada0s1[READ(offset=1754742784, length=8192)]error = 5
ada0: CAM_CMD_TIMEOUT, retrying (4 retries left)
ada0: CAM_CMD_TIMEOUT, retrying (3 retries left)
ada0: CAM_CMD_TIMEOUT, retrying (2 retries left)
ada0: CAM_CMD_TIMEOUT, retrying (1 retry left)
Unable to read directory
Retry #5 of 5: /sbin/fsck_msdosfs /dev/ada0s1
bootarg.init.verify_checksum="true"
** /dev/ada0s1
** Phase 1 - Read and Compare FATs
** Phase 2 - Check Cluster Chains
** Phase 3 - Checking Directories
Unable to read directory
********************************************************
* ALERT: Boot device file system corruption detected:  *
* FAT filesystem check exited with status   8          *
*                                                      *
* File system consistency check failed.                *
* The node will be halted.                             *
* Contact Support for assistance.                      *
********************************************************
Uptime: 17s
System halting...

  • Example of a failed reboot:

Could not load fat://boot0/X86_64/freebsd/image1/kernel:Device not found
ERROR: Error booting OS on: 'boot0' file: fat://boot0/X86_64/freebsd/image1/kernel (boot0,fat)
Could not load fat://boot0/X86_64/freebsd/image1/kernel:Device is open
ERROR: Error booting OS on: 'boot0' file: fat://boot0/X86_64/freebsd/image1/kernel (boot0,fat)

ERROR: Failure to build endpoint environment variables. Autoboot will be disabled.

  • A core might not be saved if the platform is a nosavecore platform:

What are savecore and nosavecore AFF and FAS platforms?
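
The console signatures listed in this section can be counted in a captured console log to get a quick picture of a failed boot. The sketch below is illustrative only: `CONSOLE_SIGNATURES` and `scan_console` are hypothetical names, and the patterns cover only the message forms quoted in this article.

```python
import re

# Hypothetical signature set, limited to the console messages quoted above.
CONSOLE_SIGNATURES = {
    "geom_io_error": re.compile(r"g_vfs_done\(\):\S+\[.*\]\s*error = \d+"),
    "cam_timeout": re.compile(r"CAM_CMD_TIMEOUT"),
    "fsck_failed": re.compile(r"FAT filesystem check exited with status\s+\d+"),
    "kernel_load_failed": re.compile(r"Could not load fat://\S+"),
}


def scan_console(text):
    """Count how many lines of a console capture match each signature."""
    counts = {name: 0 for name in CONSOLE_SIGNATURES}
    for line in text.splitlines():
        for name, pattern in CONSOLE_SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    capture = "\n".join([
        "g_vfs_done():ada0s1[READ(offset=1754742784, length=8192)]error = 5",
        "ada0: CAM_CMD_TIMEOUT, retrying (4 retries left)",
        "ada0: CAM_CMD_TIMEOUT, retrying (3 retries left)",
        "* FAT filesystem check exited with status   8          *",
    ])
    print(scan_console(capture))
```

Nonzero counts only indicate that the capture resembles the examples in this article; they do not replace analysis by Support.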
