System fails to boot: Unable to recover the local database of Data Replication Module
Applies to
- All AFF and FAS platforms
- ONTAP 9
- ONTAP Select
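- After a power outage or controller relocation, or during an upgrade or other reboot
- In ONTAP Select, a failed disk was mistakenly unfailed before the underlying virtual disk file was repaired
Issue
- The node's console log shows: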
*******************************************************
* This is a serial console session. Output from this *
* session is mirrored on the SP console session.     *
*******************************************************
***********************
**  SYSTEM MESSAGES  **
***********************
Internal error: Cannot open corrupt replicated database. Automatic recovery
attempt has failed or is disabled. Check the event logs for details. This node
is not fully operational. Contact support personnel for the root volume recovery
procedures.
- The node might panic during boot:
Warning: previous shutdown was dirty, there is a possible loss of data.
May 28 00:43:33 [node1:wafl.root.content.changed:error]: Contents of the root volume 'vol0' might have changed. Verify that all recent configuration changes are still in effect.
PANIC : NVRAM contents are invalid...
PANIC: NVRAM contents are invalid... in SK process rc on release 9.10.1P5 (C) on Wed May 28 00:43:33 GMT 2025
- EMS messages on the node:
Mar 28 17:03:46 [NODE_B:rdb.recovery.failed:EMERGENCY]: Error: Unable to find a master. Unable to recover the local database of Data Replication Module: Management.
Mar 28 17:03:46 [NODE_B:spm.mgwd.process.exit:EMERGENCY]: Management Gateway (mgwd) subsystem with ID 1944 exited as a result of signal normal exit (0). The subsystem will attempt to restart.
- After the node boots, you can log in, but cluster commands do not return the expected output:
::> cluster show
Error: "show" is not a recognized command
::> set advanced
::*> cluster ring show
Error: "show" is not a recognized command
- The error message ROOT VOLUME NOT WORKING PROPERLY: RECOVERY REQUIRED is displayed during the boot/login phase.
- At the LOADER prompt, some bootarg values related to corrupt and/or recovery are set.
Example:
LOADER-B> printenv bootarg.rdb_corrupt
Variable Name        Value
-------------------- --------------------------------------------------
bootarg.rdb_corrupt  0500000000
LOADER-B> printenv bootarg.init.boot_recovery
Variable Name        Value
-------------------- --------------------------------------------------
bootarg.init.boot_recovery 80
Note 1: If a bootarg is not set, its value is displayed as undefined.
Note 2: The same values can also be checked in the KENV section of AutoSupport. Example:
bootarg.rdb_corrupt="5550055000"
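When many nodes or AutoSupport bundles need to be checked, the bootarg lookup described above can be scripted. The following is a minimal Python sketch, not a NetApp tool: the function name find_recovery_bootargs is hypothetical, and it assumes the input is plain text copied from LOADER printenv output or from the AutoSupport KENV section in the two forms shown in this article. It only reports whether the two bootargs are set; it does not interpret their values.

```python
import re

# The two bootargs named in this article. The helper below only reports
# whether they are set; what a given value means is not interpreted here.
BOOTARGS = ("bootarg.rdb_corrupt", "bootarg.init.boot_recovery")


def find_recovery_bootargs(text):
    """Return {bootarg: value} for the bootargs above that are set in `text`.

    Hypothetical helper. Accepts both forms shown in the article:
      printenv form:  bootarg.rdb_corrupt  0500000000
      KENV form:      bootarg.rdb_corrupt="5550055000"
    A value of "undefined" (bootarg not set) is skipped.
    """
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        for name in BOOTARGS:
            # Name, then either ="value" or whitespace-separated value.
            pattern = re.escape(name) + r'(?:\s*=\s*"?|\s+)([^"\s]+)"?$'
            match = re.match(pattern, line)
            if match and match.group(1) != "undefined":
                found[name] = match.group(1)
    return found


if __name__ == "__main__":
    kenv_line = 'bootarg.rdb_corrupt="5550055000"'
    for name, value in find_recovery_bootargs(kenv_line).items():
        print(f"{name} is set to {value}")
```

An empty result means neither bootarg was found as set in the supplied text; it does not by itself rule out the issue, because the other symptoms listed above should still be checked.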