ONTAP node panics and fails to boot with the error: "Panic : Process VIFMgr Unresponsive in Process NodeWatchdog on Release 9.1P12"
Applies to
- ONTAP 9.x
- Data ONTAP 8.x
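- Data ONTAP 7-Mode
Issue
- The node panics because VIFMgr does not respond before the watchdog timeout expires, and the node cannot recover because MDB recovery fails on reboot: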
Panic String: PANIC : Process vifmgr unresponsive for 629 seconds version: 9.1P12
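- The root volume (vol0) filled up because of a high snapshot delta and high snapshot space utilization on vol0, caused by a rolling packet trace that ran for 2 weeks: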
Mon Mar 30 08:28:37 CDT [nodename: rshd_0: kern.cli.cmd:debug]: Command-line input: The command is 'pktt'. The full command line is 'pktt start a0a-10 -d /etc/crash -m 9018 -b 8m -s 2g -r 12'.
- Before the panic, the console log shows that VIFMgr and VLDB crashed and could not restart:
Apr 13 00:49:45 [nodename:spm.vldb.process.exit:EMERGENCY]: Volume Location Database(VLDB) subsystem with ID 34409 exited as a result of signal normal exit (1). The subsystem will attempt to restart.
Apr 13 00:49:47 [nodename:spm.vifmgr.process.exit:EMERGENCY]: Logical Interface Manager(VifMgr) with ID 34415 aborted as a result of signal normal exit (1). The subsystem will attempt to restart.
- When the node reboots, it cannot recover the MDB because vol0 is out of space:
Apr 13 02:54:46 [nodename:callhome.mdb.recovery.unsuccessful:EMERGENCY]: Call home for MDB RECOVERY UNSUCCESSFUL FOR THE notifyd WARNING.
ln: /var/zoneinfo/zoneinfo: No space left on device
root: Unable to ln /mroot/etc/zoneinfo to /var/zoneinfo - error code(1)
/usr/bin/plxcoeff_log: cannot create /mroot/etc/log/plxcoeff/plxcoeff.log.tmp: No space left on device
stat: /mroot/etc/log/plxcoeff/plxcoeff.log.tmp: stat: No such file or directory
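The symptoms above can be checked against a saved console log with a small script. The following is a minimal sketch, not a NetApp tool: the function name `scan_console_log` and both regular expressions are illustrative assumptions, derived only from the log excerpts quoted in this article. It looks for the three EMS events shown above, counts "No space left on device" messages, and extracts the process name and timeout from the panic string.

```python
import re

# EMS event names copied from the console excerpts in this article.
EVENTS_OF_INTEREST = (
    "spm.vldb.process.exit",
    "spm.vifmgr.process.exit",
    "callhome.mdb.recovery.unsuccessful",
)

# Assumed layout "[nodename:event.name:SEVERITY]", as seen in the excerpts.
EMS_PATTERN = re.compile(
    r"\[(?P<node>[^:\]]+):(?P<event>[\w.]+):(?P<severity>\w+)\]"
)
# Assumed panic string layout, e.g. "Process vifmgr unresponsive for 629 seconds".
PANIC_PATTERN = re.compile(
    r"Process (?P<proc>\w+) unresponsive for (?P<secs>\d+) seconds",
    re.IGNORECASE,
)


def scan_console_log(lines):
    """Collect the symptoms described in this article from console log lines."""
    findings = {"events": [], "no_space": 0, "panic": None}
    for line in lines:
        ems = EMS_PATTERN.search(line)
        if ems and ems.group("event") in EVENTS_OF_INTEREST:
            findings["events"].append(ems.group("event"))
        if "No space left on device" in line:
            findings["no_space"] += 1
        panic = PANIC_PATTERN.search(line)
        if panic:
            findings["panic"] = (panic.group("proc"), int(panic.group("secs")))
    return findings


if __name__ == "__main__":
    # Sample lines shortened from the excerpts above.
    sample = [
        "Panic String: PANIC : Process vifmgr unresponsive for 629 seconds version: 9.1P12",
        "Apr 13 00:49:45 [nodename:spm.vldb.process.exit:EMERGENCY]: Volume Location Database(VLDB) subsystem exited",
        "Apr 13 00:49:47 [nodename:spm.vifmgr.process.exit:EMERGENCY]: Logical Interface Manager(VifMgr) aborted",
        "ln: /var/zoneinfo/zoneinfo: No space left on device",
    ]
    print(scan_console_log(sample))
```

If all three indicators appear together (the VLDB and VifMgr exit events, the unsuccessful MDB recovery event, and out-of-space errors under /mroot or /var), the log is consistent with the full-root-volume scenario described here. The script only matches text and does not confirm the root cause on its own.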