Service disruption caused by Nblade.vldb.Timeout
Applies to
- ONTAP 9
Issue
An unexpected LIF failover occurs, after which the system logs
nodewatchdog.svc.rpc.noresp, Nblade.vldb.Timeout, and ucore.panicString
messages. Data services are unavailable. Example:
clustermgt::*> network interface show -is-home !true
            Logical    Status     Network            Current       Current Is
Vserver     Interface  Admin/Oper Address/Mask       Node          Port    Home
----------- ---------- ---------- ------------------ ------------- ------- ----
cluster
            cluster-1  up/up      xxx.xxx.xxx.xxx/24 clustermgt-03 a0a     false
            cluster-2  up/up      xxx.xxx.xxx.xxx/24 clustermgt-03 a0a     false
            cluster-3  up/up      xxx.xxx.xxx.xxx/24 clustermgt-03 a0a     false
            cluster-4  up/up      xxx.xxx.xxx.xxx/24 clustermgt-03 a0a     false
clustermgt
            clustermgt-00 up/up   xxx.xxx.xxx.xxx/24 clustermgt-02 e0M     false
clustermgt_01 ALERT nodewatchdog.svc.rpc.noresp: The mgwd service internal to Data ONTAP that is required for continuing data service was unavailable. The service failed, but was unsuccessfully restarted.
clustermgt_01 ERROR Nblade.vldb.Timeout: Request from the network module to the volume location database (VLDB) timed out.
clustermgt_01 ERROR ucore.panicString: 'mgwd: Received SIGABRT (Signal 6) at RIP 0x8195c030c (pid 2409, uid 0, timestamp 1612452192).'
clustermgt_01 ERROR ucore.error: Could not dump core for process ID 2409 (mgwd). Reason: coredump "/mroot/etc/crash/mgwd.2409.0537090897.1612452192.ucore" failed: error 4.
clustermgt_01 ALERT nodewatchdog.svc.rpc.noresp: The mgwd service internal to Data ONTAP that is required for continuing data service was unavailable. The service failed, but was restarted.
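
During triage it can help to confirm that all three events named in the issue description were logged by the same node. The following is a minimal Python sketch, not a NetApp tool. It assumes the log lines have been exported as plain text in the `<node> <SEVERITY> <event>: <message>` layout shown in the example above. The function name `nodes_matching_signature` and the constant names are hypothetical.

```python
import re
from collections import defaultdict

# Event names taken from the issue description in this article.
SIGNATURE = {
    "nodewatchdog.svc.rpc.noresp",
    "Nblade.vldb.Timeout",
    "ucore.panicString",
}

# Assumed line layout: "<node> <SEVERITY> <event.name>: <message>"
LINE_RE = re.compile(r"^(?P<node>\S+)\s+(?P<severity>[A-Z]+)\s+(?P<event>[\w.]+):")


def nodes_matching_signature(lines):
    """Return, sorted, the nodes that logged every event in SIGNATURE."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("event") in SIGNATURE:
            seen[match.group("node")].add(match.group("event"))
    return sorted(node for node, events in seen.items() if events == SIGNATURE)


if __name__ == "__main__":
    sample = [
        "clustermgt_01 ALERT nodewatchdog.svc.rpc.noresp: The mgwd service "
        "internal to Data ONTAP that is required for continuing data service "
        "was unavailable.",
        "clustermgt_01 ERROR Nblade.vldb.Timeout: Request from the network "
        "module to the volume location database (VLDB) timed out.",
        "clustermgt_01 ERROR ucore.panicString: 'mgwd: Received SIGABRT "
        "(Signal 6) at RIP 0x8195c030c (pid 2409, uid 0, timestamp 1612452192).'",
    ]
    print(nodes_matching_signature(sample))  # prints ['clustermgt_01']
```

A node is reported only when all three events are present. A single Nblade.vldb.Timeout entry on its own does not match the signature described here.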