Cluster join fails due to vifmgr process failure
Applies to
- ONTAP 9.0
- ONTAP 9.1
Issue
Joining a new node to an ONTAP 9.0 or 9.1 cluster might fail with the following error message:
Updating LIF Manager ........................Error: Failed to create Default Broadcast domain. Timeout: Operation "vifmgr_broadcast_domain_perform_cluster_join_iterator::create_imp()" took longer than 25 seconds to complete.
To confirm the root cause, check the following:
1. Confirm that the update-default-broadcast-domain task of the join failed:
cluster::*> set -privilege diagnostic
cluster::*> debug cluster-join show
Task ID              SubTask ID                      Status     Tries Failures
-------------------- ------------------------------- ---------- ----- -------
pre-setup            check-unused-cluster-ports      success    20    0
pre-setup            mtu-check                       success    20    0
pre-setup            ping-local                      success    20    0
pre-setup            ping-remote                     success    20    0
pre-setup            mtu-subnet-test                 success    20    0
pre-setup            rpc-check                       success    20    0
pre-setup            capability-check                success    20    0
network-setup        check-node-mgmt-mtu             success    20    0
network-setup        rename-lifs-nodeuuid            success    20    0
network-setup        relabel-lifs                    success    20    0
network-setup        ping-local2                     success    20    0
network-setup        limit-check                     success    20    0
node-check           check-for-mroot                 success    20    0
node-check           ha-mode-check                   success    20    0
node-check           sfo-partner-check               success    20    0
node-check           platform-check                  success    20    0
node-check           license-check                   success    20    0
node-check           join-switchless                 success    20    0
node-check           get-node-time                   success    20    0
node-check           get-node-name                   success    20    0
node-check           check-node-name                 success    20    0
node-check           resolve-aggr-names              success    20    0
node-check           cluster-ha-check                success    20    0
system-initialize    system-initialize               success    20    0
cluster-join         join-site-list                  success    20    0
cluster-join         wait-for-rdb-online             success    20    0
cluster-join         wait-for-rdb-databases          success    20    0
cluster-join         create_cluster_version_entries  success    20    0
cluster-join         upload-capability               success    20    0
system-startup       system-startup                  success    20    0
check-cluster-apps   vldb                            success    20    0
check-cluster-apps   lifmgr                          success    20    0
check-cluster-apps   bcom                            success    20    0
vldb-update          register-aggregates             success    20    0
vldb-update          register-volumes                success    20    0
vifmgr-update        update-default-broadcast-domain failure    20    1
nonshared-clus-setup nonshared-clus-setup            -          0     0
miscellaneous        rename-lifs-nodename            -          0     0
miscellaneous        get-location                    -          0     0
miscellaneous        register-mgwd-dsmfp-service     -          0     0
miscellaneous        file-replication                -          0     0
miscellaneous        subscribe-host-based-keys       -          0     0
miscellaneous        subscribe-systemshell-ssh-keys  -          0     0
miscellaneous        motd-and-banner-join            -          0     0
miscellaneous        nvfail-setup                    -          0     0
miscellaneous        node-http-config                -          0     0
miscellaneous        upload-licenses-v2              -          0     0
miscellaneous        remove-precluster-cert          -          0     0
miscellaneous        bandwidth-check                 -          0     0
miscellaneous        dummy-task                      -          0     0
finished             finished                        failure    20    2
2. Search /mroot/etc/log/mlog/vifmgr.log for "fg_update_join" messages in which the number of seconds is large and keeps increasing:
[src/rdb/TM.cc 2916 (0x8127f0100)]: RW-transaction TID <16,17042,17042> held by client 1022 for 81795 seconds (created: 1995072s, now: 2076867s) (label 'fg_update_join').
[src/rdb/TM.cc 2916 (0x811c0ec00)]: RW-transaction TID <16,17042,17042> held by client 1022 for 81805 seconds (created: 1995072s, now: 2076877s) (label 'fg_update_join').
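
The check in step 2 can be scripted when the log is long. The following is a minimal sketch, not a NetApp tool: it assumes a copy of vifmgr.log has been saved off the node, and the function names and the `min_seconds` threshold of one hour are hypothetical choices made only for illustration. It collects the "held by client ... for N seconds" values per transaction ID for the fg_update_join label and reports the transactions whose hold time keeps growing.

```python
import re

# Matches the RDB transaction messages shown in step 2, for example:
# "... TID <16,17042,17042> held by client 1022 for 81795 seconds (...) (label 'fg_update_join')."
PATTERN = re.compile(
    r"RW-transaction TID <(?P<tid>[\d,]+)> held by client (?P<client>\d+) "
    r"for (?P<held>\d+) seconds .*\(label '(?P<label>[^']+)'\)"
)


def held_seconds(lines, label="fg_update_join"):
    """Return {tid: [hold times in log order]} for messages with the given label."""
    result = {}
    for line in lines:
        match = PATTERN.search(line)
        if match and match.group("label") == label:
            result.setdefault(match.group("tid"), []).append(int(match.group("held")))
    return result


def stuck_transactions(lines, min_seconds=3600):
    """Return TIDs whose hold time strictly increases and reaches min_seconds.

    min_seconds is an illustrative threshold (assumption), not a documented limit.
    """
    stuck = []
    for tid, values in held_seconds(lines).items():
        increasing = all(a < b for a, b in zip(values, values[1:]))
        if len(values) >= 2 and increasing and values[-1] >= min_seconds:
            stuck.append(tid)
    return stuck


# The two log lines from step 2, used as sample input.
SAMPLE = [
    "[src/rdb/TM.cc 2916 (0x8127f0100)]: RW-transaction TID <16,17042,17042> held by client 1022 "
    "for 81795 seconds (created: 1995072s, now: 2076867s) (label 'fg_update_join').",
    "[src/rdb/TM.cc 2916 (0x811c0ec00)]: RW-transaction TID <16,17042,17042> held by client 1022 "
    "for 81805 seconds (created: 1995072s, now: 2076877s) (label 'fg_update_join').",
]

if __name__ == "__main__":
    print(stuck_transactions(SAMPLE))
```

To run it against a real log, pass the lines of the saved file instead of `SAMPLE`, for example `stuck_transactions(open("vifmgr.log", encoding="utf-8", errors="replace"))`, where the file name is whatever the local copy is called.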