AIQ Unified Manager raises "Cluster Monitoring Failed" or "Cluster cannot be reached" alerts because of network lag to the ONTAP cluster
Applies to
- Active IQ Unified Manager 9.6 and later (UM)
- OnCommand Unified Manager 6.x/7.x/9.x (UM)
- All operating system platforms
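Issue
- Cluster monitoring failed
- "Cluster cannot be reached" alerts are received at random, for random clusters
- Email alerts are sent intermittently and become obsolete after a subsequent poll
- au.log: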
2020-03-19 06:35:07,290 ERROR [pool-3-thread-956] c.o.s.a.d.n.NetAppOCIEArchivePerformancePackage (NetAppOCIEArchivePerformancePackage.java:307) - Failed to get archive file names from zapi. java.net.ConnectException:
Connection timed out (Connection timed out)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
[...]
... 20 more
Wrapped by: com.onaro.sanscreen.acquisition.framework.datasource.DataSourceErrorException: Communication problem with the cluster: <cluster_ip>
at com.onaro.sanscreen.acquisition.framework.datasource.DataSourceErrorException.createWithEnhanced
DataSourceErrorException.java:73) ~[au-framework.jar:9.6.0-2019.06.J5087]
[...]
- ocumserver.log:
[com.netapp.ipc.jms.OCIE_Events] OCIE JMS notification message received: {WarningCount=0, DatasourceName=x.x.x.x, DatasourceID=12,
Error0_ClusterManagementIP=x.x.x.x, PackageName=netappfoundation, TotalReportTime=569, PollStartTime=1591613772703, ErrorCount=1,
Success=false, DurationTime=23248, Error0_Message=Failed to connect to the cluster., TotalZAPITime=-1, NotificationType=PACKAGE_COMPLETED, Error0_Type=NETWORK_ACCESS_FAILURE, UpdateTime=1591613796437, Error0_Port=443, MessageType=PACKAGE_NOTIFICATION,
Error0_Zapi=service-processor-get}
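
When triaging many such entries, it can help to pull the key=value payload out of the ocumserver.log notification and check whether the failed poll was a network access failure. The following is a minimal sketch, not a NetApp tool: the function names `parse_ocie_notification` and `is_network_failure` are hypothetical, and the parsing assumes the payload always has the `{Key=value, Key=value, ...}` shape seen in the excerpt above.

```python
import re


def parse_ocie_notification(text):
    """Return the {Key=value, ...} payload of a notification entry as a dict.

    Assumption: keys are word characters and no value contains ", Key=".
    """
    match = re.search(r"\{(.*)\}", text, re.DOTALL)
    if match is None:
        return {}
    # Collapse line wraps so a wrapped entry parses like a single line.
    body = " ".join(match.group(1).split())
    fields = {}
    # Split only at commas that are followed by the next "Key=" token.
    for pair in re.split(r",\s*(?=\w+=)", body):
        key, _, value = pair.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def is_network_failure(fields):
    """True when the poll failed and the first error is a network access failure."""
    return (
        fields.get("Success") == "false"
        and fields.get("Error0_Type") == "NETWORK_ACCESS_FAILURE"
    )


# Sample entry copied from the ocumserver.log excerpt above.
SAMPLE = """[com.netapp.ipc.jms.OCIE_Events] OCIE JMS notification message received: {WarningCount=0, DatasourceName=x.x.x.x, DatasourceID=12,
Error0_ClusterManagementIP=x.x.x.x, PackageName=netappfoundation, TotalReportTime=569, PollStartTime=1591613772703, ErrorCount=1,
Success=false, DurationTime=23248, Error0_Message=Failed to connect to the cluster., TotalZAPITime=-1, NotificationType=PACKAGE_COMPLETED, Error0_Type=NETWORK_ACCESS_FAILURE, UpdateTime=1591613796437, Error0_Port=443, MessageType=PACKAGE_NOTIFICATION,
Error0_Zapi=service-processor-get}"""

if __name__ == "__main__":
    parsed = parse_ocie_notification(SAMPLE)
    if is_network_failure(parsed):
        print(parsed["DatasourceName"], parsed["Error0_Port"], parsed["Error0_Zapi"])
```

Filtering on `Error0_Type` rather than on the free-text `Error0_Message` keeps the check independent of the message wording, which is an assumption about what stays stable between releases.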