CAIQUM-5673: Cluster acquisition fails in AIQUM 9.14 and later due to AMQP errors in acq.log
Issue
- Acquisition fails for all clusters on Active IQ Unified Manager (AIQUM) 9.14 and later
Cluster discovery failed. Rediscover the cluster after resolving the issue
Unable to add cluster datasource. This can occur if the clocks on the systems are not synchronized and the Active IQ Unified Manager HTTPS certificate start date is later than the date on the cluster, or if the cluster has reached the maximum number of EMS notification destinations.
au.log indicates acquisition failures or baseline errors:
WARN [Thread-4] c.n.a.d.a.DataCollectorTrigger (DataCollectorTrigger.java:210) - Failed to connect to Broker, will retry shortly
INFO [Thread-3] c.o.c.u.CredentialStoreUtils (CredentialStoreUtils.java:86) - Successfully retrieved the decrypted value
ERROR [Thread-3] o.a.q.j.JmsConnection (JmsConnection.java:165) - Failed to connect to remote at: amqp://localhost:56072
WARN [Thread-3] c.n.a.d.a.DataCollectorListener (DataCollectorListener.java:186) - Failed to connect to Broker, will retry shortly
ERROR [common-pool-569] c.o.s.a.f.d.BaseDataSource (DataSourceErrorException.java:244) - clustername [Error retrieving data] - Failed to send Baseline request to cluster! ([Device name General Device]: Failed to send Baseline request to cluster!)
2025-03-24 14:58:56,015 ERROR [common-pool-6244] c.o.s.a.f.d.BaseDataSource (DataSourceErrorException.java:246) - cluster-mgmt-ip [Internal error] - Failed in conversion ([Device name General Device]: Failed in conversion)
/jboss/server_acq.log also indicates errors:
ERROR [default task-XXXX] c.n.u.CloudAgentConnectionUtil (CloudAgentConnectionUtil.java:XXX) - Failed to establish connection for cloud agent instance "UnifiedManager_XXXXXX_XXXXXX_XXXXXXX_XXXXXXX".
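The signatures in the excerpts above can be searched for mechanically when triaging a support bundle. The following is a minimal sketch, assuming the log lines have already been read as text. The function name `classify_lines` and the category labels are illustrative and not part of AIQUM; the patterns are copied from the messages shown above.

```python
import re

# Message fragments taken from the log excerpts above.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "broker_connect": re.compile(r"Failed to connect to (Broker|remote at: amqp://)"),
    "baseline": re.compile(r"Failed to send Baseline request to cluster!"),
    "conversion": re.compile(r"Failed in conversion"),
    "cloud_agent": re.compile(r"Failed to establish connection for cloud agent instance"),
}

def classify_lines(lines):
    """Count how many lines match each known failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

A non-zero `broker_connect` count together with `baseline` failures matches the pattern described in this article; the counts alone do not confirm the root cause.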
- Events are displayed in AIQUM Event Management
Event: Cluster Monitoring Failed.
Monitoring failed for cluster cluster. Reason: Incorrect response from one or more Data ONTAP APIs. Contact technical support. Details: Failed to send Baseline request to cluster!
- Acquisition fails at a particular time of day. Restarting or rebooting the server fixes it temporarily, but acquisition then fails again
java.lang.OutOfMemoryError is detected in au.log when parsing CCMA files:
INFO [async-perf-pool-0] c.n.a.d.a.DataCollectorConnection (DataCollectorConnection.java:133) - DataCollectorConnection: wait() <UUID>, removing message from messageMap
ERROR [archive-file-parser-4] c.o.s.a.d.n.a.p.ArchiveFileParser (ArchiveFileParser.java:152) - Message :Error parsing CCMA file : /var/log/ocie/recording/cloudagent/performance/<IP>/<TIMESTAMP>/XXXXX.ccma.gz, java.lang.OutOfMemoryError:Java heap space Exceptions : java.lang.OutOfMemoryError: Java heap space
ERROR [archive-file-parser-3] c.o.s.a.d.n.a.p.ArchiveFileParser (ArchiveFileParser.java:152) - Message :Error parsing CCMA file : /var/log/ocie/recording/cloudagent/performance/<IP>/<TIMESTAMP>/XXXXX.ccma.gz, java.lang.OutOfMemoryError:Java heap space Exceptions : java.lang.OutOfMemoryError: Java heap space
INFO [async-perf-pool-0] c.o.s.a.d.n.a.c.ArchiveFilePairingUtil (ArchiveFilePairingUtil.java:49) - Processing 13 sample(s).
ERROR [Force Acq Listener] c.o.s.a.f.m.ServiceManager (ServiceManager.java:480) - Data Source Events Manager reported an error: Failed to listen to acquisition events! java.lang.OutOfMemoryError: Java heap space
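Because the failure recurs at a particular time of day, it can help to bucket the OutOfMemoryError lines by hour. The following is a minimal sketch, assuming each log line begins with a timestamp in the form shown in the timestamped excerpt earlier in this article (for example `2025-03-24 14:58:56,015`). The function name `oom_by_hour` is illustrative; lines without a leading timestamp are counted under `unknown`.

```python
import re
from collections import Counter

# Leading timestamp such as "2025-03-24 14:58:56,015"; group 1 captures the hour.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} (\d{2}):\d{2}:\d{2},\d{3}")

def oom_by_hour(lines):
    """Count lines mentioning java.lang.OutOfMemoryError, grouped by hour of day."""
    hours = Counter()
    for line in lines:
        if "java.lang.OutOfMemoryError" not in line:
            continue
        match = TIMESTAMP.match(line)
        hours[match.group(1) if match else "unknown"] += 1
    return hours
```

A strong concentration in one or two hours is consistent with the time-of-day symptom described above.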