
Cluster acquisition fails in AIQUM because the CA certificate used for mutual TLS communication has expired

Applies to

  • Active IQ Unified Manager (AIQUM) 9.12 and later
  • ONTAP 9.5 and later
  • Mutual Transport Layer Security (mTLS / Mutual TLS) enabled for the ONTAP cluster

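Issue

  • The AIQUM DASHBOARD shows Cluster discovery failed. Rediscover the cluster after resolving the issue.
  • Cluster discovery shows "Failed" for a newly added cluster.
  • For existing clusters, Operation State is Failed for the Health Poll operation under STORAGE MANAGEMENT > Cluster Setup.
  • The issue can appear after a change in ONTAP or AIQUM. For example, updating the ONTAP version forces AIQUM to exchange certificates again.
  • The events Cluster Monitoring Failed and Mutual TLS Certificate Expire are triggered.
  • Long after the Cluster Monitoring Failed event is triggered, the history pane for volume or aggregate capacity shows Insufficient Historical Data instead of the current and trend capacity lines.
  • Access to the OSM GUI is sometimes blocked.
  • Recent performance graphs are not displayed.
  • Capacity information is not updated.
  • Quota information is not updated, and overage events are not notified.
  • The protection policy of existing relationships shows PKIX path building failed.
  • Recent configuration changes (such as qtree creation) are not reflected.
  • Snapshot restore attempts fail with HTTP error 500, because the snapshot list is unavailable while acquisition is failing.
  • ocumserver.log shows the error: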

INFO [oncommand] [org.springframework.jms.listener.DefaultMessageListenerContainer#0-1] [com.netapp.ipc.jms.OCIE_Events] OCIE JMS notification message received: {WarningCount=0, DatasourceName=<cluster_name>, DatasourceID=1, Error0_ClusterManagementIP=<cluster_name>, PackageName=netappfoundation, TotalReportTime=-1, PollStartTime=1711675762833, ErrorCount=1, Success=false, DurationTime=554, Error0_Message=[Device name <cluster_name>]: Communication problem with the cluster: <cluster_name>, command: system-get-version, error: 'Received fatal alert: certificate_expired' on try 5 out of 5, TotalZAPITime=-1, NotificationType=PACKAGE_COMPLETED, Error0_Type=NETWORK_ACCESS_FAILURE, UpdateTime=1711675763398, Error0_Port=443, MessageType=PACKAGE_NOTIFICATION, Error0_Zapi=system-get-version}
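
The notification above packs its details into a single `{key=value, ...}` payload, which is awkward to read by eye. The following is a minimal Python sketch, not a NetApp tool, that splits such a line into fields and checks for the expired-certificate signature. The function names and the shortened `SAMPLE` line are illustrative assumptions; the field names (`Success`, `Error0_Message`, `Error0_Type`, `PollStartTime`) come from the log line itself. It also assumes `PollStartTime` is a Unix timestamp in milliseconds, which the 13-digit value suggests but the source does not state.

```python
import re
from datetime import datetime, timezone

# Shortened copy of the ocumserver.log line shown above (illustrative only).
SAMPLE = (
    "INFO [oncommand] [com.netapp.ipc.jms.OCIE_Events] "
    "OCIE JMS notification message received: "
    "{WarningCount=0, DatasourceName=<cluster_name>, DatasourceID=1, "
    "PollStartTime=1711675762833, ErrorCount=1, Success=false, "
    "Error0_Message=[Device name <cluster_name>]: Communication problem with "
    "the cluster: <cluster_name>, command: system-get-version, error: "
    "'Received fatal alert: certificate_expired' on try 5 out of 5, "
    "TotalZAPITime=-1, Error0_Type=NETWORK_ACCESS_FAILURE, Error0_Port=443}"
)


def parse_ocie_notification(line: str) -> dict:
    """Return the key=value fields found between the outer braces of the line."""
    payload = line[line.index("{") + 1:line.rindex("}")]
    fields = {}
    # Split only on ", " that is followed by "Name=", so commas inside a
    # value (as in Error0_Message) do not break the field apart.
    for part in re.split(r", (?=\w+=)", payload):
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


def is_expired_cert_failure(fields: dict) -> bool:
    """True when the poll failed and the message names certificate_expired."""
    return (
        fields.get("Success") == "false"
        and "certificate_expired" in fields.get("Error0_Message", "")
    )


def poll_start_utc(fields: dict) -> datetime:
    """Interpret PollStartTime as epoch milliseconds (an assumption)."""
    return datetime.fromtimestamp(int(fields["PollStartTime"]) / 1000, tz=timezone.utc)


if __name__ == "__main__":
    parsed = parse_ocie_notification(SAMPLE)
    print(parsed["Error0_Type"], is_expired_cert_failure(parsed), poll_start_utc(parsed))
```

Splitting on a lookahead rather than on every comma is the main design choice here: it keeps the full `Error0_Message` text intact, which is where the `certificate_expired` alert appears.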

  • au.log shows the errors:
    • ERROR [common-pool-XX] c.o.s.a.d.n.t.z.ZAPIConnection (ZAPIConnection.java:442) - [netappfoundation] <cluster_name> - Communication problem with the cluster: <cluster_name>, command: system-get-version, error: 'Received fatal alert: certificate_expired' on try 5 out of 5
    • WARN  [common-pool-131891] c.o.s.a.d.n.t.z.ZAPIConnection (ZAPIConnection.java:586) - [netappfoundation] <ONTAP_CLUSTER_IP> - <ONTAP_CLUSTER_IP><ONTAP_CLUSTER_IP> - SSL handshake error on system-get-version try 5 out of 5, Received fatal alert: certificate_expired javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_expired
    • WARN  [common-pool-4132] c.o.s.a.d.n.t.z.ZAPIConnection (ZAPIConnection.java:619) - [netappfoundation] <ONTAP_CLUSTER_IP> - while executing ZAPIs on datasource: <ONTAP_CLUSTER_IP> IP: <ONTAP_CLUSTER_IP> for ZAPI: system-get-version, javax.net.ssl.SSLException: Connection has closed: javax.net.ssl.SSLException: Software caused connection abort: socket write error java.net.SocketException: Software caused connection abort: socket write error
         at java.net.SocketOutputStream.socketWrite0(Native Method) ~[?:?]
         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:110) ~[?:?]
         at java.net.SocketOutputStream.write(SocketOutputStream.java:150) ~[?:?]
         at sun.security.ssl.SSLSocketOutputRecord.flush(SSLSocketOutputRecord.java:271) ~[?:?]
      ..
      ERROR [common-pool-4132] c.o.s.a.f.d.BaseDataSource (DataSourceErrorException.java:246) - <ONTAP_CLUSTER_IP> [Error connecting] - Communication problem with the cluster: <ONTAP_CLUSTER_IP> ([Device name <ONTAP_CLUSTER_IP>]: Failed to connect to the cluster.)

Note: If the client-ca certificate was deleted by mistake, au.log shows these errors instead:

ERROR [common-pool-0] c.o.s.a.d.n.t.z.ZAPIConnection (ZAPIConnection.java:629) - [netappfoundation] <ONTAP_CLUSTER_IP> - while executing ZAPIs on datasource: <ONTAP_CLUSTER_IP> IP: <ONTAP_CLUSTER_IP> for ZAPI: system-get-version, netapp.manage.NaAuthenticationException: Authorization failed netapp.manage.NaAuthenticationException: Authorization failed

ERROR [common-pool-19] c.o.s.a.f.d.BaseDataSource (DataSourceErrorException.java:246) - <CLUSTER> [Invalid login credentials] - Failed to log in to the cluster: <CLUSTER> ([Device name <CLUSTER>]: Failed to login to the cluster.)
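
The au.log excerpts above show two distinct signatures: `certificate_expired` for an expired certificate, and `Authorization failed` when the client-ca certificate was deleted. As a rough aid for telling them apart in a long log, here is a small Python sketch that counts lines by signature. It is an illustration only: the function names and the sample lines are assumptions, and the matching is plain substring search on the two strings quoted in this article.

```python
from collections import Counter
from typing import Iterable, Optional

# Substrings taken from the au.log excerpts in this article, mapped to the
# condition the article associates with each of them.
SIGNATURES = {
    "certificate_expired": "certificate used for mTLS has expired",
    "Authorization failed": "client-ca certificate may have been deleted",
}

# Abbreviated sample lines (illustrative, shortened from the excerpts above).
SAMPLE_LINES = [
    "ERROR [common-pool-XX] c.o.s.a.d.n.t.z.ZAPIConnection - Communication "
    "problem with the cluster: <cluster_name>, command: system-get-version, "
    "error: 'Received fatal alert: certificate_expired' on try 5 out of 5",
    "WARN  [common-pool-131891] c.o.s.a.d.n.t.z.ZAPIConnection - SSL handshake "
    "error on system-get-version try 5 out of 5, Received fatal alert: "
    "certificate_expired",
    "ERROR [common-pool-0] c.o.s.a.d.n.t.z.ZAPIConnection - for ZAPI: "
    "system-get-version, netapp.manage.NaAuthenticationException: "
    "Authorization failed",
    "INFO  [main] unrelated line",
]


def classify(line: str) -> Optional[str]:
    """Return the first known signature contained in the line, or None."""
    for needle in SIGNATURES:
        if needle in line:
            return needle
    return None


def summarize(lines: Iterable[str]) -> Counter:
    """Count how many lines carry each known signature."""
    counts = Counter()
    for line in lines:
        signature = classify(line)
        if signature is not None:
            counts[signature] += 1
    return counts


if __name__ == "__main__":
    for signature, count in summarize(SAMPLE_LINES).items():
        print(f"{count:3d}  {signature}: {SIGNATURES[signature]}")
```

To run it against a real file, pass an open file handle to `summarize`, since iterating a file yields its lines. The location of au.log depends on the installation and is not given in this article.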

  • ONTAP reports the mgmtgwd.certificate.expired and/or mgmtgwd.certificate.expiring EMS events:
    • [Node_Name: mgwd: security.invalid.login:alert]: Failed to authenticate login attempt to Vserver: <vserver_name>, username: null, application: ontapi. audit-mlog shows: [kern_audit:info:3385] 8503e8000065373d :: <cluster_name>:ontapi :: <AIQUM_IP>:52346 :: <cluster_name>:null :: Login Attempt :: Error: Authentication failed
    • [Nodename: mgwd: mgmtgwd.certificate.expired:error]: A digital certificate with Fully Qualified Domain Name (FQDN) admin, Serial Number xxxxxxxxxxx, Certificate Authority 'admin' and type client-ca for Vserver svm0 has expired.
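
The mgmtgwd.certificate.expired message above identifies the affected certificate by FQDN, serial number, certificate authority, type, and Vserver. The sketch below, again a hedged Python illustration rather than anything shipped by NetApp, pulls those values out of the message so the matching certificate can be looked up on the cluster. The regular expression is written against the single example line in this article; the wording of the mgmtgwd.certificate.expiring message is not shown here, so that event is deliberately not handled. The function name is hypothetical.

```python
import re
from typing import Optional

# Pattern built from the one mgmtgwd.certificate.expired example in this
# article; other ONTAP versions may word the message differently.
EXPIRED_RE = re.compile(
    r"mgmtgwd\.certificate\.expired:\w+\]: "
    r"A digital certificate with Fully Qualified Domain Name \(FQDN\) (?P<fqdn>[^,]+), "
    r"Serial Number (?P<serial>[^,]+), "
    r"Certificate Authority '(?P<ca>[^']+)' "
    r"and type (?P<type>\S+) "
    r"for Vserver (?P<vserver>\S+) has expired"
)

# The EMS line quoted above, with its placeholder serial number left as-is.
SAMPLE_EMS = (
    "[Nodename: mgwd: mgmtgwd.certificate.expired:error]: A digital certificate "
    "with Fully Qualified Domain Name (FQDN) admin, Serial Number xxxxxxxxxxx, "
    "Certificate Authority 'admin' and type client-ca for Vserver svm0 has expired."
)


def parse_expired_certificate(message: str) -> Optional[dict]:
    """Return the certificate details named in the EMS message, or None."""
    match = EXPIRED_RE.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    details = parse_expired_certificate(SAMPLE_EMS)
    if details is None:
        print("No expired-certificate EMS message found")
    else:
        for name, value in details.items():
            print(f"{name}: {value}")
```

In the example line the certificate type is client-ca, which is the type this article associates with the mutual TLS failure.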

 
