HA pair panics and reboots due to a shelf power supply failure

Category:
ontap-9
Specialty:
hw

Applies to

  • ONTAP 9
  • NS224

Issue

  • Both nodes in the HA pair reboot because they cannot access the disks:

Sat Nov 25 10:17:05 +0000 [netapp01-01: fmmbx_instanceWorker: cf.multidisk.fatalProblem:error]: Node encountered a multidisk error or other fatal error while waiting to be taken over. Permanent errors on all HA mailbox disks (while marshalling header).

Sat Nov 25 10:17:06 +0000 [netapp01-02: fmmbx_instanceWorker: sk.panic:alert]: Panic String: Permanent errors on all HA mailbox disks (while marshalling header) in SK process fmmbx_instanceWorker on release 9.11.1P8 (C)

  • Link-down alerts are logged for the storage ports connected to the shelf:

Sat Nov 25 10:15:39 +0000 [netapp01-01: kernel: netif.linkDown:info]: Ethernet e10b: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-01: intr: netif.linkDown:info]: Ethernet e10b-30: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-01: kernel: netif.linkDown:info]: Ethernet e2a: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-01: intr: netif.linkDown:info]: Ethernet e2a-30: Link down, check cable.

Sat Nov 25 10:15:39 +0000 [netapp01-02: kernel: netif.linkDown:info]: Ethernet e2a: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-02: intr: netif.linkDown:info]: Ethernet e2a-30: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-02: kernel: netif.linkDown:info]: Ethernet e10b: Link down, check cable.
Sat Nov 25 10:15:39 +0000 [netapp01-02: intr: netif.linkDown:info]: Ethernet e10b-30: Link down, check cable.

  • The shelf power supply failure might not be visible in the AutoSupport EMS log
  • The shelf log reports PSU alarms from the power manager:

Sat Nov 25 10:14:42 2023 (  148+23:59:17.135); 030B0060; S1; ENC_MGT; power_manager; 04; PCM 2 local fan power restored
Sat Nov 25 10:14:42 2023 (  148+23:59:17.135); 030B0084; S1; ENC_MGT; power_manager; 02; Clearing PSU AC Missing (non-redundant) alarm
Sat Nov 25 10:14:43 2023 (  148+23:59:18.126); 030B005C; S1; ENC_MGT; power_manager; 04; PCM 2 fault cleared, assume power restored (1600W)
Sat Nov 25 10:14:43 2023 (  148+23:59:18.126); 030B0078; S1; ENC_MGT; power_manager; 02; Clearing PSU Fail (non-redundant) alarm
Sat Nov 25 10:14:51 2023 (  148+23:59:26.123); 030B006F; S1; ENC_MGT; power_manager; 02; PCM 1 DC FAILURE Fault Detected
Sat Nov 25 10:14:51 2023 (  148+23:59:26.123); 030B0072; S1; ENC_MGT; power_manager; 02; Setting FAIL MIN REDUNDANT alarm for PCM 1
Sat Nov 25 10:14:51 2023 (  148+23:59:26.123); 030B005B; S1; ENC_MGT; power_manager; 04; PCM 1 faults indicate loss of power (1600W)
Sat Nov 25 10:14:52 2023 (  148+23:59:27.124); 030B005C; S1; ENC_MGT; power_manager; 04; PCM 1 fault cleared, assume power restored (1600W)
Sat Nov 25 10:14:52 2023 (  148+23:59:27.124); 030B0076; S1; ENC_MGT; power_manager; 02; Clearing PSU Fail (min-redundant) alarm
Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B006F; S1; ENC_MGT; power_manager; 02; PCM 2 PCM FAILURE Fault Detected
Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B0072; S1; ENC_MGT; power_manager; 02; Setting FAIL MIN REDUNDANT alarm for PCM 2
Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B006F; S1; ENC_MGT; power_manager; 02; PCM 2 TURNED OFF Fault Detected
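
The three log excerpts above are easier to read as a single timeline. The following is a minimal Python sketch, not a NetApp tool, that merges EMS lines and shelf log lines and sorts them by time. The regular expressions are inferred only from the sample lines in this article, and the group names (code, slot, area, source, level) are hypothetical labels for the shelf log columns. EMS timestamps carry no year, so the year is passed in as a parameter (2023 here, taken from the shelf log). The sketch also assumes the shelf clock and the ONTAP clock are synchronized and in the same time zone, which may not hold on a real system.

```python
import re
from datetime import datetime

# EMS line format, inferred from the samples in this article.
EMS_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2}) [+-]\d{4} "
    r"\[(?P<node>[^:]+): (?P<proc>[^:]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: "
    r"(?P<msg>.*)$"
)

# Shelf log line format, inferred from the samples; group names are hypothetical.
SHELF_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) \([^)]*\); "
    r"(?P<code>\w+); (?P<slot>\w+); (?P<area>\w+); (?P<source>\w+); "
    r"(?P<level>\w+); (?P<msg>.*)$"
)

TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_line(line, ems_year):
    """Return (timestamp, origin, text), or None if the line matches neither format."""
    line = line.strip()
    m = SHELF_RE.match(line)
    if m:
        ts = datetime.strptime(m["ts"], TS_FORMAT)
        return ts, f"shelf/{m['source']}", m["msg"]
    m = EMS_RE.match(line)
    if m:
        # EMS lines have no year; the caller supplies it (an assumption).
        ts = datetime.strptime(f"{m['ts']} {ems_year}", TS_FORMAT)
        return ts, f"{m['node']}/{m['event']}", m["msg"]
    return None


def build_timeline(lines, ems_year):
    """Parse all recognizable lines and return them sorted by timestamp."""
    parsed = (parse_line(line, ems_year) for line in lines)
    return sorted((e for e in parsed if e is not None), key=lambda e: e[0])


# Three lines copied from the excerpts above, deliberately out of order.
sample = [
    "Sat Nov 25 10:17:06 +0000 [netapp01-02: fmmbx_instanceWorker: sk.panic:alert]: "
    "Panic String: Permanent errors on all HA mailbox disks (while marshalling header) "
    "in SK process fmmbx_instanceWorker on release 9.11.1P8 (C)",
    "Sat Nov 25 10:15:39 +0000 [netapp01-01: kernel: netif.linkDown:info]: "
    "Ethernet e10b: Link down, check cable.",
    "Sat Nov 25 10:14:55 2023 (  148+23:59:30.135); 030B006F; S1; ENC_MGT; "
    "power_manager; 02; PCM 2 TURNED OFF Fault Detected",
]

for ts, origin, text in build_timeline(sample, ems_year=2023):
    print(ts.isoformat(sep=" "), origin, text, sep=" | ")
```

With the sample lines, the shelf power_manager fault sorts first, followed by the link-down event and then the panic, which matches the sequence described in this article.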
