
Both nodes in an HA pair reboot due to power loss

Category:
aff-series<a>2008909305</a>
Specialty:
hw
Applies to

  • FAS systems
  • AFF systems

Issue

  • Both nodes in the HA pair reboot at the same time.
  • Example EMS log entries (repeated on both nodes) show DC undervoltage and AC failure on both PSUs:

[node_name: dsa_worker3: ses.status.psWarning:error]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power warning for Power supply 1: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom left.
[node_name: dsa_worker4: ses.status.psError:alert]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power error for Power supply 1: critical status; AC Fail. This module is on the rear of the shelf at the bottom left.
[node_name: dsa_worker4: callhome.shlf.power.intr:error]: Call home for SHELF POWER INTERRUPTED
[node_name: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0b. Check fans, power supplies, disks, and temperature sensors.
[node_name: power_low_monitor: monitor.chassisPower.degraded:alert]: Chassis power is degraded: Power Supply Status Critical: PSU1.
[node_name: power_low_monitor: callhome.chassis.power:error]: Call home for CHASSIS POWER DEGRADED: Power Supply Status Critical: PSU1.
[node_name: monitor: monitor.globalStatus.critical:EMERGENCY]: Power Supply Status Critical: PSU1. Disk shelf fault.
[node_name: dsa_worker2: ses.status.psInfo:info]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power supply information for Power supply 1: normal status.
[node_name: monitor: monitor.globalStatus.critical:EMERGENCY]: Power Supply Status Critical: PSU1.
[node_name: power_low_monitor: monitor.chassisPowerSupplies.ok:info]: Chassis power supplies OK.
[node_name: dsa_worker0: ses.status.psWarning:error]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power warning for Power supply 2: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom right.
[node_name: dsa_worker2: ses.status.psWarning:error]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power warning for Power supply 2: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom right.
[node_name: dsa_worker2: callhome.shlf.ps.fault:error]: Call home for SHELF POWER SUPPLY WARNING
[node_name: dsa_worker0: ses.status.psWarning:error]: DS224-12 (S/N 012345678910) shelf 0 on channel 0b power warning for Power supply 2: warning status; DC undervoltage. This module is on the rear of the shelf at the bottom right.
[node_name: dsa_worker3: callhome.shlf.ps.fault:error]: Call home for SHELF POWER SUPPLY WARNING
[node_name: dsa_worker3: callhome.shlf.ps.fault:error]: Call home for SHELF POWER SUPPLY WARNING
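
When many of these entries repeat across both nodes, it can help to reduce them to one line per shelf power supply. The following is a minimal Python sketch, not a NetApp tool. It assumes the EMS lines have the same `[node: process: event:severity]: message` layout as the sample above; the function name `summarize_shelf_psu` and both regular expressions are hypothetical and derived only from that sample.

```python
import re
from collections import defaultdict

# Layout assumed from the sample lines: [node: process: event.name:severity]: message
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:]+): (?P<proc>[^:]+): (?P<event>[\w.]+):(?P<sev>\w+)\]: (?P<msg>.*)$"
)
# Shelf PSU messages in the sample read: "Power supply N: <status> status; <condition>."
SHELF_PSU = re.compile(
    r"Power supply (?P<psu>\d+): (?P<status>\w+) status(?:; (?P<cond>[^.]+))?"
)


def summarize_shelf_psu(lines):
    """Return {shelf PSU number: set of conditions reported for that PSU}."""
    conditions = defaultdict(set)
    for line in lines:
        m = EMS_LINE.match(line.strip())
        # Only ses.status.ps* events carry a per-PSU condition in the sample.
        if not m or not m.group("event").startswith("ses.status.ps"):
            continue
        psu = SHELF_PSU.search(m.group("msg"))
        # Entries such as "normal status." have no condition and are skipped.
        if psu and psu.group("cond"):
            conditions[int(psu.group("psu"))].add(psu.group("cond"))
    return dict(conditions)
```

Feeding the function the lines of an EMS log file (for example `summarize_shelf_psu(open(path))`, where `path` is a placeholder) yields a dictionary keyed by PSU number, which makes it easier to see whether both power supplies reported a condition in the same time window.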

  • Example BMC/SP events reporting the power loss (repeated on both nodes at the same time):

Record 2435: Mon Dec 05 22:33:43.000000 2022 [BMC.emergency]: System input power lost
Record 2436: Sun Jan 01 00:00:22.310000 2017 [IPMI.notice]: 05f2 | c0 | OEM: ffff7000ff00 | ManufId: 150300 | BMC Power Reset
Record 2437: Sun Jan 01 00:00:22.330000 2017 [IPMI.notice]: 05f3 | c0 | OEM: fcff70560000 | ManufId: 150300 | POS Register: Power on Reset(Normal Power Cycle)

Record 1596: Sat Sep 11 08:03:16 2021 [SP.emergency]: System input power lost
Record 1597: Thu Jan  1 00:00:32 1970 [IPMI.notice]: ce01 | c0 | OEM: ffff7000ff00 | ManufId: 150300 | SP Power Reset
Record 1598: Thu Jan  1 00:00:32 1970 [IPMI.notice]: cf01 | c0 | OEM: fcff70560000 | ManufId: 150300 | POS Register: Power on Reset(Normal Power Cycle)
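
In both samples above, the records that follow `System input power lost` carry timestamps (2017 and 1970) that are earlier than the power-loss record itself. The Python sketch below looks for that pattern. It is an illustration only: the record layout is inferred from the two samples, the names `parse_record` and `power_loss_events` are hypothetical, and reading the backward timestamp as a clock that was reset by the power cycle is an assumption, not something the article states.

```python
import re
from datetime import datetime

# Layout assumed from the samples:
# Record N: Dow Mon DD HH:MM:SS[.ffffff] YYYY [SOURCE.severity]: message
RECORD = re.compile(
    r"^Record (?P<num>\d+): \w{3} (?P<mon>\w{3})\s+(?P<day>\d{1,2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2})(?:\.\d+)? (?P<year>\d{4}) "
    r"\[(?P<source>\w+)\.(?P<sev>\w+)\]: (?P<msg>.*)$"
)


def parse_record(line):
    """Parse one event record; return None for lines that do not match."""
    m = RECORD.match(line.strip())
    if m is None:
        return None
    stamp = datetime.strptime(
        f"{m['mon']} {m['day']} {m['time']} {m['year']}", "%b %d %H:%M:%S %Y"
    )
    return {"num": int(m["num"]), "time": stamp, "source": m["source"],
            "sev": m["sev"], "msg": m["msg"]}


def power_loss_events(lines):
    """Return (record number, timestamp, clock_went_back) per power-loss record."""
    records = [r for r in map(parse_record, lines) if r]
    hits = []
    for i, rec in enumerate(records):
        if rec["msg"] != "System input power lost":
            continue
        nxt = records[i + 1] if i + 1 < len(records) else None
        # True when the next record is stamped earlier than the power-loss record.
        clock_went_back = nxt is not None and nxt["time"] < rec["time"]
        hits.append((rec["num"], rec["time"], clock_went_back))
    return hits
```

Running this over the event log of each node and comparing the returned timestamps is one way to confirm that both nodes lost input power at the same moment.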

  • Example BMC/SP system log entries reporting power problems (repeated on both nodes at the same time):

BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x32 dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x34 dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC hsam[1426]: FRU /chassis-1 LED on
BMC hsam[1426]: FRU /chassis-1/controller-b/cna-3 LED on
BMC hsam[1426]: HSAM OS(bmc):cmd(set) FLD(cna-4):fault(Overcurrent Protection Fault)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x5b dir:3) match (15) ALERT
BMC hsam[1426]: FRU /chassis-1 LED on
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC hsam[1426]: FRU /chassis-1/controller-b/cna-4 LED on
BMC hsam[1426]: HSAM OS(bmc):cmd(set) FLD(cna-1):fault(Overcurrent Protection Fault)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x5d dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: EventFilter: event on sensor(#0x5e dir:3) match (15) ALERT
BMC IPMIMain[1142]: [1142 : 1167 INFO]PEF.c: Power Action:needed(0) action(0); Alert Action: needed(1) action(17)
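
The log above interleaves PEF alert lines with HSAM fault lines, so a short script can pull out just the sensor numbers and the reported faults. This is a hedged Python sketch based solely on the line shapes in the sample; the function name `summarize_bmc_log` and both patterns are hypothetical. The article does not say which physical sensors the numbers (for example `#0x32`) refer to, so the sketch reports them as-is without interpreting them.

```python
import re

# PEF alert lines in the sample: "EventFilter: event on sensor(#0x32 dir:3) match (15) ALERT"
PEF_SENSOR = re.compile(
    r"EventFilter: event on sensor\(#(?P<sensor>0x[0-9a-fA-F]+) dir:(?P<dir>\d+)\)"
)
# HSAM fault lines in the sample: "FLD(cna-4):fault(Overcurrent Protection Fault)"
HSAM_FAULT = re.compile(r"FLD\((?P<fld>[^)]+)\):fault\((?P<fault>[^)]+)\)")


def summarize_bmc_log(lines):
    """Return (sensor ids in order of appearance, [(FLD value, fault text), ...])."""
    sensors, faults = [], []
    for line in lines:
        s = PEF_SENSOR.search(line)
        if s:
            sensors.append(s.group("sensor"))
        f = HSAM_FAULT.search(line)
        if f:
            faults.append((f.group("fld"), f.group("fault")))
    return sensors, faults
```

Comparing the output for both controllers shows whether the same sensors and the same fault text appear on each node at the same time, which is the pattern this article describes.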

  • The issue persists after the PSUs and/or controllers are reseated or replaced.


NetApp provides no representations or warranties regarding the accuracy or reliability or serviceability of any information or recommendations provided in this publication or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.