Shelf reports NO ACPP, INBACP, FAIL errors
Applies to
- ONTAP 9
- AFF/FAS systems
- IOM12E shelf modules
Issue
- EMS reports the following shelf faults:
[Node-02: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[Node-01: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[Node-01: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[Node-02: statd: callhome.shlf.fault:error]: Call home for SHELF_FAULT
Or
[dsa_worker3: ses.status.ACPError:alert]: DS212-12 (S/N ****) shelf 0 on channel 0b ACP Processor error for SAS shelf ACP processor 1: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[dsa_worker3: ses.status.ACPError:alert]: DS212-12 (S/N ****) shelf 0 on channel 0b ACP Processor error for SAS shelf ACP processor 2: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top right, on shelf module B.
[statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[statd: callhome.shlf.fault:error]: Call home for SHELF_FAULT
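When many such messages are pasted from an EMS log, it can help to tally them by event name and severity. The following is a minimal Python sketch, assuming each message starts with a bracketed header of the form `[source: event.name:severity]:` as in the samples above. The names `parse_ems_line` and `count_events` are hypothetical helpers, not part of ONTAP or any NetApp tool.

```python
import re
from collections import Counter

# Assumed header layout, taken from the sample lines above:
#   [<source>: <event.name>:<severity>]: <message>
# <source> may itself contain a colon (for example "Node-02: statd").
EMS_HEADER = re.compile(
    r"^\[(?P<source>[^\]]*?):\s*(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)$"
)


def parse_ems_line(line):
    """Return a dict with source, event, severity and message, or None if the line does not match."""
    match = EMS_HEADER.match(line.strip())
    return match.groupdict() if match else None


def count_events(lines):
    """Count occurrences of each (event, severity) pair among the lines that parse."""
    counts = Counter()
    for line in lines:
        parsed = parse_ems_line(line)
        if parsed:
            counts[(parsed["event"], parsed["severity"])] += 1
    return counts
```

Lines that do not carry the bracketed header, such as the `Or` separator, are skipped rather than raising an error.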
SP-LASTEST-IPMI reports Controller LED ON:
======================================
hsamcmd --fault-show-all
===============================
tag origin fld fault reason count time
---- ------- ---- ------------- ------ -----
1 bmc /chassis-1/controller-b SAS Expander has set the Controller LED ON 1
Fault Lights On:
/chassis-1 1
/chassis-1/controller-b 1
STORAGE-FAULT output:
::> system node run -node <node_name> -command storage show fault
Enclosure Status: critical
Channel: 0a
Shelf: 0
Shelf Type: DS224-12
Product Serial Number: XXXXXXXXXXXXXXX
Module Type: IOM12E
Enclosure:
Element Status Status Bytes Status Descriptions
1: OK 01,00,02,00 FAIL
Vendor Unique Element 85-IOM12E: (ACP)
Element Status Status Bytes Status Descriptions
1 [IOM12E A] : CRITICAL 02,18,00,40 NO ACPP, INBACP, FAIL
2 [IOM12E B] : CRITICAL 02,18,00,40 NO ACPP, INBACP, FAIL
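To spot this signature in a saved copy of the output, the element rows can be parsed and filtered for the `NO ACPP` and `INBACP` descriptors. The sketch below is illustrative only: it assumes each row has the layout shown above (element number, optional bracketed label, a colon, a status word, four comma-separated status bytes, then the descriptions), and spacing in real output may differ. The function names are hypothetical.

```python
import re

# Assumed row layout, based on the sample output above:
#   <n> [<label>] : <STATUS> <b0>,<b1>,<b2>,<b3> <descriptions>
ELEMENT_ROW = re.compile(
    r"^\s*(?P<element>\d+)\s*(?:\[(?P<label>[^\]]+)\])?\s*:\s*"
    r"(?P<status>[A-Za-z-]+)\s+"
    r"(?P<bytes>[0-9A-Fa-f]{2}(?:,[0-9A-Fa-f]{2}){3})\s*"
    r"(?P<descriptions>.*)$"
)


def parse_element_rows(text):
    """Return one dict per element row; lines that do not look like element rows are ignored."""
    rows = []
    for line in text.splitlines():
        match = ELEMENT_ROW.match(line)
        if not match:
            continue
        row = match.groupdict()
        row["element"] = int(row["element"])
        # Split "NO ACPP, INBACP, FAIL" into individual descriptor strings.
        row["descriptions"] = [d.strip() for d in row["descriptions"].split(",") if d.strip()]
        rows.append(row)
    return rows


def acp_fault_rows(text):
    """Return the rows that are CRITICAL and carry both the NO ACPP and INBACP descriptors."""
    return [
        row for row in parse_element_rows(text)
        if row["status"] == "CRITICAL"
        and {"NO ACPP", "INBACP"} <= set(row["descriptions"])
    ]
```

With the sample above, `acp_fault_rows` would return the rows labelled `IOM12E A` and `IOM12E B`, while the enclosure row with status `OK` is parsed but not flagged.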
The ENVIRONMENT log shows that no information can be read from any sensor.
Channel: 0a
Shelf: 0
SES device path: local access: 0a.00.99
Module type: IOM12E; monitoring is active
Shelf status: information condition
SES Configuration, shelf 0: :
Voltage Sensor installed element list: 1, 2, 3, 4; with error: none
Shelf voltages by element:
[1] <N/A> sensor condition <N/A>
[2] <N/A> sensor condition <N/A>
[3] <N/A> sensor condition <N/A>
[4] <N/A> sensor condition <N/A>
Current Sensor installed element list: 1, 2, 3, 4; with error: none
Shelf currents by element:
[1] 0 mA Normal current range
[2] 0 mA Normal current range
[3] 0 mA Normal current range
[4] 0 mA Normal current range
Cooling Unit installed element list: 1, 2, 3, 4; with error: none
Cooling Units by element:
[1] 0 RPM
[2] 0 RPM
[3] 0 RPM
[4] 0 RPM
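A quick way to confirm the "no sensor data" pattern in a saved ENVIRONMENT section is to extract every per-element reading and check that each one is either `<N/A>` or zero. This is a rough sketch under two assumptions: readings appear as `[n] <N/A>` or `[n] <number> mA|RPM` as in the output above, and a value of 0 is treated as "no reading" purely for this check. `sensor_readings` and `all_unreadable` are hypothetical names.

```python
import re

# Matches "[1] <N/A>", "[2] 0 mA", "[3] 0 RPM"; works whether or not the
# output has been broken onto separate lines.
READING = re.compile(r"\[(\d+)\]\s+(<N/A>|\d+)\s*(mA|RPM)?")


def sensor_readings(text):
    """Return (element, value, unit) tuples; value is None for <N/A>, unit is None if absent."""
    readings = []
    for element, value, unit in READING.findall(text):
        parsed_value = None if value == "<N/A>" else int(value)
        readings.append((int(element), parsed_value, unit or None))
    return readings


def all_unreadable(text):
    """True if at least one reading exists and every reading is <N/A> or 0 (assumption: 0 means no data)."""
    readings = sensor_readings(text)
    return bool(readings) and all(value is None or value == 0 for _, value, _ in readings)
```

For the output above this check would report all twelve readings (four voltages, four currents, four fan speeds) as unreadable.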
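- The issue persists after the Service Processor is rebooted.
- Disabling and re-enabling ACP on the system does not help.
- Power-cycling the IOMs does not resolve the issue.
- Reseating the motherboard results in the same errors.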