SP reboots due to loss of ACP communication
Applies to
- FAS2240
- FAS2220
Issue
- The SP reboots after the ACP alerts clear. Example EMS log:
- Node 1
[node_name-01: dsa_worker1: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 2: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top right, on shelf module B.
[node_name-01: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[node_name-01: monitor: monitor.globalStatus.critical:EMERGENCY]: Disk shelf fault.
[node_name-01: dsa_worker2: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 1: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[node_name-01: dsa_worker2: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status.
[node_name-01: dsa_worker1: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status.
[node_name-01: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
[node_name-01: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
- Node 2
[node_name-02: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[node_name-02: monitor: monitor.globalStatus.critical:EMERGENCY]: Disk shelf fault.
[node_name-02: dsa_worker1: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 1: critical status ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[node_name-02: dsa_worker3: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status.
[node_name-02: dsa_worker2: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status.
[node_name-02: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
[node_name-02: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
- The SP reboots automatically and logs an event message. Example:
Record 833: Tue Oct 13 18:20:19 2020 [SP.critical]: Rebooting SP due to loss of ACP comms
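- The ACP status is normal and ACP is functioning correctly.
- The e0M management port shows high frame counts and high bytes-per-second rates: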
-- interface e0M (30 days, 20 hours, 46 minutes, 42 seconds) --
RECEIVE
TRANSMIT
>>>Total frames: 2992m | Frames/second: 1122 | Total bytes: 4523g
Bytes/second: 1696k | Total errors: 0 | Errors/minute: 0
Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 90594
-- interface e0M (30 days, 20 hours, 44 minutes, 31 seconds) --
RECEIVE
TRANSMIT
>>>Total frames: 216m | Frames/second: 81 | Total bytes: 322g
Bytes/second: 120k | Total errors: 0 | Errors/minute: 0
Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 90526
…
- The node management LIFs and the intercluster LIFs share the same broadcast domain.
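
To compare e0M traffic across nodes, the per-interface rates can be pulled out of saved statistics text. The following is a minimal Python sketch, not a NetApp tool: the function name `parse_rates` is hypothetical, the field labels are taken from the sample output above, and the k/m/g suffixes are assumed to be decimal multipliers. It only extracts the values; what counts as a "high" rate is left to the reader.

```python
import re

# Suffixes as they appear in the sample output. Treating them as decimal
# multipliers is an assumption made for this sketch.
_SUFFIX = {"": 1, "k": 10**3, "m": 10**6, "g": 10**9}

# Matches only the two rate fields, e.g. "Frames/second: 1122" or "Bytes/second: 1696k".
_FIELD = re.compile(r"(Frames/second|Bytes/second):\s*(\d+)([kmg]?)")

# Matches a section header such as "-- interface e0M (30 days, ...) --".
_HEADER = re.compile(r"^-- interface\s+(\S+)\s+\(")


def parse_rates(text):
    """Return a list of (interface, {field: value}) tuples in order of appearance."""
    results = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = _HEADER.match(line)
        if header:
            # Start a new record for each interface header encountered.
            current = (header.group(1), {})
            results.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first header
        for name, number, suffix in _FIELD.findall(line):
            current[1][name] = int(number) * _SUFFIX[suffix]
    return results


SAMPLE = """\
-- interface e0M (30 days, 20 hours, 46 minutes, 42 seconds) --
>>>Total frames: 2992m | Frames/second: 1122 | Total bytes: 4523g
Bytes/second: 1696k | Total errors: 0 | Errors/minute: 0
-- interface e0M (30 days, 20 hours, 44 minutes, 31 seconds) --
>>>Total frames: 216m | Frames/second: 81 | Total bytes: 322g
Bytes/second: 120k | Total errors: 0 | Errors/minute: 0
"""

if __name__ == "__main__":
    for interface, rates in parse_rates(SAMPLE):
        print(interface, rates)
```

With the sample text, the first record carries the much higher frame and byte rates, which corresponds to the first e0M block shown in the symptoms.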