scsi.cmd.abortedByHost: errors reported on a stretch MetroCluster
Applies to
- ONTAP 9
- Two-node bridge-attached stretch MetroCluster
Issue
- On node Node-01, the following scsi.cmd.abortedByHost events are seen for different disk drives over path 2b:
Thu Mar 20 16:35:06 +0100 [Node-01: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 2b.125L62: Command aborted by host adapter: HA status 0x4: cdb 0xea:0bbc6000:01e0.
Thu Mar 20 16:35:09 +0100 [Node-01: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 2b.125L55: Command aborted by host adapter: HA status 0x4: cdb 0x2a:8bb9edf8:0008.
- FC port 2b on Node-01 flaps frequently and raises StorageFCAdapterFault_Alert, as the following events show:
Thu Mar 20 16:36:29 +0100 [Node-01: slifc_asyncd_4: fci.adapter.link.online:info]: Fibre Channel adapter 2b link online.
Thu Mar 20 16:37:10 +0100 [Node-01: slifc_timeout_4: fci.link.error:error]: Could not recover link on Fibre Channel adapter 2b after 30 seconds. Taking the adapter offline.
Thu Mar 20 16:37:10 +0100 [Node-01: dsbridge_admin: bridge.removed:info]: FC-to-SAS bridge 2b.125L0 [ATTO FibreBridge7600N 4.35] S/N [FB7600N106192] was removed.
Thu Mar 20 16:37:20 +0100 [Node-01: nchmd: hm.alert.raised:alert]: Alert Id = StorageFCAdapterFault_Alert , Alerting Resource = 100000109b4ede02 raised by monitor node-connect
Thu Mar 20 16:51:11 +0100 [Node-01: slifc_asyncd_4: fci.adapter.online:info]: Fibre Channel adapter 2b is now online.
Thu Mar 20 16:51:27 +0100 [Node-01: dsbridge_admin: bridge.discovered:info]: FC-to-SAS bridge 2b.125L0 [ATTO FibreBridge7600N 4.35] S/N [FB7600N106192] was discovered.
- While FC port 2b is down, ATTO bridge FB7600N106192 is not accessible, so the node transitions to a mixed-path configuration and reports the following events:
Thu Mar 20 16:48:44 +0100 [Node-01: svc_queue_thread: callhome.dsk.redun.fault:error]: Call home for DISK REDUNDANCY FAILED
Thu Mar 20 16:49:24 +0100 [Node-01: dsa_disc: ses.multipath.ReqError:alert]: SAS disk shelf detected without a multipath configuration.
Thu Mar 20 16:50:03 +0100 [Node-01: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process nchm: SinglePathToDiskShelf_Alert[2937244207926544976].
- The ATTO port statistics indicate link failures, loss of sync, and CRC errors on FC port 1:
FC Port 2100001086b11d80:
State: up
Speed: 16 Gb/s
Topology: point-to-point
Link Failure Count: 263<--------------------
Loss of Sync Count: 492019335
CRC Error Count: 10967
LIP Count: 0
Frames In: 17894
Frames Out: 24957655
SFP Vendor: AVAGO
SFP Part Number: AFBR-57G5MZ-ELX
SFP Serial Number: AN2138G016M
SFP Capabilities: 8, 16,
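When many bridge ports have to be checked, the counters in output like the above can be read programmatically. The following is a minimal sketch, assuming the statistics were saved as plain text in the same "Name: value" layout; the function names parse_port_stats and nonzero_error_counters are hypothetical helpers, not part of any NetApp or ATTO tool.

```python
import re

# Counter names exactly as they appear in the port statistics output.
ERROR_COUNTERS = ("Link Failure Count", "Loss of Sync Count", "CRC Error Count")


def parse_port_stats(text):
    """Map each 'Name: value' line to {name: raw value string}."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            stats[key.strip()] = value.strip()
    return stats


def nonzero_error_counters(stats):
    """Return the error counters whose leading integer is greater than zero."""
    flagged = {}
    for name in ERROR_COUNTERS:
        # Take only the leading digits so trailing annotations
        # such as '<-----' are ignored.
        match = re.match(r"\d+", stats.get(name, ""))
        if match and int(match.group()) > 0:
            flagged[name] = int(match.group())
    return flagged
```

Any counter that this returns on an otherwise idle, healthy link is worth comparing against a second reading taken later, since absolute values alone do not show whether the errors are still increasing.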
- On node Node-02, excessive errors are reported on adapter 1b, along with scsi.cmd.abortedByHost errors:
Mon Mar 17 18:20:45 +0100 [Node-02: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 1b.125L8: Command aborted by host adapter: HA status 0x4: cdb 0x2a:1da79600:0200.
Mon Mar 17 18:31:03 +0100 [Node-02: slifc_intrd: scsi.cmd.abortedByHost:error]: Disk device 1b.125L3: Command aborted by host adapter: HA status 0x4: cdb 0x28:59974688:0008.
Thu Mar 20 01:27:22 +0100 [Node-02: slifc_intrd: scsi.path.excessiveErrors:error]: Excessive errors encountered by adapter 1b on disk device 1b.125.
Thu Mar 20 01:27:22 +0100 [Node-02: slifc_intrd: scsi.cmd.transportErrorEMSOnly:error]: Disk device 1b.125L30: Transport error during execution of command: HA status 0x9: cdb 0x28:84756688:0088.
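To see at a glance which node and adapter the aborted commands cluster on, the EMS lines can be tallied. This is a rough sketch, assuming the log lines follow the format shown above; tally_aborts and LINE_RE are hypothetical names introduced for illustration. The opcode labels rely on the standard SCSI assignments 0x28 = READ(10) and 0x2a = WRITE(10); any other opcode is reported as raw hex.

```python
import re
from collections import Counter

# Matches scsi.cmd.abortedByHost lines and captures node, adapter, and CDB opcode.
LINE_RE = re.compile(
    r"\[(?P<node>[^:\]]+): [^:]+: scsi\.cmd\.abortedByHost:error\]: "
    r"Disk device (?P<adapter>[0-9a-z]+)\.\S+: .*cdb 0x(?P<opcode>[0-9a-f]{2}):"
)

# Standard SCSI block command opcodes; anything else stays as hex.
SCSI_OPCODES = {"28": "READ(10)", "2a": "WRITE(10)"}


def tally_aborts(lines):
    """Count aborted commands per (node, adapter, operation)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an abortedByHost event
        opcode = match.group("opcode")
        operation = SCSI_OPCODES.get(opcode, "0x" + opcode)
        counts[(match.group("node"), match.group("adapter"), operation)] += 1
    return counts
```

Passing the lines of a saved EMS log (for example, an open file object) to tally_aborts yields a Counter keyed by node, adapter, and operation, which makes it easy to confirm that the aborts are confined to a single path such as 2b or 1b.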