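FCP target SRAM dump generated for adapter
Applies to
- All NetApp FAS / AFF hardware platforms
- All Data ONTAP versions
Issue
- Periodic resets of the FCP target adapter (referred to as FCP dumps) affect both ports
Event log: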
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_0: scsitarget.fcp.dump:debug]: FCP target SRAM dump generated for adapter 1a, FW Initiated Dump
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_0: scsitarget.hwpfct.errorReset:notice]: An error was encountered in the FC target driver on Fibre Channel target adapter 1a. The adapter will be automatically reset to clear the status:0x87800000, status1:0x52004c62, status2:0x610102, DIP:1, RN:1, RDY:1, Dump owner:0 condition.
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_1: scsitarget.hwpfct.errorReset:notice]: An error was encountered in the FC target driver on Fibre Channel target adapter 1b. The adapter will be automatically reset to clear the status:0x87800000, status1:0x52004c62, status2:0x610102, DIP:1, RN:1, RDY:1, Dump owner:0 condition.
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_1: ems.engine.suppressed:debug]: Event 'fcp.io.status' suppressed 1 times in last 127375 seconds.
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_1: fcp.io.status:debug]: STIO Adapter 1b resetting with 263 ITNs and 212 commands to drain
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_1: scsitarget.fct.reset:notice]: Resetting Fibre Channel target adapter 1b.
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_0: scsitarget.hwpfct.dump.saved:notice]: A dump for adapter 1a was stored in /etc/log/fctsli_1a_20220120_205904/fct_fw_1a.dmp.gz.
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_0: callhome.fcp.sram.dump:error]: Call home for FCP SRAM DUMP.
Thu Jan 20 20:59:04 +0100 [Node-05: fct_tpd_thread_0: scsitarget.fct.reset:notice]: Resetting Fibre Channel target adapter 1a.
- After the FCP SRAM event, the following STIO event is logged:
STIO Adapter 1b resetting with 263 ITNs and 212 commands to drain
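To check a saved copy of the event log for this pattern, a small script can list which adapter ports were reset and how many ITNs and commands had to drain. The following is a minimal sketch, not a NetApp tool: the function name `summarize` and the regular expressions are assumptions derived only from the message text shown in the event log above, and other ONTAP releases may word these messages differently.

```python
import re

# Sample lines copied from the event log above (timestamps omitted).
LOG_LINES = [
    "[Node-05: fct_tpd_thread_0: scsitarget.fcp.dump:debug]: FCP target SRAM dump generated for adapter 1a, FW Initiated Dump",
    "[Node-05: fct_tpd_thread_1: fcp.io.status:debug]: STIO Adapter 1b resetting with 263 ITNs and 212 commands to drain",
    "[Node-05: fct_tpd_thread_1: scsitarget.fct.reset:notice]: Resetting Fibre Channel target adapter 1b.",
    "[Node-05: fct_tpd_thread_0: scsitarget.fct.reset:notice]: Resetting Fibre Channel target adapter 1a.",
]

# Assumed layout of an EMS line: [node: thread: event.name:severity]: message
EVENT_RE = re.compile(
    r"\[(?P<node>[^:\]]+): (?P<thread>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)"
)
RESET_RE = re.compile(r"Resetting Fibre Channel target adapter (\w+)\.")
DRAIN_RE = re.compile(
    r"STIO Adapter (\w+) resetting with (\d+) ITNs and (\d+) commands to drain"
)


def summarize(lines):
    """Return (sorted list of reset adapters, {adapter: (itns, commands)})."""
    reset_adapters = set()
    drains = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if not match:
            continue  # not an EMS-style line, skip it
        message = match.group("message")
        if match.group("event") == "scsitarget.fct.reset":
            reset = RESET_RE.search(message)
            if reset:
                reset_adapters.add(reset.group(1))
        elif match.group("event") == "fcp.io.status":
            drain = DRAIN_RE.search(message)
            if drain:
                drains[drain.group(1)] = (int(drain.group(2)), int(drain.group(3)))
    return sorted(reset_adapters), drains


if __name__ == "__main__":
    adapters, drains = summarize(LOG_LINES)
    print("Adapters reset:", adapters)
    print("ITNs / commands to drain:", drains)
```

Seeing both ports of the same adapter (here 1a and 1b) in the reset list at the same timestamp matches the symptom described above.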
- EMS / event log show reports errors and alerts related to the adapter ports
IO WQE failure (example):
[NODE-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:5a IO WQE failure, Handle 0x0, Type 8, S_ID: 610601, VPI: 3, OX_ID: 18D, Status 0x3 Ext_Status 0x16
fcp.io.status: state=5 (example):
[NetApp01: fct_tpd_thread_0: fcp.io.status:debug]: STIO Adapter:0d, found hung cmd:0xffffff01dd049f90(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x11e4e0, OX_ID=0x4418, RX_ID=0xffff,SID=0x7004c, Cmd[2A])
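When many fcp.io.status events need to be compared, it can help to pull the individual fields (adapter, OX_ID, status codes, command state) out of each message. The sketch below is an illustration under assumptions: the function name `parse_fcp_io_status` and both patterns are hypothetical and were written only against the two example messages above, so they may need adjusting for other message variants.

```python
import re

WQE_LINE = (
    "[NODE-02: fct_tpd_work_thread_0: fcp.io.status:debug]: STIO Adapter:5a IO WQE failure, "
    "Handle 0x0, Type 8, S_ID: 610601, VPI: 3, OX_ID: 18D, Status 0x3 Ext_Status 0x16"
)
HUNG_LINE = (
    "[NetApp01: fct_tpd_thread_0: fcp.io.status:debug]: STIO Adapter:0d, found hung "
    "cmd:0xffffff01dd049f90(state=5, flags=0x0, ctio_sent=1/1,RecvExAddr=0x11e4e0, "
    "OX_ID=0x4418, RX_ID=0xffff,SID=0x7004c, Cmd[2A])"
)

# Pattern for the "IO WQE failure" form of the message (assumed from the example).
WQE_RE = re.compile(
    r"Adapter:(?P<adapter>\w+) IO WQE failure, Handle (?P<handle>0x[0-9a-fA-F]+), "
    r"Type (?P<type>\d+), S_ID: (?P<s_id>\w+), VPI: (?P<vpi>\d+), "
    r"OX_ID: (?P<ox_id>\w+), Status (?P<status>0x[0-9a-fA-F]+) "
    r"Ext_Status (?P<ext_status>0x[0-9a-fA-F]+)"
)
# Pattern for the "found hung cmd" form; the parenthesised part holds key=value pairs.
HUNG_RE = re.compile(
    r"Adapter:(?P<adapter>\w+), found hung cmd:(?P<cmd_addr>0x[0-9a-fA-F]+)\((?P<fields>.*)\)"
)


def parse_fcp_io_status(line):
    """Return a dict of fields for a recognised message, or None otherwise."""
    match = WQE_RE.search(line)
    if match:
        return {"kind": "wqe_failure", **match.groupdict()}
    match = HUNG_RE.search(line)
    if match:
        fields = match.group("fields")
        result = {
            "kind": "hung_cmd",
            "adapter": match.group("adapter"),
            "cmd_addr": match.group("cmd_addr"),
        }
        # Collect every key=value pair, e.g. state=5 or OX_ID=0x4418.
        result.update(re.findall(r"(\w+)=([^,\s)]+)", fields))
        # The trailing Cmd[..] token has no '=' sign, so handle it separately.
        opcode = re.search(r"Cmd\[(\w+)\]", fields)
        if opcode:
            result["cmd"] = opcode.group(1)
        return result
    return None


if __name__ == "__main__":
    for sample in (WQE_LINE, HUNG_LINE):
        print(parse_fcp_io_status(sample))
```

All values are kept as strings exactly as they appear in the log, so they can be matched against the original messages without any conversion.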