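Brocade switch port flapping due to a faulty switch-side SFP
Applies to
- All Brocade switch hardware platforms
- All Brocade Fabric Operating System (FOS) firmware levels
- Back-end MCC switches
Issue
- The switch port state is shown as "Online" in switchshow: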
/fabos/bin/switchshow :
Index Slot Port Address Media Speed State Proto
============================================================
75 9 11 0f4b40 id N32 Online FC F-Port 10:00:94:40:c9:cf:4a:b1
C3 discards, C3 timeout TX errors, link failures, loss of sync, and uncorrectable (uncor) errors are reported in porterrshow:
porterrshow 9/11
frames enc crc crc too too bad enc disc link loss loss frjt fbsy c3timeout pcs uncor
tx rx in err g_eof shrt long eof out c3 fail sync sig tx rx err err
75: 204.0m 293.1m 0 0 0 0 0 0 0 540 447 284 1 0 0 5400 0 0 96
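When many ports need checking, the porterrshow row above can be read programmatically. The following is a minimal Python sketch, not a Brocade tool: the column names are taken from the two-line header shown above, and the k/m/g suffix handling assumes these abbreviate thousands, millions, and billions.

```python
# Column names follow the two-line porterrshow header shown above.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
    "too_shrt", "too_long", "bad_eof", "enc_out", "disc_c3",
    "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy",
    "c3timeout_tx", "c3timeout_rx", "pcs_err", "uncor_err",
]

# Assumption: k/m/g suffixes abbreviate thousands, millions, and billions.
SUFFIXES = {"k": 1e3, "m": 1e6, "g": 1e9}


def parse_counter(text):
    """Convert a counter such as '204.0m' or '540' to an integer."""
    suffix = text[-1].lower()
    if suffix in SUFFIXES:
        return int(round(float(text[:-1]) * SUFFIXES[suffix]))
    return int(text)


def parse_porterrshow_row(line):
    """Return (port_index, {column: value}) for one porterrshow data row."""
    index, _, rest = line.partition(":")
    values = [parse_counter(v) for v in rest.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(index), dict(zip(COLUMNS, values))


def nonzero_errors(counters):
    """Drop the frame totals and keep only error counters above zero."""
    return {name: value for name, value in counters.items()
            if not name.startswith("frames_") and value > 0}


# Data row copied from the porterrshow output above.
row = "75: 204.0m 293.1m 0 0 0 0 0 0 0 540 447 284 1 0 0 5400 0 0 96"
port, counters = parse_porterrshow_row(row)
print(port, nonzero_errors(counters))
```

For the row above, the non-zero counters are disc_c3, link_fail, loss_sync, loss_sig, c3timeout_tx, and uncor_err, which matches the errors listed in this article.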
The switchshow output shows the port in the "In_Sync" state:
Index Slot Port Address Media Speed State Proto
============================================================
74 9 10 674a40 id N16 In_Sync FC
- Low TX power is reported for the Brocade switch port in the sfpshow output:
=============
Port 4:
=============
Length 62.5u: 0 (units 10 meters)
Length Cu: 0 (units 1 meter)
Vendor Name: BROCADE
Vendor OUI: 00:05:1e
Vendor PN: 57-0000088-01
Vendor Rev: A
Wavelength: 850 (units nm)
Options: 003a Loss_of_Sig,Tx_Fault,Tx_Disable
BR Max: 0
BR Min: 0
Serial No: HAF618230000T4B
RX Power: -3.0 dBm (501.1uW)
TX Power: -7.1 dBm (195.8 uW)
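The RX and TX power lines above can be extracted and sanity-checked with a short script. This is a sketch only: `MIN_TX_DBM_ASSUMED` is a hypothetical, illustrative limit and not a Brocade or SFP vendor specification, so substitute the threshold that applies to the installed SFP type. The dBm to microwatt conversion uses the standard definition that 0 dBm equals 1 mW.

```python
import re

# Matches lines such as "TX Power: -7.1 dBm (195.8 uW)".
POWER_RE = re.compile(r"(RX|TX) Power:\s*(-?\d+(?:\.\d+)?)\s*dBm")

# Hypothetical limit for illustration only; use the SFP's real threshold.
MIN_TX_DBM_ASSUMED = -5.0


def dbm_to_microwatts(dbm):
    """0 dBm is defined as 1 mW, so P(uW) = 1000 * 10^(dBm / 10)."""
    return 1000.0 * 10 ** (dbm / 10.0)


def read_power(sfpshow_text):
    """Return {'RX': dBm, 'TX': dBm} for the power lines found in the text."""
    return {m.group(1): float(m.group(2))
            for m in POWER_RE.finditer(sfpshow_text)}


def low_tx(powers, min_tx_dbm):
    """True when the TX power is below the supplied limit."""
    return powers["TX"] < min_tx_dbm


# Power lines copied from the sfpshow output above.
sample = """\
RX Power: -3.0 dBm (501.1uW)
TX Power: -7.1 dBm (195.8 uW)
"""
powers = read_power(sample)
for direction, dbm in powers.items():
    print(f"{direction}: {dbm} dBm = {dbm_to_microwatts(dbm):.1f} uW")
print("TX below assumed limit:", low_tx(powers, MIN_TX_DBM_ASSUMED))
```

The converted values can differ slightly from the microwatt figures printed by the switch because the dBm values are rounded to one decimal place.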
- Link resets (LR_OUT) are reported in the fabriclog output, together with port Offline and Online events:
Switch 0; Fri Nov 11 11:23:12 2022 IST (GMT+5:30)
22:52:28.290358 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
23:08:39.902472 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
23:17:38.738930 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
23:22:26.529633 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
23:24:29.226184 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
23:25:11.419546 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
23:25:53.721693 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA
15:43:33.967361 SCN Port Offline;rsn=0x2,g=0x2e50 A2,P0 A2,P0 78 NA
15:43:33.967370 *Removing all nodes from port A2,P0 A2,P0 78 NA
15:43:34.615134 SCN LR_PORT(0);g=0x2e50 A2,P0 A2,P0 78 NA
15:43:34.694264 SCN Port Online; g=0x2e50,isolated=0 A2,P0 A2,P1 78 NA
15:43:34.695063 Port Elp engaged A2,P1 A2,P0 78 NA
15:43:34.695079 *Removing all nodes from port A2,P0 A2,P0 78 NA
15:43:34.695304 SCN Port F_PORT A2,P1 A2,P0 78 NA
15:51:04.900869 SCN Port Offline;rsn=0x4,g=0x2e52 A2,P0 A2,P0 78 NA
15:51:04.900878 *Removing all nodes from port A2,P0 A2,P0 78 NA
15:51:04.913521 SCN LR_PORT(0);g=0x2e52 A2,P0 A2,P0 78 NA
15:51:04.986758 SCN Port Online; g=0x2e52,isolated=0 A2,P0 A2,P1 78 NA
15:51:04.986848 Port Elp engaged A2,P1 A2,P0 78 NA
15:51:04.986862 *Removing all nodes from port A2,P0 A2,P0 78 NA
15:51:04.987210 SCN Port F_PORT A2,P1 A2,P0 78 NA
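To see how often each port resets or bounces, the fabriclog entries above can be tallied per port. The sketch below assumes the port number is the second-to-last field on each line, as in the excerpt above; it is an illustration, not a FOS utility.

```python
from collections import Counter, defaultdict

# Event markers as they appear in the fabriclog excerpt above.
EVENTS = ("LR_OUT", "Port Offline", "Port Online")


def count_port_events(fabriclog_lines):
    """Return {port: Counter(event -> occurrences)}."""
    per_port = defaultdict(Counter)
    for line in fabriclog_lines:
        fields = line.split()
        # Assumption: the port number is the second-to-last field,
        # for example "... A2,P0 A2,P0 75 NA". Other lines are skipped.
        if len(fields) < 3 or not fields[-2].isdigit():
            continue
        port = int(fields[-2])
        for event in EVENTS:
            if event in line:
                per_port[port][event] += 1
    return per_port


# Lines copied from the fabriclog output above.
sample = [
    "Switch 0; Fri Nov 11 11:23:12 2022 IST (GMT+5:30)",
    "22:52:28.290358 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA",
    "23:08:39.902472 SCN LR_PORT(0);g=0x19ee LR_OUT A2,P0 A2,P0 75 NA",
    "15:43:33.967361 SCN Port Offline;rsn=0x2,g=0x2e50 A2,P0 A2,P0 78 NA",
    "15:43:34.615134 SCN LR_PORT(0);g=0x2e50 A2,P0 A2,P0 78 NA",
    "15:43:34.694264 SCN Port Online; g=0x2e50,isolated=0 A2,P0 A2,P1 78 NA",
]
events = count_port_events(sample)
for port, counts in sorted(events.items()):
    print(port, dict(counts))
```

A port with a steadily growing LR_OUT count, or with repeated Offline and Online pairs, is a candidate for the SFP checks described in this article.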
sfpshow indicates that RX and TX power are within the optimal range:
Slot 9/Port 11:
=============
RX Power: -2.4 dBm (573.4uW)
TX Power: -1.0 dBm (795.1 uW)
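- In the errdump log, the MAPS rules defALL_TARGET_PORTSSTATE_CHG_3, defRD_1stDATA_TIME_11000, and defRD_STATUS_TIME_12000 are triggered. The STATE_CHG rule indicates that the port state is changing more than 3 times per minute.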
2023/01/28-16:03:41, [MAPS-1003], 187624, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), Current Value:[STATE_CHG, 4], RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.
2023/01/28-16:04:05, [MAPS-1003], 187625, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_OTHER_F_PORTS(STATE_CHG/min>5), Current Value:[STATE_CHG, 6], RuleName=defALL_OTHER_F_PORTSSTATE_CHG_5, Dashboard Category=Port Healt
2023/01/28-16:04:53, [MAPS-1003], 187626, SLOT 1 | FID 128, WARNING, Switch_Name, slot9 port14, F-Port 9/14, Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), Current Value:[STATE_CHG, 4], RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.
2023/04/24-19:37:30 (IST), [MAPS-1003], 193109, SLOT 2 | FID 128, WARNING, Switch_Name, Flow (SID=0x665641,DID=0x663440,Host Port=10/6), Condition=sys_flow_monitor_scsi(RD_1stDATA_TIME/10SEC>11000), Current Value:[RD_1stDATA_TIME, 12952 Microseconds], RuleName=defRD_1stDATA_TIME_11000, Dashboard Category=IO Latency, Quiet Time=10 min.
2023/04/24-19:37:30 (IST), [MAPS-1003], 193110, SLOT 2 | FID 128, WARNING, Switch_Name, Flow (SID=0x665641,DID=0x663440,Host Port=10/6), Condition=sys_flow_monitor_scsi(RD_STATUS_TIME/10SEC>12000), Current Value:[RD_STATUS_TIME, 12953 Microseconds], RuleName=defRD_STATUS_TIME_12000, Dashboard Category=IO Latency, Quiet Time=10 min.
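The MAPS-1003 entries above carry the rule name, the monitored metric, and the measured value in a regular layout, so they can be pulled out with a regular expression. The pattern below is a sketch written against the sample lines in this article; other MAPS message layouts may need a different pattern.

```python
import re

# Pattern derived from the MAPS-1003 sample lines in this article.
MAPS_RE = re.compile(
    r"\[MAPS-1003\].*?"
    r"Condition=(?P<condition>\S+?\([^)]*\)),\s*"
    r"Current Value:\[(?P<metric>\w+),\s*(?P<value>\d+)[^\]]*\],\s*"
    r"RuleName=(?P<rule>\w+)"
)


def triggered_rules(errdump_lines):
    """Return a list of (rule, metric, value) for each MAPS-1003 line."""
    hits = []
    for line in errdump_lines:
        match = MAPS_RE.search(line)
        if match:
            hits.append((match["rule"], match["metric"], int(match["value"])))
    return hits


# Lines copied from the errdump output above.
sample = [
    "2023/01/28-16:04:53, [MAPS-1003], 187626, SLOT 1 | FID 128, WARNING, "
    "Switch_Name, slot9 port14, F-Port 9/14, "
    "Condition=ALL_TARGET_PORTS(STATE_CHG/min>3), "
    "Current Value:[STATE_CHG, 4], "
    "RuleName=defALL_TARGET_PORTSSTATE_CHG_3, Dashboard Category=Port Health.",
    "2023/04/24-19:37:30 (IST), [MAPS-1003], 193110, SLOT 2 | FID 128, WARNING, "
    "Switch_Name, Flow (SID=0x665641,DID=0x663440,Host Port=10/6), "
    "Condition=sys_flow_monitor_scsi(RD_STATUS_TIME/10SEC>12000), "
    "Current Value:[RD_STATUS_TIME, 12953 Microseconds], "
    "RuleName=defRD_STATUS_TIME_12000, Dashboard Category=IO Latency, "
    "Quiet Time=10 min.",
]
rules = triggered_rules(sample)
for rule, metric, value in rules:
    print(rule, metric, value)
```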
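Frame timeout events are reported in the switch errdump log, identifying the port that received the frame (rx) and the port on which it could not be transmitted (tx), along with link reset events: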
2024/02/25-02:26:06 (GMT), [AN-1014], 588, FID 128, INFO, switch, Frame timeout detected, tx port 4 rx port 20, sid 500xx, did 704zz, timestamp 2024-02-25 02:26:06 .
2024/02/25-02:26:07 (GMT), [AN-1014], 608, FID 128, INFO, switch, Frame timeout detected, tx port 4 rx port 0, sid 700xx, did 704zz, timestamp 2024-02-25 02:26:07 .
2024/02/25-02:26:08 (GMT), [C4-1014], 628, CHASSIS | PORT 0/4, WARNING, switch, Link Reset on Port S0,P4(22) vc_no=0 crd(s)lost=6 auto trigger.
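Because each AN-1014 entry names the tx port on which the frame could not be transmitted, counting timeouts per tx port helps narrow down the affected port. The following sketch assumes the message wording shown in the entries above.

```python
import re
from collections import Counter

# Wording taken from the AN-1014 entries above.
TIMEOUT_RE = re.compile(
    r"\[AN-1014\].*Frame timeout detected, tx port (\d+) rx port (\d+)"
)


def timeouts_by_tx_port(errdump_lines):
    """Count frame timeout events per tx port."""
    counts = Counter()
    for line in errdump_lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Lines copied from the errdump output above.
sample = [
    "2024/02/25-02:26:06 (GMT), [AN-1014], 588, FID 128, INFO, switch, "
    "Frame timeout detected, tx port 4 rx port 20, sid 500xx, did 704zz, "
    "timestamp 2024-02-25 02:26:06 .",
    "2024/02/25-02:26:07 (GMT), [AN-1014], 608, FID 128, INFO, switch, "
    "Frame timeout detected, tx port 4 rx port 0, sid 700xx, did 704zz, "
    "timestamp 2024-02-25 02:26:07 .",
    "2024/02/25-02:26:08 (GMT), [C4-1014], 628, CHASSIS | PORT 0/4, WARNING, "
    "switch, Link Reset on Port S0,P4(22) vc_no=0 crd(s)lost=6 auto trigger.",
]
tx_counts = timeouts_by_tx_port(sample)
print(dict(tx_counts))
```

In this sample both timeouts name tx port 4, the same port that logs the C4-1014 link reset.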