Brocade switch reports defALL_PORTS_OVERSUBSCRIBED events
Applies to
Brocade switches
Issue
The errdump log reports defALL_PORTS_OVERSUBSCRIBED and defALL_PORTS_OVERSUBSCRIPTION_CLEAR events, along with TXQ latency events on F-ports:
2022/05/09-14:19:59, [MAPS-3038], 9094, SLOT 2 | FID 128, WARNING, FAB2, slot11 port6, F-Port 11/6, Condition=ALL_PORTS(PORT_BANDWIDTH==OVERSUBSCRIBED), Current Value:[PORT_BANDWIDTH, OVERSUBSCRIBED, (TXQL=7 us, TX=72.2%) ], RuleName=defALL_PORTS_OVERSUBSCRIBED, Dashboard Category=Fabric Performance Impact.
2022/05/09-14:19:59, [MAPS-3038], 9095, SLOT 2 | FID 128, WARNING, FAB2, slot4 port7, F-Port 4/7, Condition=ALL_PORTS(PORT_BANDWIDTH==OVERSUBSCRIBED), Current Value:[PORT_BANDWIDTH, OVERSUBSCRIBED, (TXQL=6 us, TX=70.2%) ], RuleName=defALL_PORTS_OVERSUBSCRIBED, Dashboard Category=Fabric Performance Impact.
2022/05/09-14:20:59, [MAPS-3039], 9096, SLOT 2 | FID 128, INFO, FAB2, slot11 port6, F-Port 11/6, Condition=ALL_PORTS(PORT_BANDWIDTH==OVERSUBSCRIPTION_CLEAR), Current Value:[PORT_BANDWIDTH, OVERSUBSCRIPTION_CLEAR], RuleName=defALL_PORTS_OVERSUBSCRIPTION_CLEAR, Dashboard Category=Fabric Performance Impact.
2022/05/09-14:20:59, [MAPS-3039], 9097, SLOT 2 | FID 128, INFO, FAB2, slot4 port7, F-Port 4/7, Condition=ALL_PORTS(PORT_BANDWIDTH==OVERSUBSCRIPTION_CLEAR), Current Value:[PORT_BANDWIDTH, OVERSUBSCRIPTION_CLEAR], RuleName=defALL_PORTS_OVERSUBSCRIPTION_CLEAR, Dashboard Category=Fabric Performance Impact.
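The entries above come in pairs: a MAPS-3038 warning when a port becomes oversubscribed and a MAPS-3039 message when the condition clears. A minimal Python sketch of how such lines could be parsed to measure how long each port stayed oversubscribed is shown here. The function names, the regular expressions, and the abbreviated sample lines are illustrative assumptions based only on the log format shown in this article; they are not part of any Brocade tool.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the sample entries in this article:
# timestamp, optional "(TZ)", [MAPS-nnnn], sequence number, ..., severity, ..., F-Port s/p
MAPS_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2})"
    r"(?: \([A-Z]+\))?, "
    r"\[(?P<msg_id>MAPS-\d+)\], (?P<seq>\d+), "
    r".*?(?P<severity>WARNING|INFO|ERROR|CRITICAL), "
    r".*?F-Port (?P<port>\d+/\d+), "
)
# The rule name appears either as "RuleName=def..." or as "Rule def... triggered".
RULE = re.compile(r"(?:RuleName=|Rule )(?P<rule>def\w+)")


def parse_maps_line(line):
    """Return a dict of fields for one MAPS log line, or None if it does not match."""
    header = MAPS_LINE.match(line.strip())
    rule = RULE.search(line)
    if header is None or rule is None:
        return None
    event = header.groupdict()
    event["time"] = datetime.strptime(event.pop("ts"), "%Y/%m/%d-%H:%M:%S")
    event["rule"] = rule.group("rule")
    return event


def oversubscription_durations(lines):
    """Pair OVERSUBSCRIBED and OVERSUBSCRIPTION_CLEAR events per port.

    Returns a list of (port, seconds) tuples in the order the clears were seen.
    """
    started = {}
    durations = []
    for line in lines:
        event = parse_maps_line(line)
        if event is None:
            continue
        port = event["port"]
        if event["rule"] == "defALL_PORTS_OVERSUBSCRIBED":
            started[port] = event["time"]
        elif event["rule"] == "defALL_PORTS_OVERSUBSCRIPTION_CLEAR" and port in started:
            elapsed = event["time"] - started.pop(port)
            durations.append((port, elapsed.total_seconds()))
    return durations


# Abbreviated copies of two entries from this article (text after RuleName dropped).
SAMPLE_LINES = [
    "2022/05/09-14:19:59, [MAPS-3038], 9094, SLOT 2 | FID 128, WARNING, FAB2, "
    "slot11 port6, F-Port 11/6, Condition=ALL_PORTS(PORT_BANDWIDTH==OVERSUBSCRIBED), "
    "Current Value:[PORT_BANDWIDTH, OVERSUBSCRIBED, (TXQL=7 us, TX=72.2%) ], "
    "RuleName=defALL_PORTS_OVERSUBSCRIBED",
    "2022/05/09-14:20:59, [MAPS-3039], 9096, SLOT 2 | FID 128, INFO, FAB2, "
    "slot11 port6, F-Port 11/6, Condition=ALL_PORTS(PORT_BANDWIDTH==OVERSUBSCRIPTION_CLEAR), "
    "Current Value:[PORT_BANDWIDTH, OVERSUBSCRIPTION_CLEAR], "
    "RuleName=defALL_PORTS_OVERSUBSCRIPTION_CLEAR",
]

if __name__ == "__main__":
    for port, seconds in oversubscription_durations(SAMPLE_LINES):
        print(f"F-Port {port} was oversubscribed for {seconds:.0f} s")
```

For the two sample lines, the port is reported as oversubscribed for the one minute between the warning and the clear message. Short, self-clearing intervals like this are consistent with a transient load peak rather than a fault.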
The errdump log also shows the defALL_HOST_PORTSTX_95 rule being triggered:
2022/12/18-05:19:50, [MAPS-1005], 7594245, SLOT 1 | FID 128, WARNING, NETM_X7_112_FAB2, slot10 port41-H.LIN.UPI-DB1.10.225.229.151.G7BY6T2.HBUPIDRDB1_40, F-Port 10/41, Condition=ALL_HOST_PORTS(TX/min>95.00), Current Value:[TX, 95.38 %], Rule defALL_HOST_PORTSTX_95 triggered 24 times in 1 hour and last trigger time Sun Dec 18 04:48:52 2022, Dashboard Category=Fabric Performance Impact.
defPORT_MAX_WR_PENDING_IO_250 alerts observed in the EMS log:
2024/07/16-23:00:29 (IST), [MAPS-1003], 352645, SLOT 1 | FID 128 | PORT 11/14, WARNING, Sxxy_X7_1xx_Fxx2, slot11 port14-Storage.E.UPI_5600_40850_CL8F.(Dedicated_Linux).3F, F-Port 11/14, Condition=ALL_TARGET_PORTS(MAX_WR_PENDING_IO/10SEC>250), Current Value:[PORT_MAX_WR_PENDING_IO, 251 IOs], RuleName=defPORT_MAX_WR_PENDING_IO_250, Dashboard Category=IO Latency, Quiet Time=10 min.
2024/07/16-23:10:49 (IST), [MAPS-1005], 352694, SLOT 1 | FID 128 | PORT 11/14, WARNING, Sxxy_X7_1xx_Fxx2, slot11 port14-Storage.E.UPI_5600_40850_CL8F.(Dedicated_Linux).3F, F-Port 11/14, Condition=ALL_TARGET_PORTS(MAX_WR_PENDING_IO/10SEC>250), Current Value:[PORT_MAX_WR_PENDING_IO, 253 IOs], Rule defPORT_MAX_WR_PENDING_IO_250 triggered 8 times in 10 min and last trigger time Tue Jul 16 23:10:29 2024, Dashboard Category=IO Latency.
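- High utilization is not a fabric fault. Because it is not a repairable fault, there is little we can do to help with it.
- It is a design and/or usage issue: the traffic conditions (load demand) push utilization beyond the available bandwidth.
- The switch processes incoming requests from the end devices connected to its ports. If the requested workload drives utilization up at any point in time and TX utilization on a switch port exceeds 95%, the defALL_HOST_PORTSTX_95 rule is triggered.
- Longer response times are observed on the storage device due to buffer credit constraints.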
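The threshold behind defALL_HOST_PORTSTX_95 is the condition ALL_HOST_PORTS(TX/min>95.00) shown in the log entry earlier. The sketch below illustrates that comparison on per-minute TX utilization samples. The function name and the sample data are hypothetical (only the 95.38 % value comes from the log above), and the code is an illustration of the condition, not of how MAPS itself is implemented.

```python
def minutes_over_threshold(samples, threshold=95.0):
    """Return the labels of the samples whose TX utilization exceeds the threshold.

    samples: iterable of (label, tx_percent) pairs, one per minute.
    The comparison is strictly greater-than, mirroring TX/min>95.00.
    """
    return [label for label, tx_percent in samples if tx_percent > threshold]


# Hypothetical per-minute samples; only 95.38 is taken from the log entry.
SAMPLES = [
    ("04:47", 93.10),
    ("04:48", 95.38),
    ("04:49", 95.00),
]

if __name__ == "__main__":
    print(minutes_over_threshold(SAMPLES))
```

A sample of exactly 95.00 % does not satisfy the condition, since the rule uses a strict greater-than comparison.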