Ethernet ports go down with CRC errors
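Applies to
- NetApp ONTAP 9.18.1 ASA r2
- ASA-A30 (All-SAN Array)
- Ethernet ports (e1a, e1b) in an interface group (ifgrp/LACP)
- Sites where hardware (a switch) was recently replaced
- Environments using jumbo frames (MTU 9000)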
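Issue
Ethernet ports e1a and e1b on both HA nodes intermittently go down, reporting a high number of CRC errors and a degraded interface group state. The problem began after a switch was replaced. Disabling and re-enabling a port brings it back up, but CRC errors persist while the MTU is set to 9000. Lowering the MTU to 1500 stops the CRC errors immediately.
Example log output: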
Wed Apr 01 15:15:07 +0000 [node-02: intr: net.ifgrp.lacp.link.inactive:error]: ifgrp a0a, port e1b has transitioned to an inactive state. The interface group is in a degraded state.
Wed Apr 01 16:15:22 +0000 [node-02: clock: net.ifgrp.lacp.link.active:notice]: ifgrp a0a, port e1b has transitioned to the active state.
Wed Apr 01 21:15:07 +0000 [node-02: intr: net.ifgrp.lacp.link.inactive:error]: ifgrp a0a, port e1b has transitioned to an inactive state. The interface group is in a degraded state.

Reason:
Wed Apr 01 15:15:07 +0000 [node-02: mgmt_port_link_status_poll: netif.linkDown:info]: Ethernet e1b: Link down, check cable.
Wed Apr 01 15:15:07 +0000 [node-02: intr: net.ifgrp.lacp.link.inactive:error]: ifgrp a0a, port e1b has transitioned to an inactive state. The interface group is in a degraded state.
Wed Apr 01 15:15:07 +0000 [node-02: vifmgr: vifmgr.portdown:notice]: A link down event was received on node node-02, port e1b.
Wed Apr 01 15:15:07 +0000 [node-02: vifmgr: vifmgr.cluscheck.hwerrors:alert]: Port e1b on node node-02 is reporting a high number (at least 1 per 1000 packets) of observed hardware errors (CRC, length, alignment, dropped).
Wed Apr 01 15:15:07 +0000 [node-02: vifmgr: vifmgr.port.monitor.failed:error]: The "crc_errors" health check for port e1b (node-02) has failed. The port is operating in a degraded state.

And after that, flapping:
Sun Apr 05 18:15:35 +0000 [node-02: mgmt_port_link_status_poll: netif.linkUp:info]: Ethernet e1b: Link up.
Sun Apr 05 18:15:35 +0000 [node-02: vifmgr: vifmgr.portup:notice]: A link up event was received on node node-02, port e1b.
Sun Apr 05 18:15:41 +0000 [node-02: clock: net.ifgrp.lacp.link.active:notice]: ifgrp a0a, port e1b has transitioned to the active state.
Sun Apr 05 18:16:44 +0000 [node-02: vifmgr: vifmgr.reach.ok:notice]: Network port e1b on node node-02 can reach its expected broadcast domain Default:Replication. No other broadcast domains appear to be reachable from this port.
Mon Apr 06 00:15:20 +0000 [node-02: mgmt_port_link_status_poll: netif.linkDown:info]: Ethernet e1b: Link down, check cable.
Mon Apr 06 00:15:20 +0000 [node-02: intr: net.ifgrp.lacp.link.inactive:error]: ifgrp a0a, port e1b has transitioned to an inactive state. The interface group is in a degraded state.
Mon Apr 06 00:15:20 +0000 [node-02: vifmgr: vifmgr.portdown:notice]: A link down event was received on node-02, port e1b.
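When a saved EMS log is long, tallying the link up and link down events per port shows how often a port is flapping. The following is a minimal Python sketch, not a NetApp tool. The function name `count_link_events` and the regular expression are illustrative assumptions based only on the message layout seen in the log lines above.

```python
import re
from collections import Counter

# Sample EMS lines copied from the log output in this article.
SAMPLE = """\
Wed Apr 01 15:15:07 +0000 [node-02: mgmt_port_link_status_poll: netif.linkDown:info]: Ethernet e1b: Link down, check cable.
Sun Apr 05 18:15:35 +0000 [node-02: mgmt_port_link_status_poll: netif.linkUp:info]: Ethernet e1b: Link up.
Mon Apr 06 00:15:20 +0000 [node-02: mgmt_port_link_status_poll: netif.linkDown:info]: Ethernet e1b: Link down, check cable.
"""

# Assumed layout: "[<node>: <process>: <event>:<severity>]: Ethernet <port>: ..."
EVENT_RE = re.compile(
    r"\[(?P<node>[^:\]]+): [^:]+: (?P<event>netif\.link(?:Up|Down)):\w+\]: "
    r"Ethernet (?P<port>\w+):"
)


def count_link_events(text):
    """Count netif.linkUp / netif.linkDown events per (node, port, event)."""
    counts = Counter()
    for match in EVENT_RE.finditer(text):
        counts[(match["node"], match["port"], match["event"])] += 1
    return counts


if __name__ == "__main__":
    for (node, port, event), n in sorted(count_link_events(SAMPLE).items()):
        print(f"{node} {port} {event}: {n}")
```

Only the `netif.linkUp` and `netif.linkDown` events are counted here; the LACP and vifmgr messages describe the same transitions and would double-count them.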
RECEIVE
 Total frames: 7182k | Frames/second: 0 | Total bytes: 850m
 Bytes/second: 0 | Total errors: 2394 | Errors/minute: 0   <<<<< CRC errors from environment
 Total discards: 0 | Discards/minute: 0 | Multi/broadcast: 1714k
 Non-primary u/c: 0 | Errored frames: 0 | Unsupported Op: 0
 CRC errors: 2394 | Runt frames: 0 | Fragment: 0
 Long frames: 0 | Jabber: 0 | Length errors: 0
 Alignment errors: 0 | No buffer: 0 | Pause: 0
 Jumbo: 0 | Error symbol: 0 | Bus overruns: 0   <<<<< no local errors
 Queue drops: 0 | LRO segments: 5466k | LRO bytes: 646m
 LRO6 segments: 0 | LRO6 bytes: 0 | Bad UDP cksum: 0
 Bad UDP6 cksum: 0 | Bad TCP cksum: 0 | Bad TCP6 cksum: 0
 Mcast v6 solicit: 0 | Lagg errors: 0 | Lacp errors: 0
 Lacp PDU errors: 0
TRANSMIT
 Total frames: 16121k | Frames/second: 0 | Total bytes: 4743m
 Bytes/second: 0 | Total errors: 0 | Errors/minute: 0
 Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 1759k
 Collisions: 0 | Pause: 0 | Jumbo: 1562
 Cfg Up to Downs: 6 | TSO segments: 207k | TSO bytes: 2824m
 TSO6 segments: 0 | TSO6 bytes: 0 | HW UDP cksums: 0
 HW UDP6 cksums: 0 | HW TCP cksums: 0 | HW TCP6 cksums: 0
 Mcast v6 solicit: 0 | Lagg drops: 0 | Lagg no buffer: 0
 Lagg no entries: 0 | Tx No Buf: 0
DEVICE
 Mcast addresses: 3 | Rx MBuf Sz: 4096
LINK INFO
 Speed: 0 | Duplex: full | Flowcontrol: full
 Media state: no carrier | Up to downs: 12 | HW assist: 514k   <<<<< Down with flapping

-- interface e1b (40 days, 10 hours, 4 minutes, 55 seconds) --
RECEIVE
 Total frames: 47775k | Frames/second: 14 | Total bytes: 5922m
 Bytes/second: 1696 | Total errors: 5018 | Errors/minute: 0   <<<<<< CRC errors
 Total discards: 0 | Discards/minute: 0 | Multi/broadcast: 3027k
 Non-primary u/c: 0 | Errored frames: 0 | Unsupported Op: 0
 CRC errors: 5018 | Runt frames: 0 | Fragment: 0
 Long frames: 0 | Jabber: 0 | Length errors: 0
 Alignment errors: 0 | No buffer: 0 | Pause: 0
 Jumbo: 0 | Error symbol: 0 | Bus overruns: 0   <<<<<< no local errors
 Queue drops: 0 | LRO segments: 44733k | LRO bytes: 5621m
 LRO6 segments: 0 | LRO6 bytes: 0 | Bad UDP cksum: 0
 Bad UDP6 cksum: 0 | Bad TCP cksum: 0 | Bad TCP6 cksum: 0
 Mcast v6 solicit: 0 | Lagg errors: 0 | Lacp errors: 0
 Lacp PDU errors: 0
TRANSMIT
 Total frames: 47104k | Frames/second: 13 | Total bytes: 14245m
 Bytes/second: 4079 | Total errors: 0 | Errors/minute: 0
 Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 2113k
 Collisions: 0 | Pause: 0 | Jumbo: 4387
 Cfg Up to Downs: 0 | TSO segments: 631k | TSO bytes: 8605m
 TSO6 segments: 0 | TSO6 bytes: 0 | HW UDP cksums: 0
 HW UDP6 cksums: 0 | HW TCP cksums: 0 | HW TCP6 cksums: 0
 Mcast v6 solicit: 0 | Lagg drops: 0 | Lagg no buffer: 0
 Lagg no entries: 0 | Tx No Buf: 0
DEVICE
 Mcast addresses: 3 | Rx MBuf Sz: 4096
LINK INFO
 Speed: 10000M | Duplex: full | Flowcontrol: full
 Media state: active | Up to downs: 7 | HW assist: 514k   <<<<<<< Active
After the MTU was lowered to 1500, the CRC errors stopped and the ports remained stable.
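To compare the interface counters shown in this article before and after an MTU change, it can help to turn the "Key: value | Key: value" rows into numbers and express CRC errors per 1000 received frames, the unit used by the vifmgr.cluscheck.hwerrors message. The following is a minimal Python sketch under stated assumptions: the helper names (`parse_ifstat`, `crc_per_thousand`) are hypothetical, and the `k` and `m` suffixes are assumed to mean thousands and millions, rounded by the display.

```python
import re

# A few rows copied from the e1b counters in this article.
SAMPLE = """\
RECEIVE
 Total frames: 47775k | Frames/second: 14 | Total bytes: 5922m
 Bytes/second: 1696 | Total errors: 5018 | Errors/minute: 0   <<<<<< CRC errors
 CRC errors: 5018 | Runt frames: 0 | Fragment: 0
TRANSMIT
 Total frames: 47104k | Frames/second: 13 | Total bytes: 14245m
"""

SECTIONS = {"RECEIVE", "TRANSMIT", "DEVICE", "LINK INFO"}
# Assumption: "k" = thousands, "m" = millions.
SUFFIX = {"k": 1_000, "m": 1_000_000}


def to_number(value):
    """Convert '47775k' to 47775000; leave non-numeric values as strings."""
    match = re.fullmatch(r"(\d+)([km]?)", value)
    if not match:
        return value
    return int(match.group(1)) * SUFFIX.get(match.group(2), 1)


def parse_ifstat(text):
    """Parse counter rows into {section: {counter name: value}}."""
    stats = {}
    section = None
    for line in text.splitlines():
        line = line.split("<<<")[0].strip()  # drop the "<<<<<" annotations
        if not line:
            continue
        if line in SECTIONS:
            section = line
            stats[section] = {}
            continue
        if section is None:
            continue
        for field in line.split("|"):
            key, sep, value = field.partition(":")
            if sep:
                stats[section][key.strip()] = to_number(value.strip())
    return stats


def crc_per_thousand(stats):
    """CRC errors per 1000 received frames over the counter lifetime."""
    rx = stats["RECEIVE"]
    return 1000 * rx["CRC errors"] / rx["Total frames"]


if __name__ == "__main__":
    print(f"{crc_per_thousand(parse_ifstat(SAMPLE)):.3f} CRC errors per 1000 frames")
```

For the e1b counters this gives roughly 0.105 CRC errors per 1000 frames over the lifetime of the counters. That is a long-run average, and the interval over which the health check evaluates its 1 per 1000 threshold is not stated here, so the two figures are not directly comparable. Taking two snapshots and differencing them is more useful for confirming that errors have stopped after the MTU change.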