FAS2xxx ports e0a, e0b, e0M, and e0P are missing and "Invalid PCIe device detected" is displayed
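Applies to
- FAS25xx, FAS22xx, FAS2750
- Quad Gigabit Ethernet Controller 82580
Issue
- The node reboots without displaying a panic message. The following messages may be logged before the shutdown: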
Mon Jul 5 09:13:39 CEST [node_name:netif.hangDetected:warning]: Network interface e0b hung (PCIe RcvMstAdt). Resetting to recover. Driver: igb.
Mon Jul 5 09:22:13 CEST [node_name:netif.hangDetected:warning]: Network interface e0a hung (PCIe RcvMstAdt). Resetting to recover. Driver: igb.
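- The node stays up, but stops responding on the network through ports e0a, e0b, and e0M.
- From the console logs (the system log from the SP, or the boot log captured from a console connection):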
Mar 11 08:10:05 [XXX:pvif.allLinksDown:EMERGENCY]: ifgrp a0a: All links are down
kill: 85241: No such process
Terminated
.
Uptime: 311d17h23m44s
HALT: HA partner has taken over (ic) on Wed Mar 11 11:54:45 CET 2020
ugen0.2: <Micron Technology> at usbus0 (disconnected)
System rebooting... <==== the system rebooted, but didn't panic
================ Log #1 end time Wed Mar 11 10:54:44 2020
================ Log #2 start time Wed Mar 11 10:55:10 2020
Invalid PCIe device detected below PCIe Root Port(Bus/Dev/Func): 00/1C/00 <== the BIOS is not able to recognize some components
Actual Vendor ID and Device ID:FFFF/FFFF
Expected Vendor ID and Device ID:8086/150E
Mezzanine Card ID(02 - 10GbE, 03 - FC, 07 - No Dev, others - Resv):07
BIOS is resetting system...
- If the node is able to boot, ports e0a, e0b, e0M, and e0P are either missing entirely or reported as Hardware Initialization Failed in the sysconfig output:
slot 0: Internal 10/100/1000 Ethernet Switch Status: Unknown
slot 0: Quad Gigabit Ethernet Controller 82580
e0a MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)
e0b MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)
e0M MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)
e0P MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)
slot 0: Interconnect HBA: Mellanox IB MT25204
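
When reviewing saved console logs or sysconfig output from several nodes, it can help to scan for the two signatures shown above automatically. The following is a minimal Python sketch, not a NetApp tool. The function names and regular expressions are assumptions derived only from the sample output in this article, and the exact wording of these messages may differ between BIOS and ONTAP versions.

```python
import re

# Assumed format, taken from the sample sysconfig output in this article:
#   e0a MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)
FAILED_PORT = re.compile(
    r"^\s*(?P<port>e0\w+)\s+MAC Address:\s+(?P<mac>[0-9A-Fa-f:]{17})\s+"
    r"\(Hardware Initialization Failed: (?P<detail>[^)]*)\)"
)

# Assumed format, taken from the sample BIOS message in this article:
#   Invalid PCIe device detected below PCIe Root Port(Bus/Dev/Func): 00/1C/00
INVALID_PCIE = re.compile(
    r"Invalid PCIe device detected below PCIe Root Port\(Bus/Dev/Func\):\s*"
    r"(?P<bdf>[0-9A-Fa-f]{2}/[0-9A-Fa-f]{2}/[0-9A-Fa-f]{2})"
)


def find_failed_ports(sysconfig_text):
    """Return (port, detail) pairs for ports reported as Hardware Initialization Failed."""
    hits = []
    for line in sysconfig_text.splitlines():
        match = FAILED_PORT.match(line)
        if match:
            hits.append((match.group("port"), match.group("detail")))
    return hits


def find_invalid_pcie(console_text):
    """Return the Bus/Dev/Func string of every 'Invalid PCIe device detected' message."""
    return [match.group("bdf") for match in INVALID_PCIE.finditer(console_text)]


if __name__ == "__main__":
    sysconfig_sample = (
        "slot 0: Quad Gigabit Ethernet Controller 82580\n"
        "e0a MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)\n"
        "e0b MAC Address: 00:00:00:00:00:00 (Hardware Initialization Failed: IGB: 3)\n"
    )
    console_sample = (
        "Invalid PCIe device detected below PCIe Root Port(Bus/Dev/Func): 00/1C/00\n"
        "Actual Vendor ID and Device ID:FFFF/FFFF\n"
    )
    print(find_failed_ports(sysconfig_sample))
    print(find_invalid_pcie(console_sample))
```

A non-empty result from either function only indicates that the text matches the symptoms described in this article. It does not confirm the root cause.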