Core blade in slot 8 is faulted because the blade is DOA (dead on arrival)
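Applies to
- Brocade X7-8 switch
Issue
- During switch installation, the blade in slot 8 did not come online.
- The core blade in slot 8 fails to boot and is in a faulty state.
- HA is in sync and the switch is online.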
/fabos/cliexec/hadump:
---------------------------------------
TIME_STAMP: Dec 19 07:56:28.729073
---------------------------------------
Local CP (Slot 1, CP0): Active, Cold Recovered
Remote CP (Slot 2, CP1): Standby, Healthy
HA enabled, Heartbeat Up, HA State synchronized
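- Rebooting the switch and physically reseating the blade in slot 8 were attempted, but the issue persisted.
- The following events were reported during installation: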
Switch:FID128:admin> slotshow
Slot Blade Type ID Status
-----------------------------------
1 CP BLADE 220 ENABLED
2 CP BLADE 220 ENABLED
3 SW BLADE 218 ENABLED
4 SW BLADE 218 ENABLED
5 SW BLADE 218 ENABLED
6 SW BLADE 218 ENABLED
7 CORE BLADE 215 ENABLED
8 UNKNOWN VACANT
9 SW BLADE 218 ENABLED
10 SW BLADE 218 ENABLED
11 SW BLADE 218 ENABLED
12 SW BLADE 218 ENABLED
DIAG: Command /fabos/cliexec/forceerror invalid on Slot 8
2024/12/18-22:29:04 (GMT), [EM-1004], 76, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/18-22:29:04 (GMT), [RAS-1001], 77, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, First failure data capture (FFDC) event occurred.
2024/12/18-22:29:04 (GMT), [EM-1069], 78, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered off.
Switch:FID128:admin> slotshow
Slot Blade Type ID Status
-----------------------------------
1 CP BLADE 220 ENABLED
2 CP BLADE 220 ENABLED
3 SW BLADE 218 ENABLED
4 SW BLADE 218 ENABLED
5 SW BLADE 218 ENABLED
6 SW BLADE 218 ENABLED
7 CORE BLADE 215 ENABLED
8 CORE BLADE 215 FAULTY (3)
9 SW BLADE 218 ENABLED
10 SW BLADE 218 ENABLED
11 SW BLADE 218 ENABLED
12 SW BLADE 218 ENABLED
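When several chassis are being installed, it can help to scan saved slotshow output for slots that are not ENABLED rather than reading each table by eye. The following is a minimal sketch, assuming the simplified column layout shown in this article (slot, blade type, optional ID, status). The names `parse_slotshow` and `not_enabled` are hypothetical helpers, not part of Fabric OS, and real slotshow output may carry additional columns or status strings that this pattern does not cover.

```python
import re

# Matches rows such as "8 CORE BLADE 215 FAULTY (3)" or "8 UNKNOWN VACANT".
# The ID column is optional because the vacant slot in this article shows none.
# Assumption: blade types and statuses are upper-case words, as in the sample.
ROW_RE = re.compile(
    r"^\s*(?P<slot>\d+)\s+"
    r"(?P<type>[A-Z ]+?)\s+"
    r"(?:(?P<id>\d+)\s+)?"
    r"(?P<status>[A-Z_]+(?:\s*\(\d+\))?)\s*$"
)


def parse_slotshow(text):
    """Return one dict per slot row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            rows.append({
                "slot": int(m["slot"]),
                "type": m["type"].strip(),
                "id": int(m["id"]) if m["id"] else None,
                "status": m["status"],
            })
    return rows


def not_enabled(rows):
    """Return the rows whose status is anything other than ENABLED."""
    return [r for r in rows if r["status"] != "ENABLED"]


# Excerpt of the table above, used only as sample input.
SAMPLE = """\
Slot Blade Type ID Status
-----------------------------------
7 CORE BLADE 215 ENABLED
8 CORE BLADE 215 FAULTY (3)
9 SW BLADE 218 ENABLED
"""

if __name__ == "__main__":
    for row in not_enabled(parse_slotshow(SAMPLE)):
        print(f"slot {row['slot']}: {row['type']} is {row['status']}")
```

A slot reported as VACANT is also returned by `not_enabled`, which is intentional here: in this case slot 8 first appeared as VACANT even though a blade was physically installed.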
- The following events are reported in errdump:
2024/12/18-22:19:50 (GMT), [EM-2003], 28, SLOT 1 | CHASSIS, ERROR, Brocade_X7-8, Slot 8 has failed the POST tests. FRU is being faulted.
2024/12/18-22:19:58 (GMT), [MAPS-1021], 33, SLOT 1 | FID 128, WARNING, Switch, RuleName=defCHASSISHA_SYNC_0, Condition=CHASSIS(HA_SYNC/NONE==0), Obj:Chassis [HA_SYNC,0] has contributed to switch status MARGINAL.
2024/12/18-22:19:58 (GMT), [MAPS-1020], 34, SLOT 1 | FID 128, WARNING, Switch, Switch wide status has changed from HEALTHY to MARGINAL.
2024/12/18-22:20:00 (GMT), [MAPS-1021], 35, SLOT 1 | FID 128, WARNING, Switch, RuleName=defCHASSISFAULTY_BLADE_1, Condition=CHASSIS(FAULTY_BLADE/NONE>=1), Obj:Chassis [FAULTY_BLADE,1] has contributed to switch status MARGINAL.
2024/12/18-22:20:00 (GMT), [MAPS-1021], 36, SLOT 1 | FID 128, WARNING, Switch, RuleName=defCHASSISDOWN_CORE_1, Condition=CHASSIS(DOWN_CORE/NONE>=1), Obj:Chassis [DOWN_CORE,1] has contributed to switch status MARGINAL.
2024/12/18-22:25:25 (GMT), [EM-1050], 69, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 removal detected.
2024/12/18-22:28:31 (GMT), [EM-1029], 70, SLOT 1 | CHASSIS, WARNING, Brocade_X7-8, Slot 8, a problem occurred accessing a device on the I2C bus (-4). Operational status (40) not changed, access is being retried.
2024/12/18-22:28:31 (GMT), [EM-1029], 71, SLOT 1 | CHASSIS, WARNING, Brocade_X7-8, Slot 8, a problem occurred accessing a device on the I2C bus (-4). Operational status (40) not changed, access is being retried.
2024/12/18-22:28:32 (GMT), [EM-1049], 72, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 insertion detected.
2024/12/18-22:28:32 (GMT), [EM-1050], 73, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 removal detected.
2024/12/18-22:28:52 (GMT), [EM-1049], 74, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 insertion detected.
2024/12/18-22:28:52 (GMT), [EM-1070], 75, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered on.
2024/12/18-22:29:04 (GMT), [EM-1004], 76, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/18-22:29:04 (GMT), [EM-1069], 78, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered off.
2024/12/19-07:45:59 (GMT), [EM-1223], 92, SLOT 1 | FID 128, INFO, FNS1940V023, Slot 8 has been powered on.
2024/12/19-07:46:00 (GMT), [EM-1070], 93, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered on.
2024/12/19-07:46:12 (GMT), [EM-1004], 94, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/19-07:46:12 (GMT), [RAS-1001], 95, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, First failure data capture (FFDC) event occurred.
2024/12/19-07:46:12 (GMT), [EM-1069], 96, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered off.
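The errdump excerpt above interleaves slot 8 events with unrelated MAPS and RAS messages. The sketch below shows one way to pull out only the events that mention a given slot and to count them by message ID, so that a repeated power-on failure (EM-1070 followed by EM-1004) stands out. It assumes the comma-separated field order seen in this article (timestamp, message ID, sequence number, origin, severity, switch name, message text). `parse_errdump` and `events_for_slot` are hypothetical helper names, and the pattern tolerates an optional `Line N:` prefix such as a text-search tool might add.

```python
import re
from collections import Counter
from datetime import datetime

# Field order is an assumption based on the log lines quoted in this article.
EVENT_RE = re.compile(
    r"^(?:Line \d+:\s*)?"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}) \(GMT\), "
    r"\[(?P<msgid>[A-Z]+-\d+)\], "
    r"(?P<seq>\d+), "
    r"(?P<origin>[^,]+), "
    r"(?P<severity>[A-Z]+), "
    r"(?P<name>[^,]+), "
    r"(?P<text>.*)$"
)


def parse_errdump(text):
    """Parse errdump-style lines into dicts; lines that do not match are skipped."""
    events = []
    for line in text.splitlines():
        m = EVENT_RE.match(line.strip())
        if m:
            ev = m.groupdict()
            ev["ts"] = datetime.strptime(ev["ts"], "%Y/%m/%d-%H:%M:%S")
            ev["seq"] = int(ev["seq"])
            events.append(ev)
    return events


def events_for_slot(events, slot):
    """Keep only events whose message text mentions 'Slot <slot>'."""
    pattern = re.compile(rf"\bSlot {slot}\b")
    return [e for e in events if pattern.search(e["text"])]


# Three lines copied from the excerpt above, used only as sample input.
SAMPLE = """\
2024/12/18-22:28:52 (GMT), [EM-1070], 75, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered on.
2024/12/18-22:29:04 (GMT), [EM-1004], 76, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/18-22:29:04 (GMT), [RAS-1001], 77, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, First failure data capture (FFDC) event occurred.
"""

if __name__ == "__main__":
    slot_events = events_for_slot(parse_errdump(SAMPLE), 8)
    print(Counter(e["msgid"] for e in slot_events))
    for e in slot_events:
        if e["severity"] == "CRITICAL":
            print(e["ts"].isoformat(), e["msgid"], e["text"])
```

Filtering on the message text rather than the origin field is a deliberate choice: the origin (`SLOT 1`) identifies the CP that logged the event, not the blade the event is about.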
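- To isolate the issue further, perform the following steps:
- Physically reseat the blade in slot 8, then reboot the switch.
- If the issue persists, make sure the blade is seated correctly in the chassis. The only way to check this is to lay a flat object, such as a metal straightedge ruler, across the front of the chassis.
- If any blade is not completely flush with the chassis face, that blade is not seated correctly.
- The field engineer needs to check the top, middle, and bottom of the chassis front to make sure the blades are completely flush. If the ruler shows any sign that the blades are not all flush, a blade is not seated correctly.
- In that case, power off the chassis, then carefully remove and reinstall the blade so that it sits completely flush with the front.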