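Core blade in slot 8 is faulty because the blade is DOA (dead on arrival)
Applies to
- Brocade X7-8 switch
Issue
- During switch installation, the blade in slot 8 does not come online.
- The core blade in slot 8 fails to power on and is in a faulty state.
- HA is synchronized and the switch is online.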
/fabos/cliexec/hadump:
---------------------------------------
TIME_STAMP: Dec 19 07:56:28.729073
---------------------------------------
Local CP (Slot 1, CP0): Active, Cold Recovered
Remote CP (Slot 2, CP1): Standby, Healthy
HA enabled, Heartbeat Up, HA State synchronized
- Rebooting the switch and physically reseating the blade in slot 8 were both attempted, but the issue persists.
- The following events were reported during installation:
Switch:FID128:admin> slotshow
Slot  Blade Type   ID    Status
-----------------------------------
   1  CP BLADE     220   ENABLED
   2  CP BLADE     220   ENABLED
   3  SW BLADE     218   ENABLED
   4  SW BLADE     218   ENABLED
   5  SW BLADE     218   ENABLED
   6  SW BLADE     218   ENABLED
   7  CORE BLADE   215   ENABLED
   8  UNKNOWN            VACANT
   9  SW BLADE     218   ENABLED
  10  SW BLADE     218   ENABLED
  11  SW BLADE     218   ENABLED
  12  SW BLADE     218   ENABLED
DIAG: Command /fabos/cliexec/forceerror invalid on Slot 8
2024/12/18-22:29:04 (GMT), [EM-1004], 76, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/18-22:29:04 (GMT), [RAS-1001], 77, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, First failure data capture (FFDC) event occurred.
2024/12/18-22:29:04 (GMT), [EM-1069], 78, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered off.
Switch:FID128:admin> ipaddrset -cp 0slotshow
Slot  Blade Type   ID    Status
-----------------------------------
   1  CP BLADE     220   ENABLED
   2  CP BLADE     220   ENABLED
   3  SW BLADE     218   ENABLED
   4  SW BLADE     218   ENABLED
   5  SW BLADE     218   ENABLED
   6  SW BLADE     218   ENABLED
   7  CORE BLADE   215   ENABLED
   8  CORE BLADE   215   FAULTY (3)
   9  SW BLADE     218   ENABLED
  10  SW BLADE     218   ENABLED
  11  SW BLADE     218   ENABLED
  12  SW BLADE     218   ENABLED
- The following events are reported in errdump:
2024/12/18-22:19:50 (GMT), [EM-2003], 28, SLOT 1 | CHASSIS, ERROR, Brocade_X7-8, Slot 8 has failed the POST tests. FRU is being faulted.
2024/12/18-22:19:58 (GMT), [MAPS-1021], 33, SLOT 1 | FID 128, WARNING, Switch, RuleName=defCHASSISHA_SYNC_0, Condition=CHASSIS(HA_SYNC/NONE==0), Obj:Chassis [HA_SYNC,0] has contributed to switch status MARGINAL.
2024/12/18-22:19:58 (GMT), [MAPS-1020], 34, SLOT 1 | FID 128, WARNING, Switch, Switch wide status has changed from HEALTHY to MARGINAL.
2024/12/18-22:20:00 (GMT), [MAPS-1021], 35, SLOT 1 | FID 128, WARNING, Switch, RuleName=defCHASSISFAULTY_BLADE_1, Condition=CHASSIS(FAULTY_BLADE/NONE>=1), Obj:Chassis [FAULTY_BLADE,1] has contributed to switch status MARGINAL.
2024/12/18-22:20:00 (GMT), [MAPS-1021], 36, SLOT 1 | FID 128, WARNING, Switch, RuleName=defCHASSISDOWN_CORE_1, Condition=CHASSIS(DOWN_CORE/NONE>=1), Obj:Chassis [DOWN_CORE,1] has contributed to switch status MARGINAL.
2024/12/18-22:25:25 (GMT), [EM-1050], 69, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 removal detected.
2024/12/18-22:28:31 (GMT), [EM-1029], 70, SLOT 1 | CHASSIS, WARNING, Brocade_X7-8, Slot 8, a problem occurred accessing a device on the I2C bus (-4). Operational status (40) not changed, access is being retried.
2024/12/18-22:28:31 (GMT), [EM-1029], 71, SLOT 1 | CHASSIS, WARNING, Brocade_X7-8, Slot 8, a problem occurred accessing a device on the I2C bus (-4). Operational status (40) not changed, access is being retried.
2024/12/18-22:28:32 (GMT), [EM-1049], 72, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 insertion detected.
2024/12/18-22:28:32 (GMT), [EM-1050], 73, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 removal detected.
2024/12/18-22:28:52 (GMT), [EM-1049], 74, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, FRU Slot 8 insertion detected.
2024/12/18-22:28:52 (GMT), [EM-1070], 75, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered on.
2024/12/18-22:29:04 (GMT), [EM-1004], 76, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/18-22:29:04 (GMT), [EM-1069], 78, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered off.
2024/12/19-07:45:59 (GMT), [EM-1223], 92, SLOT 1 | FID 128, INFO, FNS1940V023, Slot 8 has been powered on.
2024/12/19-07:46:00 (GMT), [EM-1070], 93, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered on.
2024/12/19-07:46:12 (GMT), [EM-1004], 94, SLOT 1 | FFDC | CHASSIS, CRITICAL, Brocade_X7-8, Slot 8 failed to power on.
2024/12/19-07:46:12 (GMT), [RAS-1001], 95, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, First failure data capture (FFDC) event occurred.
2024/12/19-07:46:12 (GMT), [EM-1069], 96, SLOT 1 | CHASSIS, INFO, Brocade_X7-8, Slot 8 is being powered off.
- To isolate the issue further, perform the following steps:
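- Physically reseat the blade in slot 8, then reboot the switch.
- If the issue persists, make sure the blade is seated correctly in the chassis. The only way to check this is to place a flat object, such as a metal-edged ruler, against the front of the chassis.
- If any blade is not completely flush, or does not lie flat against the chassis face, that blade is not seated correctly.
- The field engineer needs to check the top, middle, and bottom of the chassis front to make sure the blades are completely flat. If the ruler shows any sign that the blades are not all flush, a blade is not seated correctly.
- In that case, power off the chassis, then carefully remove and reinstall the blade so that it sits completely flush with the front.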
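
When reviewing a saved errdump capture for this symptom, it can help to filter out the ERROR and CRITICAL events for one slot before and after each reseat attempt. The following is a minimal sketch in Python, not a vendor tool. It assumes the log lines have the same comma-separated layout as the events quoted in this article; the names EVENT_RE, parse_events, slot_failures, and SAMPLE are hypothetical and exist only for illustration.

```python
import re

# Assumed layout, taken from the events quoted in this article:
# timestamp (GMT), [MSG-ID], sequence, flags, severity, name, message text
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}) \(GMT\), "
    r"\[(?P<msg_id>[A-Z]+-\d+)\], (?P<seq>\d+), "
    r"(?P<flags>[^,]+), (?P<severity>[A-Z]+), "
    r"(?P<name>[^,]+), (?P<text>.*)$"
)


def parse_events(lines):
    """Return one dict per line that matches the assumed event layout."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def slot_failures(events, slot):
    """Keep only ERROR or CRITICAL events whose text mentions the given slot."""
    slot_pattern = re.compile(rf"\bSlot {slot}\b")
    return [
        event
        for event in events
        if event["severity"] in {"ERROR", "CRITICAL"}
        and slot_pattern.search(event["text"])
    ]


# Sample lines copied from the events shown in this article.
SAMPLE = [
    "2024/12/18-22:19:50 (GMT), [EM-2003], 28, SLOT 1 | CHASSIS, ERROR, "
    "Brocade_X7-8, Slot 8 has failed the POST tests. FRU is being faulted.",
    "2024/12/18-22:19:58 (GMT), [MAPS-1020], 34, SLOT 1 | FID 128, WARNING, "
    "Switch, Switch wide status has changed from HEALTHY to MARGINAL.",
    "2024/12/18-22:28:52 (GMT), [EM-1070], 75, SLOT 1 | CHASSIS, INFO, "
    "Brocade_X7-8, Slot 8 is being powered on.",
    "2024/12/18-22:29:04 (GMT), [EM-1004], 76, SLOT 1 | FFDC | CHASSIS, CRITICAL, "
    "Brocade_X7-8, Slot 8 failed to power on.",
    "2024/12/18-22:29:04 (GMT), [EM-1069], 78, SLOT 1 | CHASSIS, INFO, "
    "Brocade_X7-8, Slot 8 is being powered off.",
]

failures = slot_failures(parse_events(SAMPLE), 8)
for event in failures:
    print(event["ts"], event["msg_id"], event["severity"], event["text"])
```

If the same EM-1004 power-on failure keeps appearing for slot 8 after the blade has been reseated and confirmed flush, that pattern is consistent with the DOA condition described in this article.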