Node stuck in a boot loop causes the up node to reboot during an ONTAP upgrade
Applies to
AFF-A250
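ONTAP upgrade
Issue
- During an ONTAP upgrade, one node fails to reboot properly and enters a boot loop.
- The node reports an error for the card that hosts the disk shelf: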
Device Bus:23 Dev:0 Fun:0 (slot 1) failed to train at max link speed/width
- Errors are reported for the disks behind the card:
[node1:diskown.errorDuringIO:error]: error 23 (adapter error prevents command from being sent to device) on disk 1d.00.11 (S/N xxxxxxxx) while reading reservation state
[node1:raid.config.filesystem.disk.not.responding:notice]: File system Disk 1a.00.11 Shelf 0 Bay 11 [NETAPP X357_S163A3T8ATE NA54] S/N [xxxxxxxx] is not responding.
[node1:scsi.cmd.abortedByHost:error]: Unknown device 1d.00.11: Command aborted by host adapter: HA status 0x4: cdb 0x12.
- The up node reboots unexpectedly while work is done on the disks or while the affected node is being booted into ONTAP. Example:
Node node2 encountered PANIC: aggr aggr0_node2: raid volfsm, fatal multi-disk error.
- EMS logs from the up node show an SK halt:
[node2: shutdown_thread0: kern.shutdown.initiator:debug]: SK halt was initiated by "maytag.ko::shutdown_appliance_real+8270"
- The issue persists after the card and the motherboard are replaced.
- The system has been confirmed to have proper power.
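
When comparing a collected console or EMS log against the symptoms listed above, a small script can help. The following is a minimal sketch, not a NetApp tool: the function name `match_symptoms`, the `SIGNATURES` table, and its labels are hypothetical, and the patterns are taken only from the log excerpts shown in this article.

```python
import re

# Hypothetical signature table. Each pattern comes from a log excerpt quoted
# in this article; the labels on the left are illustrative names only.
SIGNATURES = {
    "pcie_link_training": re.compile(r"failed to train at max link speed/width"),
    "disk_io_adapter_error": re.compile(r"diskown\.errorDuringIO"),
    "disk_not_responding": re.compile(r"raid\.config\.filesystem\.disk\.not\.responding"),
    "command_aborted": re.compile(r"scsi\.cmd\.abortedByHost"),
    "multi_disk_panic": re.compile(r"PANIC: .*fatal multi-disk error"),
    "sk_halt": re.compile(r"SK halt was initiated by"),
}


def match_symptoms(lines):
    """Return a dict mapping each matched signature label to its log lines."""
    found = {}
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                found.setdefault(label, []).append(line.strip())
    return found
```

To use it, pass any iterable of log lines, for example an open text file of saved console output, and check which labels appear in the result. A match only indicates that the log resembles the symptoms described here; it does not confirm the root cause.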