NSM100 module of NS224 is unavailable
Applies to
- All AFF systems with NS224 disk shelves
- NSM100 shelf module
Issue
- Storage ports on both nodes that connect to the same NSM100 module go down:
[node_name-1: kernel: netif.linkDown:info]: Ethernet e5a: Link down, check cable.
[node_name-1: intr: netif.linkDown:info]: Ethernet e5a-30: Link down, check cable.
[node_name-1: scsi_cmdblk_strthr_admin: scsi.cmd.pastTimeToLive:error]: Disk device e5a.40.0.0L0: request failed after try #1: cdb 0xe2.
[node_name-2: kernel: netif.linkDown:info]: Ethernet e3b: Link down, check cable.
[node_name-2: intr: netif.linkDown:info]: Ethernet e3b-30: Link down, check cable.
[node_name-2: scsi_cmdblk_strthr_admin: scsi.cmd.pastTimeToLive:error]: Disk device e3b.40.1.12L0: request failed after try #1: cdb 0xe2.
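The link-down pattern above can be checked mechanically when scanning a large EMS log. The following is a minimal sketch, assuming the log lines follow the `[node: process: event:severity]: message` layout shown in the excerpts; the function name `link_down_ports` and both regular expressions are illustrative and not part of any ONTAP tooling.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpts above:
#   [node: process: event:severity]: message
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:\]]+): (?P<proc>[^:]+): "
    r"(?P<event>[\w.]+):(?P<sev>\w+)\]: (?P<msg>.*)$"
)
# Base port name, e.g. "e5a" from "Ethernet e5a" or "Ethernet e5a-30"
PORT = re.compile(r"Ethernet (e\d+[a-z])")


def link_down_ports(lines):
    """Return {node: set of base ports} for every netif.linkDown event."""
    ports = defaultdict(set)
    for line in lines:
        m = EMS_LINE.match(line.strip())
        if not m or m.group("event") != "netif.linkDown":
            continue  # skip non-EMS lines and other event types
        p = PORT.search(m.group("msg"))
        if p:
            ports[m.group("node")].add(p.group(1))
    return dict(ports)
```

If both nodes of the HA pair appear in the result, and the listed ports are the ones cabled to the same shelf module, the log matches the symptom described here. Which ports are cabled to which module has to come from the cabling records; the sketch does not infer it.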
- Event messages report the loss of a path to the NS224 shelf:
[node_name-1: nchmd: hm.alert.raised:alert]: Alert Id = NoPathToNSMB_Alert , Alerting Resource = 012345678910111213 raised by monitor node-connect
[node_name-1: dsa_disc: shelf.config.tomixed:notice]: System has transitioned to a mixture of single path, multi-path or quad-path storage configurations.
Or:
EventMessage = warning event on node_name-1 (Disk Shelf Multipath Not Configured)
ocumEventMessageDetails = Disk shelf has either one or no path connected to the node.
- system health alert show output:
Monitor node-connect
Alert ID NoPathToNSMA_Alert
Alerting Resource XXXXXXXXXXXXXXXXXXX
Subsystem SAS-connect
Indication Time Sun May 08 22:44:17 2022
Perceived Severity Major
Probable Cause Cable_tamper
Description Controller node_name-1 is connected only to module B of shelf 10.10 through port e3b.
Corrective Actions
1. Consult the guide applicable to your NVMe storage shelf to review cabling rules for your system.
2. Connect controller node_name-1 to module A and module B of shelf 10.10.
Possible Effect You will lose access to shelf 10.10 if module B fails.
- system node run -node node_name-1 -command sysconfig -a output, showing one path down and one "missing" module:
slot 0: 40G/100G Ethernet Controller CX5
e0c MAC Address: 00:11:22:33:44:55 (auto-100g_cr4-fd-up)
QSFP Vendor: Molex
QSFP Part Number: 112-00574
QSFP Serial Number: 99A0123456789
00.0.0: NETAPP X4011S172B3T8NTE NA56 3662.5GB 4160B/sect (S99YNA9MB00001)
...
00.0.23: NETAPP X4011S172B3T8NTE NA56 3662.5GB 4160B/sect (S99YNA9MB00024)
Shelf 0: NS224NSM100 Firmware rev. NSM100 A: 0151 B: ----
slot 0: 40G/100G Ethernet Controller CX5
e0f MAC Address: d0:39:ea:15:0b:d0 (auto-unknown-fd-down)
QSFP Vendor: Molex
QSFP Part Number: 112-00574
QSFP Serial Number: 99A0123456780
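The "missing" module shows up in the `Firmware rev.` line, where module B reports `----` instead of a firmware revision. The following is a minimal sketch that extracts this from saved `sysconfig -a` output, assuming the line layout shown above; the function name `missing_modules` and the regular expression are illustrative, and treating a run of dashes as "module not reporting" is an assumption based on this example only.

```python
import re

# Assumed layout, taken from the sysconfig excerpt above:
#   Shelf 0: NS224NSM100 Firmware rev. NSM100 A: 0151 B: ----
FW_LINE = re.compile(
    r"Shelf\s+(?P<shelf>\d+):\s+\S+\s+Firmware rev\.\s+NSM100\s+"
    r"A:\s+(?P<a>\S+)\s+B:\s+(?P<b>\S+)"
)


def missing_modules(sysconfig_text):
    """Return [(shelf_id, module_letter), ...] where firmware is shown as dashes."""
    missing = []
    for m in FW_LINE.finditer(sysconfig_text):
        for letter in ("a", "b"):
            # Assumption: a field made only of '-' means the module is not reporting
            if set(m.group(letter)) == {"-"}:
                missing.append((int(m.group("shelf")), letter.upper()))
    return missing
```

For the excerpt above, the function reports module B of shelf 0.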
- The system transitions from a multipath HA to a single-path HA configuration when one module fails:
::>storage shelf show -instance
Modules:
                               Monitor   Is         Operational Module
ID  Part No.     ES Serial No. is Active Master ... Status      Location
--- ------------ ------------- --------- ------ ... ----------- --------------
a   -            -             true      false  ... error       rear of the shelf at the top, on module A
b   111-04256+B1 xxxxxxxxxxxx  true      true   ... normal      rear of the shelf at the bottom, on module B
- Sensor errors:
ses.status.dimm.error: NS224NSM100 (S/N XXXXXXXXXXXXXXX) shelf 1 on channel 0x DIMM failure for Dimm Element 3: not installed or failed. This element is on the DIMM slot 3 in the top shelf module (A).
ses.status.battery.error: NS224NSM100 (S/N XXXXXXXXXXXXXXX) shelf 1 on channel 0x battery failure error for Coin Battery 1: not installed or hardware failure. This element is on the rear of the shelf, in top module (A).
ses.status.electronicsError: NS224NSM100 (S/N XXXXXXXXXXXXXXX) shelf 1 on channel 0x environmental monitoring error for SES electronics 1 : not available ; enclosure services hardware failed This element is on the rear of the shelf at the top, on module A.
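All three sensor errors above name the same module (A), which is what points to the module itself rather than to an individual component. The following is a minimal sketch for tallying the module named in each `ses.status.*` event, assuming the message wording shown above; the function name `modules_in_ses_errors` and both patterns are illustrative and may not cover other SES message variants.

```python
import re
from collections import Counter

# One event runs from its "ses.status.<name>:" prefix up to the next such prefix,
# so messages wrapped across several lines are still handled as one event.
EVENT = re.compile(r"(ses\.status\.[\w.]+):(.*?)(?=ses\.status\.|\Z)", re.S)
# Matches "module (A)" as well as "module A"
MODULE = re.compile(r"module \(?([AB])\)?")


def modules_in_ses_errors(text):
    """Count how many ses.status events name module A or module B."""
    counts = Counter()
    for _name, body in EVENT.findall(text):
        # Collapse line wraps before searching for the module letter
        m = MODULE.search(" ".join(body.split()))
        if m:
            counts[m.group(1)] += 1
    return counts
```

When every event in the tally names a single module, as in the messages above, the result is consistent with the unavailable module reported by the other outputs in this article.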