Continuous raid.disk.replace.job.failed messages on systems with internal storage
Applies to
- FAS2820
- FAS2720
- FAS2750
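- Fully populated internal disk shelf (12 disks)
- Fully populated external disk shelf DS212-C (12 disks)
- Data aggregates that use shared disks from both the internal and external shelves
- No spare drives available in the internal shelf
Issue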
- Both nodes report raid.disk.replace.job.failed messages while trying to use the same internal spare disk:
[node_name-01: mgwd: raid.disk.replace.job.start:debug]: Starting disk replacement of disk 2.2.7 with disk 2.1.10.
[node_name-01: mgwd: raid.disk.replace.job.failed:debug]: Failed to replace disk 2.2.7 with disk 2.1.10. Reason: Reserve: Unable to reserve disks on target node node_name-02.
[node_name-01: mgwd: mgmtgwd.jobmgr.jobcomplete.failure:debug]: Job "Disk Replace:2.2.7" [id 55] (Disk Replace) completed unsuccessfully: Reserve: Unable to reserve disks on target node node_name-02 (1).
[node_name-01: mgwd: raid.disk.replace.job.start:debug]: Starting disk replacement of disk 2.2.7 with disk 2.1.10.
[node_name-02: config_thread: raid.rg.spares.low:debug]: /data_aggr-2/plex0/rg0
[node_name-02: config_thread: callhome.spares.low:debug]: Call home for SPARES_LOW: /data_aggr-2/plex0/rg0
[node_name-02: mgwd: raid.disk.replace.job.start:notice]: Starting disk replacement of disk 2.2.8 with disk 2.1.10.
[node_name-02: mgwd: raid.disk.replace.job.failed:error]: Failed to replace disk 2.2.8 with disk 2.1.10. Reason: Reserve: Unable to reserve disks on target node node_name-02.
[node_name-02: mgwd: mgmtgwd.jobmgr.jobcomplete.failure:info]: Job "Disk Replace:2.2.8" [id 54] (Disk Replace) completed unsuccessfully: Reserve: Unable to reserve disks on target node node_name-02 (1).
[node_name-02: mgwd: raid.disk.replace.job.start:notice]: Starting disk replacement of disk 2.2.8 with disk 2.1.10.
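The pattern in the messages above, where more than one node fails to reserve the same replacement disk, can be searched for in a saved copy of the EMS log. The following Python snippet is a minimal sketch, assuming the log lines have exactly the format shown above. The function name `contended_replacements` is hypothetical and is not part of ONTAP.

```python
import re
from collections import defaultdict

# Matches EMS lines of the form shown above, for example:
# [node_name-01: mgwd: raid.disk.replace.job.failed:debug]: Failed to replace
# disk 2.2.7 with disk 2.1.10. Reason: ...
FAILED_RE = re.compile(
    r"\[(?P<node>[^:\]]+): [^:]+: raid\.disk\.replace\.job\.failed:[^\]]*\]: "
    r"Failed to replace disk (?P<source>\S+) with disk (?P<target>[\d.]+)\."
)


def contended_replacements(lines):
    """Map each replacement disk to the nodes that failed to reserve it.

    Only replacement disks reported by more than one node are returned.
    """
    nodes_by_target = defaultdict(set)
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            nodes_by_target[match["target"]].add(match["node"])
    return {
        target: sorted(nodes)
        for target, nodes in nodes_by_target.items()
        if len(nodes) > 1
    }
```

Passing the lines of an exported log file to `contended_replacements` returns a dictionary keyed by replacement disk. A non-empty result suggests that the nodes are competing for the same spare.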
- The RAID LM logs on both nodes continuously report:
...: ksmfContainerDiskReplace run -disk 2.2.7 -allow-mixing true -reason layout_optimization -replacement 2.1.10: succeeded
...: 2.2.7 needs to be copied to an internal disk.
...: 2.2.1 needs to be copied to an internal disk.
...: 2.2.5 needs to be copied to an internal disk.
...: 2.2.9 needs to be copied to an internal disk.
...: 2.2.3 needs to be copied to an internal disk.
...: ksmfContainerDiskReplace run -disk 2.2.7 -allow-mixing true -reason layout_optimization -replacement 2.1.10: succeeded
...: ksmfContainerDiskReplace run -disk 2.2.8 -allow-mixing true -reason layout_optimization -replacement 2.1.10: succeeded
...: 2.2.8 needs to be copied to an internal disk.
...: 2.2.2 needs to be copied to an internal disk.
...: 2.2.0 needs to be copied to an internal disk.
...: 2.2.6 needs to be copied to an internal disk.
...: 2.2.4 needs to be copied to an internal disk.
...: 2.2.10 needs to be copied to an internal disk.
...: ksmfContainerDiskReplace run -disk 2.2.8 -allow-mixing true -reason layout_optimization -replacement 2.1.10: succeeded
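To see how many disks are queued for a copy and how many distinct replacement disks are being requested, the RAID LM lines above can be summarized with a short script. This is a minimal sketch that assumes the message text matches the excerpt above. The line prefixes are elided in the excerpt and are ignored here, and `summarize_layout_optimization` is a hypothetical helper name.

```python
import re

# "<disk> needs to be copied to an internal disk."
COPY_RE = re.compile(r"(\d+\.\d+\.\d+) needs to be copied to an internal disk")

# "ksmfContainerDiskReplace run -disk <source> ... -replacement <target>: succeeded"
REPLACE_RE = re.compile(
    r"ksmfContainerDiskReplace run -disk (\S+) .*-replacement (\S+?):"
)


def summarize_layout_optimization(lines):
    """Return (disks awaiting a copy, replacement disks requested) as two sets."""
    pending = set()
    targets = set()
    for line in lines:
        copy_match = COPY_RE.search(line)
        if copy_match:
            pending.add(copy_match.group(1))
            continue
        replace_match = REPLACE_RE.search(line)
        if replace_match:
            targets.add(replace_match.group(2))
    return pending, targets
```

If the first set holds many disks while the second holds a single disk, every pending copy is targeting the same spare, which matches the behavior described in this article.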
- Both nodes try to take ownership of the same drive:
[node_name-01: sanown_io: diskown_changingOwner_1:debug]: params: {'diskname': '0c.01.10P1', 'serialno': '0AB12C3DEP001', 'oldownername': 'node_name-01', 'oldownerid': '111111111', 'oldhomeownerid': '111111111', 'olddrhomeownerid': '222222222', 'newownername': 'node_name-02', 'newownerid': '538302348', 'newhomeownerid': '538302348', 'newdrhomeownerid': '222222222', 'thread': 'svc_queue_thread', 'APIname': 'zapi_disk_sanown_assign'}
[node_name-01: sanown_io: diskown_changingOwner_1:debug]: params: {'diskname': '0c.01.10P1', 'serialno': '0AB12C3DEP001', 'oldownername': 'node_name-01', 'oldownerid': '111111111', 'oldhomeownerid': '111111111', 'olddrhomeownerid': '222222222', 'newownername': 'node_name-02', 'newownerid': '538302348', 'newhomeownerid': '538302348', 'newdrhomeownerid': '222222222', 'thread': 'svc_queue_thread', 'APIname': 'zapi_disk_sanown_assign'}
[node_name-02: sanown_io: diskown_changingOwner_1:notice]: params: {'newdrhomeownerid': '222222222', 'diskname': '0c.01.10P1', 'APIname': 'zapi_disk_sanown_assign', 'thread': 'svc_queue_thread', 'serialno': '0AB12C3DEP001', 'oldhomeownerid': '538302348', 'newownername': 'node_name-01', 'newownerid': '111111111', 'oldownerid': '538302348', 'oldownername': 'node_name-02', 'newhomeownerid': '111111111', 'olddrhomeownerid': '222222222'}
[node_name-02: sanown_io: diskown_changingOwner_1:notice]: params: {'newdrhomeownerid': '222222222', 'diskname': '0c.01.10P1', 'APIname': 'zapi_disk_sanown_assign', 'thread': 'svc_queue_thread', 'serialno': '0AB12C3DEP001', 'oldhomeownerid': '538302348', 'newownername': 'node_name-01', 'newownerid': '111111111', 'oldownerid': '538302348', 'oldownername': 'node_name-02', 'newhomeownerid': '111111111', 'olddrhomeownerid': '222222222'}
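The ownership changes above move the same partition back and forth between the two nodes. The following Python snippet is a minimal sketch for spotting that back-and-forth pattern in a saved log, assuming the `params:` payload is a Python-style dictionary literal as in the excerpt above (real logs may be formatted differently). The function name `ownership_flips` is hypothetical.

```python
import ast


def ownership_flips(lines):
    """Return serial numbers whose owner changed in both directions
    between the same pair of nodes."""
    moves_by_serial = {}
    flipping = set()
    for line in lines:
        # Everything after "params: " is expected to be a dict literal.
        _, found, payload = line.partition("params: ")
        if not found:
            continue
        params = ast.literal_eval(payload.strip())
        serial = params["serialno"]
        old_owner = params["oldownername"]
        new_owner = params["newownername"]
        moves = moves_by_serial.setdefault(serial, set())
        # A reverse move already recorded means ownership is bouncing.
        if (new_owner, old_owner) in moves:
            flipping.add(serial)
        moves.add((old_owner, new_owner))
    return flipping
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for log content. It raises an exception on a malformed payload, which a production script would need to handle.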
- sysconfig -r shows ownership of the affected disk partitions that differs from the output of storage disk show -partition-ownership.
sysconfig -r output on node_name-01:
Pool0 spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block checksum
...
spare 0b.01.10P2 0b 1 11 SA:A 0 FSAS 7200 63849/130764288 63857/130780672 (fast zeroed)
spare 0b.01.10P1 0b 1 11 SA:A 0 FSAS 7200 3743930/7667569664 3743938/7667586048 (fast zeroed)
storage disk show -partition-ownership output:
::> storage disk show -partition-ownership -disk 2.1.10
Disk Partition Home Owner Home ID Owner ID
-------- --------- ----------------- ----------------- ----------- -----------
2.1.10 Container node_name-01 node_name-01 5xxxxxxx1 5xxxxxxx1
Root node_name-01 node_name-01 5xxxxxxx1 5xxxxxxx1
Data node_name-02 node_name-02 5xxxxxxx2 5xxxxxxx2