Why does an aggregate go offline after multiple disk failures?

Category:
fas-systems
Specialty:
hw

Applies to

  • ONTAP 9
  • AFF
  • FAS

Answer

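The aggregate goes offline when the number of failed disks in a single RAID group exceeds the fault tolerance of its RAID type:
  • RAID4: 2 or more failed disks
  • RAID-DP: 3 or more failed disks
  • RAID-TEC: 4 or more failed disks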
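
These thresholds follow from the number of parity disks in each RAID group: one for RAID4, two for RAID-DP, three for RAID-TEC. The following is a minimal Python sketch of that rule. The function name, the dictionary keys, and the returned labels are illustrative assumptions and are not ONTAP output or an ONTAP API.

```python
# Parity disks per RAID group for each RAID type. A group survives at most
# this many concurrent disk failures.
PARITY_DISKS = {"raid4": 1, "raid_dp": 2, "raid_tec": 3}


def raid_group_outcome(raid_type: str, failed_disks: int) -> str:
    """Classify a RAID group by its number of failed disks (hypothetical helper)."""
    if failed_disks < 0:
        raise ValueError("failed_disks must be non-negative")
    tolerated = PARITY_DISKS[raid_type]
    if failed_disks == 0:
        return "normal"
    if failed_disks <= tolerated:
        # Still within parity protection: data remains available.
        return "degraded"
    # More failures than parity disks: the group, and with it the
    # aggregate, can no longer serve data.
    return "failed"
```

For example, `raid_group_outcome("raid_dp", 2)` returns `"degraded"`, while `raid_group_outcome("raid_dp", 3)` returns `"failed"`.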

RAID-DP example:

Cluster::> run -node Node-01 sysconfig -r

  • Before:
RAID group Aggr1/plex0/rg1 (normal, block checksums)

RAID Disk    Device      HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------    ------      ------------- ---- ---- ---- ----- --------------    --------------
dparity     0a.12.16    0a    12  16  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
parity      0a.12.17    0a    12  17  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
...
data        0a.11.6     0a    11  6   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.7     0a    11  7   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.8     0a    11  8   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.9     0a    11  9   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.10    0a    11  10  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.19    0a    11  19  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.12    0a    11  12  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.13    0a    11  13  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.14    0a    11  14  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.15    0a    11  15  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
...
  • Two disks failed:
RAID group Aggr1/plex0/rg1 (double degraded, block checksums)

RAID Disk    Device      HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------    ------      ------------- ---- ---- ---- ----- --------------    --------------
dparity     0a.12.16    0a    12  16  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
parity      0a.12.17    0a    12  17  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
...
data        0a.11.6     0a    11  6   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data   FAILED          N/A                        857000/ -
data        0a.11.8     0a    11  8   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.9     0a    11  9   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.10    0a    11  10  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.19    0a    11  19  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.12    0a    11  12  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.13    0a    11  13  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data   FAILED          N/A                        857000/ -
data        0a.11.15    0a    11  15  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
...
 
  • More than two disks failed:

Aggregate Aggr1 (failed, raid_dp, partial) (block checksums)
  Plex /Aggr1/plex0 (offline, failed, inactive)
    RAID group /Aggr1/plex0/rg1 (normal, block checksums)

RAID Disk    Device      HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------    ------      ------------- ---- ---- ---- ----- --------------    --------------
dparity     0a.12.16    0a    12  16  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
parity      0a.12.17    0a    12  17  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
...
data        0a.11.6     0a    11  6   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data    FAILED          N/A                        857000/ -
data        0a.11.11    0a    11  11  SA:B   0   SAS 15000 857000/1755136000 858483/1758174768 (reconstruct stalled)
data        0a.11.9     0a    11  9   SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.10    0a    11  10  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data    FAILED          N/A                        857000/ -
data        0a.11.12    0a    11  12  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data        0a.11.13    0a    11  13  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
data    FAILED          N/A                        857000/ -
data        0a.11.15    0a    11  15  SA:B   0   SAS 10000 857000/1755136000 858483/1758174768
...
Raid group is missing 3 disks.
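
When reviewing saved `sysconfig -r` output, the failed disks per RAID group can be counted with a short script. The sketch below is a rough Python illustration that assumes the output looks like the examples above: each group starts with a `RAID group <name> (` line, and each failed disk appears as a row whose second field is `FAILED`. The function name `count_failed_disks` is hypothetical, and other ONTAP versions may format these lines differently.

```python
import re
from typing import Dict

# Matches lines such as "RAID group /Aggr1/plex0/rg1 (normal, block checksums)".
GROUP_RE = re.compile(r"RAID group (\S+) \(")


def count_failed_disks(sysconfig_output: str) -> Dict[str, int]:
    """Return a mapping of RAID group name to its number of FAILED disk rows."""
    counts: Dict[str, int] = {}
    current = None
    for line in sysconfig_output.splitlines():
        match = GROUP_RE.search(line)
        if match:
            # A new RAID group section begins; start its counter at zero.
            current = match.group(1)
            counts[current] = 0
            continue
        fields = line.split()
        # Failed rows look like "data   FAILED   N/A   857000/ -".
        if current is not None and len(fields) >= 2 and fields[1] == "FAILED":
            counts[current] += 1
    return counts
```

Applied to the last example above, this returns a count of 3 for `/Aggr1/plex0/rg1`, which agrees with the `Raid group is missing 3 disks.` message.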

Additional Information