Aggregate is offline due to multiple failed disks

Applies to

  • ONTAP 8
  • ONTAP 9
  • FAS/AFF systems

Issue

  • The aggregate is offline due to multiple failed disks:

Cluster::> system node run -node <node-name> sysconfig -r

Aggregate aggr1 (failed, raid_dp, partial) (block checksums)
  Plex /aggr1/plex0 (offline, failed, inactive)
  RAID group /aggr1/plex0/rg1 (partial, block checksums)

      RAID Disk    Device      HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      ---------    ------      ------------- ---- ---- ---- ----- --------------    --------------
      dparity     0a.01.0     0a    1   0   SA:A   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      parity      FAILED          N/A                        3807816/ -
      data        0b.01.2     0b    1   2   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.3     0b    1   3   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.4     0b    1   4   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.5     0b    1   5   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.6     0b    1   6   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.7     0b    1   7   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.8     0b    1   8   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.9     0b    1   9   SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        FAILED          N/A                        3807816/ -
      data        FAILED          N/A                        3807816/ -
      data        FAILED          N/A                        3807816/ -
      data        0b.01.13    0b    1   13  SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.14    0b    1   14  SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        0b.01.15    0b    1   15  SA:B   0  FSAS  7200 3807816/7798408704 3815447/7814037168 
      data        FAILED          N/A                        3807816/ -
      Raid group is missing 5 disks.
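
As a rough aid for reading output like the above, the following Python sketch counts the FAILED rows in each RAID group and compares the count with the number of disk failures the RAID type can absorb (one for raid4, two for raid_dp, three for raid_tec). It is not a NetApp tool: the function names, the `SAMPLE` text (an abbreviated copy of the layout above), and the regular expressions are assumptions made for illustration.

```python
import re

# Disk failures a single RAID group can absorb, by RAID type.
RAID_TOLERANCE = {"raid4": 1, "raid_dp": 2, "raid_tec": 3}

# Abbreviated sample in the layout shown above (illustrative, three FAILED rows).
SAMPLE = """\
Aggregate aggr1 (failed, raid_dp, partial) (block checksums)
  Plex /aggr1/plex0 (offline, failed, inactive)
  RAID group /aggr1/plex0/rg1 (partial, block checksums)

      RAID Disk    Device      HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      dparity     0a.01.0     0a    1   0   SA:A   0  FSAS  7200 3807816/7798408704 3815447/7814037168
      parity      FAILED          N/A                        3807816/ -
      data        FAILED          N/A                        3807816/ -
      data        FAILED          N/A                        3807816/ -
"""


def count_failed_disks(text):
    """Return {raid_group_path: number of FAILED member rows}."""
    failed = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*RAID group (\S+)", line)
        if header:
            current = header.group(1)
            failed[current] = 0
            continue
        # A failed member row has FAILED where the device name would be.
        if current and re.match(r"\s*(?:dparity|tparity|parity|data)\s+FAILED\b", line):
            failed[current] += 1
    return failed


def find_raid_type(text):
    """Return the RAID type named on the Aggregate line, or None."""
    match = re.search(r"^Aggregate \S+ \(([^)]*)\)", text, re.MULTILINE)
    if not match:
        return None
    for flag in match.group(1).split(","):
        if flag.strip() in RAID_TOLERANCE:
            return flag.strip()
    return None


if __name__ == "__main__":
    raid = find_raid_type(SAMPLE)
    limit = RAID_TOLERANCE[raid]
    for group, failed in count_failed_disks(SAMPLE).items():
        state = "exceeds" if failed > limit else "is within"
        print(f"{group}: {failed} failed disk(s), {state} the {raid} tolerance of {limit}")
```

A RAID group whose failed-disk count exceeds the tolerance cannot be reconstructed from parity, which matches the failed, offline state reported for the plex in the output above.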

  • The event log shows the following disk failure alerts:

[Node-01:scsi.cmd.checkCondition:debug]: Disk device 0b.01.10: Check Condition: CDB 0x1b: Sense Data SCSI:not ready -  (0x2 - 0x4 0x0 0x0)(0).
[Node-01:disk.init.failure.spinup:error]: Disk 0b.01.10 has failed to spin up and cannot be used. Please replace it with a new drive.
[Node-01:callhome.dsk.no.spin:ALERT]: Call home for DISK NOT SPINNING 
[Node-01:disk.init.failure.error:warning]: Disk 0b.01.10 failed initialization due to error 5.
[Node-01:disk.readReservationFailed:error]: Disk read reservation failed on 0b.01.10 CDB 0x5e:01 - SCSI:not ready (2 4 0)
[Node-01:diskown.errorDuringIO:error]: error 19 (disk not ready for requested operation) on disk 0b.01.10 (S/N ) while reading reservation state 
[Node-01:disk.ioFailed:error]: I/O operation failed despite several retries.  

[Node-01:raid.config.disk.failed:error]: Disk 0b.01.16 Shelf 1 Bay 16 [NETAPP   X477_SMEGX04TA07 NA02] S/N [XXXXXXXX] failed. 
[Node-01:callhome.dsk.fault:error]: Call home for DISK FAILED 
[Node-01:raid.fdr.reminder:warning]: Failed Disk 0b.01.16 Shelf 1 Bay 16 [NETAPP   X477_SMEGX04TA07 NA02] S/N [XXXXXXXX] is still present in the system and should be removed. 
[Node-01:diskown.errorReadingOwnership:warning]: error 3 (disk failed) while reading ownership on disk 0b.01.16 (S/N XXXXXXX) 

[Node-02:disk.init.failureBytes:warning]: Failed disk 0b.01.17 detected during disk initialization
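
When many such messages appear at once, grouping them by disk helps show which drives are affected. The sketch below is only an illustration: it assumes each line follows the `[node:event:severity]: message` shape seen above and that disk names look like `0b.01.10`. The names `EVENT_RE`, `DISK_RE`, and `group_events_by_disk` are hypothetical, not part of ONTAP.

```python
import re
from collections import defaultdict

# "[node:event.name:severity]: message", as in the samples above.
EVENT_RE = re.compile(
    r"\[(?P<node>[^:\]]+):(?P<event>[^:\]]+):(?P<severity>[^\]]+)\]:\s*(?P<message>.*)"
)
# Assumed disk-name pattern, based on names such as 0b.01.10 in the samples.
DISK_RE = re.compile(r"\b\d+[a-z]\.\d+\.\d+\b")

LOG_LINES = [
    "[Node-01:disk.init.failure.spinup:error]: Disk 0b.01.10 has failed to spin up and cannot be used. Please replace it with a new drive.",
    "[Node-01:disk.init.failure.error:warning]: Disk 0b.01.10 failed initialization due to error 5.",
    "[Node-01:raid.config.disk.failed:error]: Disk 0b.01.16 Shelf 1 Bay 16 [NETAPP   X477_SMEGX04TA07 NA02] S/N [XXXXXXXX] failed.",
    "[Node-01:callhome.dsk.fault:error]: Call home for DISK FAILED",
    "[Node-02:disk.init.failureBytes:warning]: Failed disk 0b.01.17 detected during disk initialization",
]


def group_events_by_disk(lines):
    """Return {disk_name: [(node, event, severity), ...]} for lines naming a disk."""
    by_disk = defaultdict(list)
    for line in lines:
        event = EVENT_RE.match(line.strip())
        if not event:
            continue  # not an event line in the assumed format
        disk = DISK_RE.search(event.group("message"))
        if not disk:
            continue  # event does not name a disk (for example, call home notices)
        by_disk[disk.group(0)].append(
            (event.group("node"), event.group("event"), event.group("severity"))
        )
    return dict(by_disk)


if __name__ == "__main__":
    for disk, events in sorted(group_events_by_disk(LOG_LINES).items()):
        print(disk)
        for node, name, severity in events:
            print(f"  {node}  {severity:<8} {name}")
```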

  • The event log may also report the following plex failure events for the aggregate:

[Node-01:raid.assim.disk.brokenPreAssim:error]: Broken Disk 0b.01.1 Shelf 1 Bay 1 [NETAPP   X477_SMEGX04TA07 NA02] S/N [XXXXXXXX] detected prior to assimilation. 
[Node-01:raid.assim.rg.missingChild:error]: Aggregate aggr1, rgobj_verify: RAID object 1 has only 13 valid children, expected 16. 
[Node-01:raid.assim.plex.missingChild:error]: Aggregate aggr1, plexobj_verify: Plex 0 only has 1 working RAID groups (2 total) and is being taken offline 
[Node-01:raid.assim.mirror.noChild:ALERT]: Aggregate aggr1, mirrorobj_verify: No operable plexes found. 

[Node-01:raid.rg.recons.missing:notice]: RAID group /agg2/plex0/rg0 is missing 1 disk(s). 
[Node-01:raid.rg.recons.cantStart:warning]: The reconstruction cannot start in RAID group /agg2/plex0/rg0: No matching disks available in spare pool, targeting any spare pool