Solaris host support considerations in a MetroCluster configuration
Applies to
- MetroCluster
- ONTAP 9
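Answer
By default, the Solaris OS can survive an All Paths Down (APD) condition for up to 20 seconds; this is controlled by the fcp_offline_delay parameter.
For Solaris hosts to continue running without disruption during all MetroCluster workflows (such as negotiated switchover, switchback, Tiebreaker unplanned switchover, and automatic unplanned switchover), setting fcp_offline_delay to 120 seconds is recommended.
Important considerations for MetroCluster support: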
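Host response to local HA failover
With the fcp_offline_delay value increased, application service resumption time increases during a local HA failover (for example, a node panic followed by the surviving node taking over the panicked node).
FCP error handling
With the default value of fcp_offline_delay, the FCP driver takes 110 seconds to notify the upper layer (MPxIO) when an initiator port connection fails. Once fcp_offline_delay is increased to 120 seconds, the total time the driver takes to notify the upper layer (MPxIO) is 210 seconds; this might cause I/O delays. Refer to Oracle Doc ID 1018952.1. When a Fibre Channel port fails, an additional delay of 110 seconds might be seen before the device is taken offline.
Coexistence with third-party arrays
Because fcp_offline_delay is a global parameter, it might affect the interaction with all storage connected to the FCP driver.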
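The two figures quoted for FCP error handling (110 seconds at the 20-second default, 210 seconds at 120 seconds) are consistent with a fixed overhead of about 90 seconds plus the fcp_offline_delay value. The sketch below only illustrates that arithmetic; the 90-second constant is inferred from those two data points and is an assumption, not a documented driver value.

```shell
#!/bin/sh
# Assumption: fixed overhead inferred from the two figures in the text
# (110 s at the 20 s default, 210 s at 120 s), not from driver documentation.
FIXED_OVERHEAD=90

# Print the estimated number of seconds before the FCP driver notifies MPxIO
# for a given fcp_offline_delay value (in seconds).
notify_time() {
    echo $(( FIXED_OVERHEAD + $1 ))
}

notify_time 20    # default setting, prints 110
notify_time 120   # recommended MetroCluster setting, prints 210
```

The point of the estimate is that any increase in fcp_offline_delay is added in full to the time an application waits after a genuine port failure.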
How to modify the fcp_offline_delay setting:
For Solaris 10u8, 10u9, 10u10, and 10u11:
For Solaris 11:
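The per-release steps are not reproduced above. As a rough sketch, assuming the parameter is set with a line of the form `fcp_offline_delay=120;` in the FCP driver configuration file (assumed here to be `/kernel/drv/fcp.conf` on Solaris 10 and `/etc/driver/drv/fcp.conf` on Solaris 11; verify both against the Oracle and NetApp documentation), the hypothetical helper below reads back the configured value. Driver configuration changes of this kind generally require a host reboot to take effect.

```shell
#!/bin/sh
# Hypothetical helper: print the fcp_offline_delay value set in a driver
# configuration file. Prints nothing if the parameter is not set there.
# Assumption: the entry has the exact form "fcp_offline_delay=120;"
# at the start of a line, with no surrounding whitespace.
get_fcp_offline_delay() {
    sed -n 's/^fcp_offline_delay=\([0-9][0-9]*\);.*/\1/p' "$1"
}

# Assumed file locations (not stated in this article):
#   Solaris 10: /kernel/drv/fcp.conf
#   Solaris 11: /etc/driver/drv/fcp.conf
conf=${1:-/kernel/drv/fcp.conf}
if [ -r "$conf" ]; then
    get_fcp_offline_delay "$conf"
fi
```

An empty result suggests the file does not override the parameter, in which case the 20-second default described earlier would apply.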
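Host recovery examples:
If a disaster failover or unplanned switchover takes abnormally long (more than 120 seconds) and causes the host applications to fail, see the following examples before remediating the host applications:
zpool recovery:
Ensure that all LUNs are online.
Run the following commands: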
# zpool list
NAME SIZE ALLOC FREE CAP HEALTH ALTROOT
n_zpool_site_a 99.4G 1.31G 98.1G 1% OFFLINE -
n_zpool_site_b 124G 2.28G 122G 1% OFFLINE -
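On hosts with many pools, the same check can be scripted. The following is a minimal sketch, assuming the scripted output mode of `zpool list` (`-H` with `-o name,health`); the function name `unhealthy_pools` is hypothetical.

```shell
#!/bin/sh
# Read "name<TAB>health" lines on standard input, as produced by
# `zpool list -H -o name,health`, and print the names of all pools
# whose health is anything other than ONLINE.
unhealthy_pools() {
    awk '$2 != "ONLINE" { print $1 }'
}

# On a host where the zpool command is available, feed it live data.
if command -v zpool >/dev/null 2>&1; then
    zpool list -H -o name,health | unhealthy_pools
fi
```

For the listing above, both n_zpool_site_a and n_zpool_site_b would be reported, because both show OFFLINE in the HEALTH column.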
Check the individual pool status:
# zpool status n_zpool_site_b
pool: n_zpool_site_b
state: SUSPENDED    <== POOL SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://www.sun.com/msg/ZFS-8000-HC
scan: none requested
config:
NAME STATE READ WRITE CKSUM
n_zpool_site_b UNAVAIL 1 1.64K 0 experienced I/O failures
c0t600A098051764656362B45346144764Bd0 UNAVAIL 1 0 0 experienced I/O failures
c0t600A098051764656362B453461447649d0 UNAVAIL 1 40 0 experienced I/O failures
c0t600A098051764656362B453461447648d0 UNAVAIL 0 38 0 experienced I/O failures
c0t600A098051764656362B453461447647d0 UNAVAIL 0 28 0 experienced I/O failures
c0t600A098051764656362B453461447646d0 UNAVAIL 0 34 0 experienced I/O failures
c0t600A09805176465657244536514A7647d0 UNAVAIL 0 1.03K 0 experienced I/O failures
c0t600A098051764656362B453461447645d0 UNAVAIL 0 32 0 experienced I/O failures
c0t600A098051764656362B45346144764Ad0 UNAVAIL 0 34 0 experienced I/O failures
c0t600A09805176465657244536514A764Ad0 UNAVAIL 0 1.03K 0 experienced I/O failures
c0t600A09805176465657244536514A764Bd0 UNAVAIL 0 1.04K 0 experienced I/O failures
c0t600A098051764656362B45346145464Cd0 UNAVAIL 1 2 0 experienced I/O failures
The pool above is suspended because of I/O failures.
Run the following command to clear the pool state:
# zpool clear n_zpool_site_b
Check the pool again:
# zpool status n_zpool_site_b
pool: n_zpool_site_b
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scan: none requested
config:
NAME STATE READ WRITE CKSUM
n_zpool_site_b ONLINE 0 0 0
c0t600A098051764656362B45346144764Bd0 ONLINE 0 0 0
c0t600A098051764656362B453461447649d0 ONLINE 0 0 0
c0t600A098051764656362B453461447648d0 ONLINE 0 0 0
c0t600A098051764656362B453461447647d0 ONLINE 0 0 0
c0t600A098051764656362B453461447646d0 ONLINE 0 0 0
c0t600A09805176465657244536514A7647d0 ONLINE 0 0 0
c0t600A098051764656362B453461447645d0 ONLINE 0 0 0
c0t600A098051764656362B45346144764Ad0 ONLINE 0 0 0
c0t600A09805176465657244536514A764Ad0 ONLINE 0 0 0
c0t600A09805176465657244536514A764Bd0 ONLINE 0 0 0
c0t600A098051764656362B45346145464Cd0 ONLINE 0 0 0
errors: 1679 data errors, use '-v' for a list
Check the pool status again; here, one disk in the pool is degraded.
[22] 05:44:07 (root@host1) /
# zpool status -v n_zpool_site_b
pool: n_zpool_site_b
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scan: scrub repaired 0 in 0h0m with 0 errors on Fri Dec 4 05:44:17 2015
config:
NAME STATE READ WRITE CKSUM
n_zpool_site_b DEGRADED 0 0 0
c0t600A098051764656362B45346144764Bd0 ONLINE 0 0 0
c0t600A098051764656362B453461447649d0 ONLINE 0 0 0
c0t600A098051764656362B453461447648d0 ONLINE 0 0 0
c0t600A098051764656362B453461447647d0 ONLINE 0 0 0
c0t600A098051764656362B453461447646d0 ONLINE 0 0 0
c0t600A09805176465657244536514A7647d0 DEGRADED 0 0 0 too many errors
c0t600A098051764656362B453461447645d0 ONLINE 0 0 0
c0t600A098051764656362B45346144764Ad0 ONLINE 0 0 0
c0t600A09805176465657244536514A764Ad0 ONLINE 0 0 0
c0t600A09805176465657244536514A764Bd0 ONLINE 0 0 0
c0t600A098051764656362B45346145464Cd0 ONLINE 0 0 0
errors: No known data errors
Run the following command to clear the disk errors:
# zpool clear n_zpool_site_b c0t600A09805176465657244536514A7647d0
[24] 05:45:17 (root@host1) /
# zpool status -v n_zpool_site_b
pool: n_zpool_site_b
state: ONLINE
scan: scrub repaired 0 in 0h0m with 0 errors on Fri Dec 4 05:44:17 2015
config:
NAME STATE READ WRITE CKSUM
n_zpool_site_b ONLINE 0 0 0
c0t600A098051764656362B45346144764Bd0 ONLINE 0 0 0
c0t600A098051764656362B453461447649d0 ONLINE 0 0 0
c0t600A098051764656362B453461447648d0 ONLINE 0 0 0
c0t600A098051764656362B453461447647d0 ONLINE 0 0 0
c0t600A098051764656362B453461447646d0 ONLINE 0 0 0
c0t600A09805176465657244536514A7647d0 ONLINE 0 0 0
c0t600A098051764656362B453461447645d0 ONLINE 0 0 0
c0t600A098051764656362B45346144764Ad0 ONLINE 0 0 0
c0t600A09805176465657244536514A764Ad0 ONLINE 0 0 0
c0t600A09805176465657244536514A764Bd0 ONLINE 0 0 0
c0t600A098051764656362B45346145464Cd0 ONLINE 0 0 0
errors: No known data errors
Alternatively, export and import the zpool:
# zpool export n_zpool_site_b
# zpool import n_zpool_site_b
The pool is now online.
If the above steps do not recover the pool, reboot the host.
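The recovery sequence above can be condensed into a small script. This is a sketch under stated assumptions rather than a supported procedure: the function names are hypothetical, `zpool clear` is run against the whole pool instead of a single device, and pool health is judged from `zpool status -x`, which prints "pool '<name>' is healthy" when nothing is wrong.

```shell
#!/bin/sh
# Return 0 if `zpool status -x` reports the given pool as healthy.
pool_is_healthy() {
    zpool status -x "$1" | grep "is healthy" >/dev/null
}

# Try `zpool clear` first; if the pool is still not healthy, fall back to
# export and import. Returns 0 if the pool is healthy afterwards and
# non-zero otherwise, in which case the next step is to reboot the host.
recover_pool() {
    pool=$1
    zpool clear "$pool"
    if pool_is_healthy "$pool"; then
        return 0
    fi
    zpool export "$pool" && zpool import "$pool"
    pool_is_healthy "$pool"
}
```

A call such as `recover_pool n_zpool_site_b || echo "pool not recovered; reboot the host"` would mirror the manual steps. Confirm that all LUNs are online before running anything like this.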
Solaris Volume Manager (SVM) (metaset)
Ensure that all LUNs are online, reboot the system, and mount the Solaris Volume Manager (SVM) volumes.