AFF A900 emergency shutdown: Environmental Reason Shutdown (Temperature critical)
Applies to
AFF A900
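Issue
- A data center environmental issue related to high temperature occurs.
- A node reports ONTAP event messages because it is operating outside its normal temperature range. Example: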
[node_name-2: env_mgr: monitor.chassisTemperature.warm:alert]: Chassis temperature is too warm: Bat Ambient 2 is warning high (42 C).
[node_name-2: env_mgr: monitor.chassisTemperature.warm:alert]: Chassis temperature is too warm: Bat Ambient 1 is warning high (42 C).
[node_name-2: monitor: monitor.globalStatus.critical:EMERGENCY]: Chassis temperature is too high..
[node_name-2: env_mgr: callhome.chassis.hitemp:error]: Call home for CHASSIS OVER TEMPERATURE
- The partner node also reports temperature-related ONTAP event messages. Example:
[node_name-1: env_mgr: monitor.shutdown.chassisOverTemp:EMERGENCY]: Chassis temperature is too hot: Multiple Temp sensors are too high. System will be shutdown in 2 minutes
[node_name-1: env_mgr: callhome.chassis.overtemp:EMERGENCY]: Call home for CHASSIS OVER TEMPERATURE SHUTDOWN
[node_name-1: env_mgr: monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Temperature critical)
- The node fails to boot. Example:
Boot Loader version 6.6.4
Copyright (C) 2000-2003 Broadcom Corporation.
Portions Copyright (C) 2002-2022 NetApp, Inc. All Rights Reserved.
ACPI RSDP Found at 0x6f7fe014
BIOS POST Failure(s) detected: PCIe device missing error detected. Abort AUTOBOOT
- BMC events for that node:
Record 2486: Fri Jan 19 05:50:28.000000 2024 [SysFW.notice]: Device 47/0/0 (SW0-VS1-P40) missing
Record 2487: Fri Jan 19 05:50:28.000000 2024 [SysFW.notice]: Device 74/0/0 (SW0-VS0-P32) missing
Record 2488: Fri Jan 19 05:50:43.000000 2024 [Boot Loader.critical]: Abort Autoboot due to BIOS POST failure.
- The issue persists after the node is reseated.
- The issue persists after the PCIe cards are reseated or swapped.
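
When checking saved event logs against the symptoms above, a small script can help pick out the temperature events. The following is a minimal sketch, not a NetApp tool. It assumes the log lines follow the bracketed form shown in the examples in this article, and the function name find_temperature_events and the TEMPERATURE_EVENTS set are hypothetical names introduced here for illustration. The set holds only the event names quoted in this article and is not an exhaustive list.

```python
import re

# Matches lines of the form "[node: source: event.name:severity]: message",
# as in the ONTAP event examples quoted in this article (assumed format).
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:\]]+):\s*(?P<source>[^:\]]+):\s*"
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)$"
)

# Event names taken from the examples in this article; not exhaustive.
# monitor.globalStatus.critical is left out because its name is not
# temperature-specific.
TEMPERATURE_EVENTS = {
    "monitor.chassisTemperature.warm",
    "callhome.chassis.hitemp",
    "monitor.shutdown.chassisOverTemp",
    "callhome.chassis.overtemp",
    "monitor.shutdown.emergency",
}


def find_temperature_events(log_text):
    """Return (node, event, severity, message) tuples for temperature events."""
    hits = []
    for line in log_text.splitlines():
        match = EMS_LINE.match(line.strip())
        if match and match["event"] in TEMPERATURE_EVENTS:
            hits.append(
                (match["node"], match["event"], match["severity"], match["message"])
            )
    return hits


if __name__ == "__main__":
    sample = (
        "[node_name-2: env_mgr: callhome.chassis.hitemp:error]: "
        "Call home for CHASSIS OVER TEMPERATURE\n"
        "[node_name-1: env_mgr: monitor.shutdown.emergency:EMERGENCY]: "
        "Emergency shutdown: Environmental Reason Shutdown (Temperature critical)"
    )
    for node, event, severity, message in find_temperature_events(sample):
        print(node, event, severity, message)
```

Lines that do not match the bracketed form, such as boot loader output or BMC records, are skipped rather than treated as errors.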