Environmental shutdown (nvmem.battery.capLowCrit)
Applies to
- FAS / AFF systems
- ONTAP 9
Issue
- The node shuts down, and the nvmem.battery.capLowCrit and nvmem.battery.fccLowCrit alerts appear in EMS:
[nodename:nvmem.battery.capLowCrit:EMERGENCY]: The NVMEM battery capacity is critically low (0 cycles). To prevent data loss, the system will shut down in 20 minutes.
[nodename:nvmem.battery.fccLowCrit:EMERGENCY]: The NVMEM battery full-charge capacity is critically low (25 %). To prevent data loss, the system will shut down in 20 minutes.
[nodename:callhome.battery.failure:EMERGENCY]: Call home for BATTERY (capacity low) CRITICAL.
[nodename:callhome.battery.failure:EMERGENCY]: Call home for BATTERY (full charge capacity low) CRITICAL.
[nodename:monitor.nvramLowBattery:EMERGENCY]: NVRAM battery is dangerously low.
[nodename:callhome.battery.low:ALERT]: Call home for BATTERY_LOW.
[nodename:monitor.shutdown.nvramLowBattery.pending:ALERT]: NVRAM battery is dangerously low. Halting system in 24 hours. Replace the battery immediately!
[nodename:monitor.shutdown.emergency:EMERGENCY]: Emergency shutdown: Environmental Reason Shutdown (Battery remain capacity critical)
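When triaging a saved EMS log, it can help to pull out only the NVMEM battery events that announce a shutdown countdown. The following is a minimal sketch, assuming the log lines follow the `[node:event:SEVERITY]: message` layout of the samples above. The function name and regular expressions are illustrative only and are not part of ONTAP.

```python
import re

# Layout assumed from the sample lines: [node:event.name:SEVERITY]: message
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:\]]+):(?P<event>[\w.]+):(?P<severity>\w+)\]:\s*(?P<message>.*)$"
)
# Countdown wording as it appears in the sample messages.
SHUTDOWN_IN = re.compile(r"shut down in (\d+) minutes")


def find_battery_shutdown_warnings(lines):
    """Return (event, severity, minutes) for nvmem.battery.* events with a countdown."""
    hits = []
    for line in lines:
        match = EMS_LINE.match(line.strip())
        if not match or not match.group("event").startswith("nvmem.battery."):
            continue
        countdown = SHUTDOWN_IN.search(match.group("message"))
        if countdown:
            hits.append(
                (match.group("event"), match.group("severity"), int(countdown.group(1)))
            )
    return hits


# Lines copied from the EMS excerpt above.
sample = [
    "[nodename:nvmem.battery.capLowCrit:EMERGENCY]: The NVMEM battery capacity is "
    "critically low (0 cycles). To prevent data loss, the system will shut down in 20 minutes.",
    "[nodename:nvmem.battery.fccLowCrit:EMERGENCY]: The NVMEM battery full-charge capacity is "
    "critically low (25 %). To prevent data loss, the system will shut down in 20 minutes.",
    "[nodename:monitor.nvramLowBattery:EMERGENCY]: NVRAM battery is dangerously low.",
]

warnings = find_battery_shutdown_warnings(sample)
for event, severity, minutes in warnings:
    print(f"{event} ({severity}): shutdown announced in {minutes} minutes")
```

Lines that do not match the assumed layout are skipped rather than treated as errors, so the sketch can be run over a mixed log.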
- The node shuts down, and the EMS_nvmem_battery_fccLowCrit and Battery PCT capacity critical alerts appear in bmc_logs\bmc_status.txt:
13:14:01 BMC env_mgr[1822]: Bat_Pct_Cap report 30 %, below threshold value, during active learning cycle
17:49:03 BMC env_mgr[1822]: Bat_Pct_Cap report 30 %, below threshold value, during active learning cycle
18:09:39 BMC env_mgr[1822]: Payload action: splog_get(1, (null))
18:18:25 BMC env_mgr[1822]: SPEM Restart
18:18:25 BMC env_mgr[1822]: ENVD_SES: Attempt made to change a locked threshold. Ignoring all but warning thresholds
18:18:25 BMC env_mgr[1822]: get_health_condition: 'Battery RemCap Desc' description is NULL
18:19:25 BMC env_mgr[1822]: Payload EMS: EMS_callhome_battery_failure(BATTERY (full charge capacity low))
18:19:25 BMC env_mgr[1822]: Payload EMS: EMS_nvmem_battery_fccLowCrit(30 %, 20)
18:38:27 BMC env_mgr[1822]: Payload action: env_halt(64, Battery PCT capacity critical)
18:48:37 BMC env_mgr[13393]: P3.3V is critical low (0 mV).
18:48:37 BMC env_mgr[13393]: P12V is critical low (186 mV).
18:48:37 BMC env_mgr[13393]: P12V Curr is critical low (0 mA).
18:49:01 BMC env_mgr[13393]: P5V is critical low (52 mV).
18:49:01 BMC env_mgr[13393]: PVDDQ DDR4 AB is critical low (0 mV).
18:49:01 BMC env_mgr[13393]: PVTT DDR4 AB is critical low (0 mV).
18:49:06 BMC env_mgr[13393]: PSU1 Present, update fru
18:49:06 BMC env_mgr[13393]: PSU2 Present, update fru
18:50:01 BMC env_mgr[13393]: PVCCIN CPU is critical low (0 mV).
18:51:06 BMC env_mgr[13393]: CPU Temp Margin is not readable.
18:51:06 BMC env_mgr[13393]: CPU Core Temp is not readable.
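The BMC excerpt above can be checked the same way: find the `EMS_nvmem_battery_fccLowCrit` payload, then measure how long the BMC waited before calling `env_halt`. This is a rough sketch under two assumptions that the log does not confirm: the second argument of the payload (`20`) is the announced countdown in minutes, and both lines fall on the same calendar day. The helper name is hypothetical.

```python
import re
from datetime import datetime

# Layout assumed from the bmc_status.txt sample: HH:MM:SS BMC env_mgr[pid]: text
BMC_LINE = re.compile(r"^(?P<time>\d{2}:\d{2}:\d{2}) BMC env_mgr\[\d+\]: (?P<text>.*)$")
# Assumption: the arguments are (full-charge capacity in percent, countdown in minutes).
FCC_EVENT = re.compile(r"EMS_nvmem_battery_fccLowCrit\((\d+) %, (\d+)\)")


def summarize_fcc_halt(lines):
    """Pair the first fccLowCrit payload with the env_halt that follows it."""
    warning = None
    for line in lines:
        match = BMC_LINE.match(line.strip())
        if not match:
            continue
        when = datetime.strptime(match.group("time"), "%H:%M:%S")
        text = match.group("text")
        fcc = FCC_EVENT.search(text)
        if fcc and warning is None:
            warning = (when, int(fcc.group(1)), int(fcc.group(2)))
        elif warning is not None and "env_halt(" in text:
            return {
                "capacity_pct": warning[1],
                "announced_minutes": warning[2],
                "elapsed": when - warning[0],
            }
    return None


# Lines copied from the BMC excerpt above.
sample = [
    "18:19:25 BMC env_mgr[1822]: Payload EMS: EMS_callhome_battery_failure(BATTERY (full charge capacity low))",
    "18:19:25 BMC env_mgr[1822]: Payload EMS: EMS_nvmem_battery_fccLowCrit(30 %, 20)",
    "18:38:27 BMC env_mgr[1822]: Payload action: env_halt(64, Battery PCT capacity critical)",
]

result = summarize_fcc_halt(sample)
print(result)
```

For the sample, the elapsed time is 19 minutes 2 seconds, which is close to the 20-minute countdown stated in the EMS messages.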
- The shutdown occurs after the battery learn cycle ends:
Record 310: Wed Dec 04 12:00:13.077637 2024 [IPMI.notice]: 002d | 02 | EVT: 0301ffff | Bat_Lrn_Active | Assertion Event, "State Asserted"
Record 311: Wed Dec 04 16:11:57.383621 2024 [IPMI.notice]: 002e | 02 | EVT: 0300ffff | Bat_Lrn_Active | Assertion Event, "State Deasserted"
Record 312: Wed Dec 04 16:16:16.348941 2024 [BMC.notice]: Defer switch update: SSH connection is active
Record 313: Wed Dec 04 16:31:57.409161 2024 [IPMI.emergency]: env_mgr triggers OS halt:Battery PCT capacity
Record 314: Wed Dec 04 16:32:53.177695 2024 [IPMI.notice]: 002f | 02 | EVT: 6f03ffff | Sensor 255 | Assertion Event, "Storage OS graceful shutdown"
Record 315: Wed Dec 04 16:32:53.000000 2024 [Controller.notice]: Appliance user command halt.
Record 316: Wed Dec 04 16:32:53.205464 2024 [IPMI Event.critical]: System power down"
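To confirm that the halt follows the end of the learn cycle, the timestamps of the `Bat_Lrn_Active` deassertion and the `env_mgr` halt record can be compared. The sketch below assumes the record layout shown in the event log excerpt above and English day and month abbreviations. The function and pattern names are illustrative.

```python
import re
from datetime import datetime

# Layout assumed from the sample: Record N: <timestamp> [source.level]: text
RECORD = re.compile(
    r"^Record \d+: (?P<stamp>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{6} \d{4}) "
    r"\[[^\]]+\]: (?P<text>.*)$"
)
STAMP_FORMAT = "%a %b %d %H:%M:%S.%f %Y"


def learn_cycle_end_to_halt(lines):
    """Return the time between the learn-cycle deassertion and the OS halt record."""
    cycle_end = None
    for line in lines:
        match = RECORD.match(line.strip())
        if not match:
            continue
        when = datetime.strptime(match.group("stamp"), STAMP_FORMAT)
        text = match.group("text")
        if "Bat_Lrn_Active" in text and "State Deasserted" in text:
            cycle_end = when
        elif cycle_end is not None and "env_mgr triggers OS halt" in text:
            return when - cycle_end
    return None


# Lines copied from the event log excerpt above.
sample = [
    'Record 310: Wed Dec 04 12:00:13.077637 2024 [IPMI.notice]: 002d | 02 | EVT: 0301ffff | Bat_Lrn_Active | Assertion Event, "State Asserted"',
    'Record 311: Wed Dec 04 16:11:57.383621 2024 [IPMI.notice]: 002e | 02 | EVT: 0300ffff | Bat_Lrn_Active | Assertion Event, "State Deasserted"',
    "Record 313: Wed Dec 04 16:31:57.409161 2024 [IPMI.emergency]: env_mgr triggers OS halt:Battery PCT capacity",
]

delta = learn_cycle_end_to_halt(sample)
print(f"Halt followed the end of the learn cycle by {delta.total_seconds() / 60:.1f} minutes")
```

In the sample records, the halt comes about 20 minutes after the learn cycle ends, which matches the countdown announced in the EMS messages.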
- The fault LED is lit.
- After a power cycle, the node fails to boot and displays the following message:
WARNING: The battery is unfit to retain data during a power
outage. This is likely because the battery is
discharged but could be due to other temporary
conditions.
When the battery is ready, the boot process will
complete and services will be engaged.
To override this delay, press 'c' followed by 'Enter'