CFSHELF-1826: ses.status.temperatureError (...) temperature error for Temperature sensor 1 in a disk shelf
Issue
- The disk shelf attention LED on the front OPS panel is lit.
- The AutoSupport environment output reports a failure for temperature sensor [1]. Example:
Channel: 0a
Shelf: 0
SES device path: local access: 0a.00.99
Module type: IOM12; monitoring is active
Shelf status: critical condition
...
Temperature Sensor installed element list: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11; with error: 1
Shelf temperatures by element:
[1] 128 C (262 F) (ambient) Overtemperature failure!
[2] 20 C (68 F) Normal temperature range
...
[11] 31 C (87 F) Normal temperature range
- Example ONTAP event messages:
::> event log show -event *shelf*
Time Node Severity Event
------------------- ---------------- ------------- ---------------------------
1/2/2025 10:35:00 node_name EMERGENCY monitor.globalStatus.critical: Disk shelf fault.
1/2/2025 10:34:28 node_name ALERT monitor.shelf.fault: Critical fault reported on disk storage shelf attached to channel 0b. Check fans, power supplies, disks, and temperature sensors.
1/2/2025 10:34:19 node_name ERROR ses.status.temperatureError: DS224-12 (S/N SHFHU2048000395) shelf 0 on channel 0c temperature error for Temperature sensor 1: critical status; overtemperature failure. Current temperature: 128 C (262 F). This module is on the front of the shelf on the left, on the OPS panel.
1/2/2025 10:33:52 node_name DEBUG stackmon.shelf.discovery.complete: One or more shelves have been discovered.
...
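When reviewing a saved copy of the AutoSupport environment output shown above, it can help to pull out only the sensors that are not in the normal range. The following is a minimal Python sketch, not a NetApp tool. It assumes the per-sensor line format seen in the sample (`[element] N C (N F) [optional location] status`) and treats any status other than "Normal temperature range" as abnormal. The names `parse_shelf_temperatures` and `abnormal_sensors` are hypothetical, and other ONTAP releases or shelf modules may print these lines differently.

```python
import re

# Assumed line format, taken from the sample output only, e.g.
# "[1] 128 C (262 F) (ambient) Overtemperature failure!"
SENSOR_LINE = re.compile(
    r"^\[(?P<element>\d+)\]\s+(?P<celsius>-?\d+)\s+C\s+"
    r"\((?P<fahrenheit>-?\d+)\s+F\)\s+(?P<rest>.*)$"
)


def parse_shelf_temperatures(text):
    """Return (element, celsius, status) tuples for each sensor line found."""
    readings = []
    for line in text.splitlines():
        match = SENSOR_LINE.match(line.strip())
        if match is None:
            continue  # skip headers, "..." elisions, and unrelated lines
        # Drop an optional leading location tag such as "(ambient)".
        status = re.sub(r"^\([^)]*\)\s*", "", match.group("rest"))
        readings.append(
            (int(match.group("element")), int(match.group("celsius")), status)
        )
    return readings


def abnormal_sensors(readings):
    """Keep only readings whose status is not the normal-range message."""
    return [r for r in readings if r[2] != "Normal temperature range"]


# Lines copied from the sample output in this article.
sample = """\
Shelf temperatures by element:
[1] 128 C (262 F) (ambient) Overtemperature failure!
[2] 20 C (68 F) Normal temperature range
[11] 31 C (87 F) Normal temperature range
"""

for element, celsius, status in abnormal_sensors(parse_shelf_temperatures(sample)):
    print(f"sensor {element}: {celsius} C, {status}")
```

With the sample lines from this article, only sensor 1 is reported. A single sensor reading far above its neighbors, as here, is easier to spot in this filtered view than in the full element list.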