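Over-temperature warning for sensor 7 or sensor 13 on NS224
Applies to
- NS224 disk shelf
- NSM100 modules running firmware versions earlier than 02.10
Issue
- Sensor 7 or sensor 13 reports an over-temperature condition. These are the CPU sensors on NSM module A and NSM module B.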
[node_name: dsa_worker4: ses.status.temperatureWarning:alert]: NS224NSM100 (S/N 123456789) shelf 0 on channel 0x temperature warning for Temperature sensor 7: non-critical status; overtemperature warning. Current temperature: 88 C (190 F). This element is on the unknown location
[node_name: dsa_worker4: ses.status.temperatureWarning:alert]: NS224NSM100 (S/N 123456789) shelf 0 on channel 0x temperature warning for Temperature sensor 13: non-critical status; overtemperature warning. Current temperature: 88 C (190 F). This element is on the unknown location
- EMS error logged every hour:
[node_name: statd: monitor.shelf.warning:error]: Fault reported on disk storage shelf attached to channel 0x. Check fans, power supplies, disks, and temperature sensors.
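When many EMS lines have to be triaged, the temperature warnings quoted above can be picked apart with a regular expression. The following is a minimal sketch, not a NetApp tool: the pattern is modeled only on the two sample messages in this article, so it is an assumption that other ONTAP releases use the same wording, and the names `EMS_PATTERN`, `CPU_SENSORS`, and `parse_temperature_warning` are hypothetical.

```python
import re

# Pattern modeled on the ses.status.temperatureWarning samples in this article.
# Assumption: the message wording is the same on the system being checked.
EMS_PATTERN = re.compile(
    r"shelf (?P<shelf>\d+) on channel (?P<channel>\S+) temperature warning "
    r"for Temperature sensor (?P<sensor>\d+):.*?"
    r"Current temperature: (?P<celsius>-?\d+) C",
    re.DOTALL,
)

# Sensor numbers that this article attributes to the NSM CPU packages.
CPU_SENSORS = {7, 13}


def parse_temperature_warning(message):
    """Return the fields of one temperature warning, or None if it does not match."""
    match = EMS_PATTERN.search(message)
    if match is None:
        return None
    sensor = int(match.group("sensor"))
    return {
        "shelf": int(match.group("shelf")),
        "channel": match.group("channel"),
        "sensor": sensor,
        "celsius": int(match.group("celsius")),
        "is_cpu_sensor": sensor in CPU_SENSORS,
    }


sample = (
    "[node_name: dsa_worker4: ses.status.temperatureWarning:alert]: "
    "NS224NSM100 (S/N 123456789) shelf 0 on channel 0x temperature warning "
    "for Temperature sensor 7: non-critical status; overtemperature warning. "
    "Current temperature: 88 C (190 F). This element is on the unknown location"
)
print(parse_temperature_warning(sample))
```

A message whose sensor number is 7 or 13 is flagged with `is_cpu_sensor`, which is the case this article covers; any other sensor number points to a different problem.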
- Example storage show AutoSupport output:
Enclosure Status: non-critical
Channel: 0x
Shelf: 0
Shelf Type: NS224NSM100
Module Type: NSM100
Temperature Sensors:
Element  Status       Status Bytes  Status Descriptions
1:       OK           01,00,2A,00
...
12:      OK           01,00,2E,00
13:      NONCRITICAL  03,00,6C,04   OT WARNING
14:      OK           01,00,3C,00
...
17:      OK           01,00,3D,00
Enclosure:
Element  Status       Status Bytes  Status Descriptions
1:       OK           01,00,02,00   FAIL
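The four status bytes of a temperature sensor can be decoded by hand. The sketch below assumes that they follow the temperature sensor status element layout of the SCSI Enclosure Services (SES) standard: the low nibble of byte 0 is the element status code, byte 2 is the temperature in degrees Celsius plus an offset of 20, and the low four bits of byte 3 are the over- and under-temperature flags. The article itself does not state this, although the value 6C (108 decimal, minus 20 gives 88) agrees with the 88 C reported in the EMS message. The function name `decode_temperature_element` is hypothetical.

```python
# Element status codes from the SES common status byte (low nibble of byte 0).
STATUS_CODES = {
    0: "UNSUPPORTED",
    1: "OK",
    2: "CRITICAL",
    3: "NONCRITICAL",
    4: "UNRECOVERABLE",
    5: "NOT INSTALLED",
    6: "UNKNOWN",
    7: "NOT AVAILABLE",
}

# Flag bits in byte 3 of an SES temperature sensor status element.
FLAGS = (
    (0x08, "OT FAILURE"),
    (0x04, "OT WARNING"),
    (0x02, "UT FAILURE"),
    (0x01, "UT WARNING"),
)

# SES reports the temperature with an offset of 20; a raw value of 0 is reserved.
TEMPERATURE_OFFSET = 20


def decode_temperature_element(status_bytes):
    """Decode a string such as '03,00,6C,04' into (status, celsius, flags)."""
    raw = [int(part, 16) for part in status_bytes.split(",")]
    if len(raw) != 4:
        raise ValueError("expected four comma-separated hex bytes")
    status = STATUS_CODES.get(raw[0] & 0x0F, "RESERVED")
    celsius = raw[2] - TEMPERATURE_OFFSET if raw[2] else None
    flags = [name for mask, name in FLAGS if raw[3] & mask]
    return status, celsius, flags


print(decode_temperature_element("03,00,6C,04"))  # ('NONCRITICAL', 88, ['OT WARNING'])
print(decode_temperature_element("01,00,2E,00"))  # ('OK', 26, [])
```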
- Example Environment AutoSupport output:
Channel: 0x
Shelf: 10
SES device path: local access: 0x.10.1.99
Module type: NSM100; monitoring is active
Shelf status: non-critical condition
SES Configuration, shelf 10:
...
Temperature Sensor installed element list: 1, 2, 3, ...15, 16, 17; with error: 13
Shelf temperatures by element:
[1] 19 C (66 F) (ambient) Normal temperature range
...
[12] 22 C (71 F) Normal temperature range
[13] 88 C (190 F) Overtemperature warning!
- Example ::> storage shelf show output:
Temperature:
                 -- Thresholds °C --
    Temp Is      Low  Low  High High Operational
ID  °C   Ambient Crit Warn Crit Warn Status             Sensor Location
--- ---- ------- ---- ---- ---- ---- ------------------ --------------------------------
  1   19 true       0    5   52   47 normal             ambient temp sensor on ODP board
  2   24 false      0    5   60   55 normal             temp sensor on midplane left
...
  7   35 false      0    5   90   85 normal             CPU package on top module
...
 12   22 false      0    5   65   60 normal             bottom module temp sensor near midplane
 13   88 false      0    5   90   85 over-temperature   CPU package on bottom module
 14   45 false      0    5   85   80 normal             Ethernet port 1 on bottom module
...
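The relationship between a reading and its thresholds in the table above can be sketched as a simple comparison. This is an illustration only: whether ONTAP treats a reading equal to a threshold as a violation is an assumption here, and the function name `classify_reading` and its return labels are hypothetical rather than ONTAP status strings.

```python
def classify_reading(temp_c, low_crit, low_warn, high_crit, high_warn):
    """Classify one sensor reading against its four thresholds (all in °C)."""
    # Assumption: a reading equal to a threshold already counts as a violation.
    if temp_c >= high_crit or temp_c <= low_crit:
        return "critical"
    if temp_c >= high_warn or temp_c <= low_warn:
        return "warning"
    return "normal"


# Rows copied from the storage shelf show sample:
# (ID, temperature, low critical, low warning, high critical, high warning)
rows = [
    (1, 19, 0, 5, 52, 47),
    (7, 35, 0, 5, 90, 85),
    (13, 88, 0, 5, 90, 85),
]

for sensor_id, temp_c, low_crit, low_warn, high_crit, high_warn in rows:
    state = classify_reading(temp_c, low_crit, low_warn, high_crit, high_warn)
    print(sensor_id, state)
```

With the sample values, sensor 13 at 88 °C sits between its high warning threshold (85 °C) and its high critical threshold (90 °C), which is consistent with the non-critical over-temperature status shown in the outputs.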
- Example storage show fault output:
::> system node run -node node_name -command storage show fault
Temperature Sensors:
Element  Status       Status Bytes  Status Descriptions
1:       OK           01,00,27,00
...
6:       OK           01,00,33,00
7:       NONCRITICAL  03,00,6C,04   OT WARNING
8:       OK           01,00,4C,00
...
17:      OK           01,00,44,00