StorageGRID appliance compute controller needs attention due to PSU fault on SG6160, SG110, and SG1100
Applies to
- StorageGRID 11.9.x, 11.8.x, 11.7.x
- SG6160, SG110, and SG1100
Issue
- Grid Manager displays the following alerts:
    Appliance compute controller needs attention
    Appliance storage controller hardware issue
    Appliance compute controller power supply A has a problem
    Appliance compute controller power supply B has a problem
- The BMC UI does not show any problem with the PSUs.
- No PSU failure alert is triggered in Grid Manager.
- /var/local/log/bycast.log shows:
    Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA 2025-08-27T09:23:13.954179| WARNING 0076 CCST: Unknown status for ipmi sensor: PSU0_Status
    Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA 2025-08-27T09:23:13.954203| WARNING 0283 CCST: IPMI device Watchdog reports non-OK status 'reserved' 'reserved'
    Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA 2025-08-27T09:23:13.954207| WARNING 0283 CCST: IPMI device CPU0 reports non-OK status 'SM BIOS `Uncorrectable CPU-complex Error'' 'Processor Presence
    Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA 2025-08-27T09:23:13.954222| WARNING 0283 CCST: IPMI device Power_BP reports non-OK status 'At or Below (<=) Lower Non-Recoverable Threshold'
- In the StorageGRID logs, base-os-logs\var\local\gemini-system-status\status.json reports:
    "name": "CPU0",
    "type": "Processor",
    "reading": "N/A",
    "units": "N/A",
    "event": "'SM BIOS `Uncorrectable CPU-complex Error'' 'Processor Presence"
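
When checking whether a node matches this issue, it can help to pull the CCST IPMI warnings out of a copy of bycast.log. The following is a minimal sketch, not a NetApp tool: the function name `find_ccst_warnings` is hypothetical, and the regular expression assumes only the two message shapes visible in the excerpt above ("Unknown status for ipmi sensor" and "IPMI device ... reports non-OK status"). Other CCST message formats may exist and would not be matched.

```python
import re

# Assumption: these two message shapes are taken from the log excerpt in this
# article; other CCST warning formats are not covered.
CCST_RE = re.compile(
    r"WARNING\s+\d+\s+CCST:\s+"
    r"(?:Unknown status for ipmi sensor:\s+(?P<sensor>\S+)"
    r"|IPMI device (?P<device>\S+) reports non-OK status (?P<status>.*))"
)


def find_ccst_warnings(lines):
    """Return (name, detail) tuples for each CCST IPMI warning in `lines`."""
    hits = []
    for line in lines:
        match = CCST_RE.search(line)
        if match is None:
            continue
        if match.group("sensor"):
            # Sensor whose status the service could not interpret.
            hits.append((match.group("sensor"), "unknown status"))
        else:
            # Device reporting a non-OK status; keep the raw status text.
            hits.append((match.group("device"), match.group("status").strip()))
    return hits


# Sample lines copied from the bycast.log excerpt in this article.
sample = [
    "Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA "
    "2025-08-27T09:23:13.954179| WARNING 0076 CCST: Unknown status for "
    "ipmi sensor: PSU0_Status",
    "Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA "
    "2025-08-27T09:23:13.954203| WARNING 0283 CCST: IPMI device Watchdog "
    "reports non-OK status 'reserved' 'reserved'",
    "Aug 27 09:23:13 SGnode24 ADE: |21043885 1600958990 RSMR %CEA "
    "2025-08-27T09:23:13.954222| WARNING 0283 CCST: IPMI device Power_BP "
    "reports non-OK status 'At or Below (<=) Lower Non-Recoverable Threshold'",
]

for name, detail in find_ccst_warnings(sample):
    print(f"{name}: {detail}")
```

To run it against a real log, pass an open file object instead of `sample`, for example `find_ccst_warnings(open("bycast.log", encoding="utf-8", errors="replace"))`, where the path is whatever local copy of the log you have extracted.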