IPMI unresponsive and nodeOffline recurring after periodic BMC cold resets
Applies to
- NetApp Element software 11.x, 12.0, and 12.2
- NetApp SolidFire SF-series product line
Issue
- nodeOffline alerts are raised shortly after a BMC reset. Faults that may appear in NetApp SolidFire Active IQ:
nodeOffline - The SolidFire Application cannot communicate with node ID {ID}.
sensorReadingFailed - IPMI diagnostics are currently unresponsive. Please contact support if this problem persists.
unresponsiveService - A master service is not responding.
- Events in Active IQ:
Beginning BMC cold reset and setting new reset date
Setting BMC cold reset date
- Entry from sf-master.info:
master-1[30228]: [Event] 30325 GlobalPool-0 serviceshared/EventReporter.cpp:582:ReportEvent|Successfully reported event={id=569216 type=PlatformHardwareEvent nodeID=6 serviceID=107 message=[Beginning BMC cold reset and setting new reset date] details={"bmcResetDate":"2021-09-02T12:49:41","bmcResetDurationMinutes":20160} reported=2021-08-19T12:49:41.644056Z published=2021-08-19T12:49:41.644104Z} mNumEventsPublished=21
- A core.HangDetect may be generated.
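
To correlate nodeOffline alerts with BMC cold resets, it can help to pull the reset schedule out of the sf-master.info entry shown above. The following is a minimal Python sketch, not a NetApp tool: the regular expression and the function name parse_bmc_reset are hypothetical and were written against that single sample line, so other Element versions may need a different pattern. It also assumes that bmcResetDurationMinutes is the interval between scheduled resets. The sample supports that reading, since the reported time plus 20160 minutes (14 days) equals bmcResetDate, but the source does not state it.

```python
import json
import re
from datetime import datetime, timedelta

# Event portion of the sf-master.info entry shown above.
SAMPLE = (
    'event={id=569216 type=PlatformHardwareEvent nodeID=6 serviceID=107 '
    'message=[Beginning BMC cold reset and setting new reset date] '
    'details={"bmcResetDate":"2021-09-02T12:49:41","bmcResetDurationMinutes":20160} '
    'reported=2021-08-19T12:49:41.644056Z published=2021-08-19T12:49:41.644104Z}'
)

# Hypothetical pattern, written for this one sample line only.
EVENT_RE = re.compile(
    r'nodeID=(?P<node>\d+).*?'
    r'details=(?P<details>\{.*?\})\s+'
    r'reported=(?P<reported>[^\s}]+)'
)


def parse_bmc_reset(line):
    """Return the BMC reset schedule found in a log line, or None if absent."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    details = json.loads(match.group("details"))
    return {
        "node_id": int(match.group("node")),
        # Time at which the event was reported (UTC, per the trailing "Z").
        "reported": datetime.strptime(
            match.group("reported"), "%Y-%m-%dT%H:%M:%S.%fZ"
        ),
        # Date of the next scheduled cold reset, as logged.
        "reset_date": datetime.strptime(
            details["bmcResetDate"], "%Y-%m-%dT%H:%M:%S"
        ),
        # Assumption: this field is the interval between scheduled resets.
        "interval": timedelta(minutes=details["bmcResetDurationMinutes"]),
    }


if __name__ == "__main__":
    info = parse_bmc_reset(SAMPLE)
    print(f"node {info['node_id']}: reset reported {info['reported']}, "
          f"next reset {info['reset_date']} (interval {info['interval']})")
```

Comparing the parsed reported time with the timestamps of nodeOffline, sensorReadingFailed, and unresponsiveService faults on the same node shows whether the alerts cluster around each cold reset.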