NVDIMMBadHealthAlert Reason: VPP Lost on systems with NVDIMM
Applies to
- ONTAP 9
- AFF A800、AFF A400、AFF A320
- FAS8700、FAS9300
- NVDIMM, Health Reason: VPP Lost
 
Issue
EMS log:
Tue Apr 11 13:53:17 -0400 [NetApp1: nphmd: hm.alert.raised:alert]: Alert Id = NVDIMMBadHealthAlert , Alerting Resource = /dev/nvdimm0:NetApp1 raised by monitor controller
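When scanning a large EMS log for this alert, a small parser can help. The following is a minimal sketch only: the regular expression is inferred from the single sample line above, and the names `EMS_ALERT_RE` and `parse_alert_line` are hypothetical helpers, not part of any NetApp tool. The line layout may differ between ONTAP releases.

```python
import re

# Layout inferred from the one sample EMS line in this article (an assumption).
EMS_ALERT_RE = re.compile(
    r"\[(?P<node>[^:\]]+):\s*(?P<process>[^:]+):\s*"
    r"(?P<event>hm\.alert\.raised):(?P<severity>\w+)\]:\s*"
    r"Alert Id = (?P<alert_id>\S+)\s*,\s*"
    r"Alerting Resource = (?P<resource>\S+)"
)


def parse_alert_line(line):
    """Return the fields of an hm.alert.raised line as a dict, or None if no match."""
    match = EMS_ALERT_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "Tue Apr 11 13:53:17 -0400 [NetApp1: nphmd: hm.alert.raised:alert]: "
    "Alert Id = NVDIMMBadHealthAlert , "
    "Alerting Resource = /dev/nvdimm0:NetApp1 raised by monitor controller"
)

alert = parse_alert_line(sample)
if alert and alert["alert_id"] == "NVDIMMBadHealthAlert":
    print(alert["node"], alert["resource"])
```

Lines that do not match the assumed layout return `None`, so the helper can be applied to every line of a log file without extra filtering.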
In the AutoSupport logs, the following is seen:
- PLATFORM-SENSORS.XML
---------------------------------------------------------------
 Sensor Name     Sensor Type      Sensor State
 --------------------------------------------------------------
 NVDIMM0 VPP      discrete     fault
 NVDIMM0 Health  discrete     fault
  
- NVDIMM status
Total NVDIMM on this platform is 1
 --------------------------------------------------
 DIMM(/dev/nvdimm0) Page:0
 --------------------------------------------------
 0x0000: 00 0a 0a 01 01 00 21 3c   25 3c 25 34 34 00 00 00
 0x0010: 1f 2e 00 00 03 15 03 0b   68 81 02 00 68 81 1e 80
 0x0020: 05 80 78 80 05 00 00 40   1f ac 0d d0 07 a0 0f 90
 0x0030: 33 08 08 dc 05 10 80 00   00 00 70 03 00 00 00 00
 0x0040: 00 00 00 00 00 00 00 00   00 00 00 00 00 00 00 00
 0x0050: 00 00 00 00 00 00 00 00   00 00 00 00 00 00 00 00
 -----------------------------------------------------
 --------------------------------------------------
 DIMM(/dev/nvdimm0):
 --------------------------------------------------
   Controller Ready: Yes
   Controller Busy: No
   Energy Policy managed by: HOST
   Save_N Low During CSAVE: Yes
   Save_N Enabled(ARMED): Yes
   Data on the Flash: NotValid
 
   Module is Health: No
   Module Status(0x0004): VPP Lost 
   Flash Lifetime: 96%
   Flash Lifetime Status: Normal
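The status block above can be checked programmatically. The sketch below is a hedged illustration that assumes the `Key: Value` layout shown in this article; `parse_nvdimm_status` and the derived key `Module Status Code` are hypothetical names introduced here. It records only what the output states and does not attempt to decode status bits other than the value reported.

```python
import re

# Matches keys such as "Module Status(0x0004)" (layout assumed from the sample).
STATUS_KEY_RE = re.compile(r"^(?P<name>.+?)\((?P<code>0x[0-9a-fA-F]+)\)$")


def parse_nvdimm_status(text):
    """Collect 'Key: Value' pairs from the NVDIMM status block into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, separator rules, and lines without a key/value pair.
        if not line or line.startswith("-") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not value:
            continue  # e.g. the "DIMM(/dev/nvdimm0):" heading line
        match = STATUS_KEY_RE.match(key)
        if match:
            # Split "Module Status(0x0004)" into a name and a numeric code.
            fields[match.group("name").strip()] = value
            fields[match.group("name").strip() + " Code"] = int(match.group("code"), 16)
        else:
            fields[key] = value
    return fields


sample = """
 DIMM(/dev/nvdimm0):
 --------------------------------------------------
   Controller Ready: Yes
   Data on the Flash: NotValid

   Module is Health: No
   Module Status(0x0004): VPP Lost
   Flash Lifetime: 96%
"""

status = parse_nvdimm_status(sample)
if status.get("Module is Health") == "No":
    print("NVDIMM unhealthy:", status.get("Module Status"))
```

The hex dump lines (`0x0000: ...`) are not meant to be fed to this helper; pass only the status block that follows them.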
Health alert:
::>system health alert show
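| Node | Node01 | 
|---|---|
| Monitor | controller | 
| Alert ID | NVDIMMBadHealthAlert | 
| Alerting Resource | /dev/nvdimm0:NetApp1 | 
| Subsystem | Motherboard | 
| Indication Time | Tue Apr 11 13:53:17 2023 | 
| Perceived Severity | Major | 
| Probable Cause | Hardware degraded | 
| Description | NVDIMM "NVDIMM N 0 (DIMM - 11)" on node "NetApp1" indicates a degraded state. | 
| Corrective Actions | Contact technical support for assistance with replacing the NVDIMM module. | 
| Possible Effect | A degraded NVDIMM might result in data loss. | 
| Acknowledge | false | 
| Suppress | false | 
| Policy | NVDIMMBadHealthPolicy | 
| Additional Information | Node: NetApp1 | 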
- Rebooting the node does not fix the issue.