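SLDIAG Random Address Mem Test returns ERROR status for a DIMM other than the one that was replaced
Applies to
Issue
- During a DIMM replacement procedure, diagnostics are run on the memory subsystem: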
*> sldiag device run -dev mem
- When the SLDIAG test finishes, the "Random Address Mem Test" may show an ERROR status:
*> sldiag device status
To see details use: sldiag device status -long
Device TestName StartDate EndDate Status LOOP
---------- ---------------- ------------------- ------------------- ---------- --------
mem Memory Details Mon Jan 2 11:42:03 Mon Jan 2 11:42:03 Completed 1/1
mem Random Address Mem Test
Mon Jan 2 11:42:03 Mon Jan 2 11:48:46 ERROR 1/1
mem Random Data Mem Test
Mon Jan 2 11:48:46 N/A Running 0/1
There are still test(s) being processed.
- Further investigation shows correctable errors (CECC) on a DIMM other than the one that was replaced:
*> sldiag device status -dev mem -long -state failed
TEST START ------------------------------------------
DEVTYPE: mem
NAME: Random Address Mem Test
START DATE: Mon Jan 2 11:42:03 GMT 2023
STATUS: Completed
Running Random Address Mem Test
MAX memory in D-blade = 8080000000
RANGE:0 in D-blade = 0x0 to 0x7e000
Initial CECC count: 0
Checking for diag owned mem
RANGE:1 in D-blade = 0x80000 to 0x87000
RANGE:2 in D-blade = 0x100000 to 0x6111e000
RANGE:3 in D-blade = 0x6915f000 to 0x6a782000
RANGE:4 in D-blade = 0x6b96e000 to 0x7519c000
RANGE:5 in D-blade = 0x751a2000 to 0x7999f000
RANGE:6 in D-blade = 0x7b7ff000 to 0x7b800000
RANGE:7 in D-blade = 0x100000000 to 0x8080000000
Testing diag owned range 0x1d1d6aa300 to 0x6971fca300
2 CECC Reported on DIMM :7
Memory errors:CECC count: 2
END DATE: Mon Jan 2 11:48:46 GMT 2023
LOOP: 1/1
TEST END --------------------------------------------
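
When a captured copy of the long-status output is available as text, the per-DIMM CECC counts can be tallied and compared with the slot that was replaced. The following is a minimal sketch, not a NetApp tool: it assumes the line format seen in the sample above ("2 CECC Reported on DIMM :7"), and the helper names `cecc_by_dimm` and `unexpected_dimms` are hypothetical.

```python
import re

# Assumed line format, taken from the sample output above:
#   "2 CECC Reported on DIMM :7"
CECC_LINE = re.compile(r"^\s*(\d+)\s+CECC Reported on DIMM\s*:\s*(\S+)")


def cecc_by_dimm(log_text):
    """Return {dimm_id: total CECC count} found in captured sldiag text."""
    counts = {}
    for line in log_text.splitlines():
        match = CECC_LINE.match(line)
        if match:
            dimm = match.group(2)
            counts[dimm] = counts.get(dimm, 0) + int(match.group(1))
    return counts


def unexpected_dimms(log_text, replaced_dimm):
    """List DIMMs that report CECCs but are not the one just replaced."""
    replaced = str(replaced_dimm)
    return sorted(d for d in cecc_by_dimm(log_text) if d != replaced)
```

For the output shown above, `cecc_by_dimm` would report DIMM 7 with a count of 2, and `unexpected_dimms` would list DIMM 7 whenever the replaced slot was any other number.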