
Poor performance and high CPU utilization on a single node due to a degraded DIMM

Applies to

  • ONTAP 9
  • AFF A400

Issue

  • High CPU utilization causes poor performance on one node.
  • Write latency is high on the data aggregate. Example:

Time                Node             Severity      Event
------------------- ---------------- ------------- ---------------------------
7/24/2023 18:33:25  node_name        ERROR         wafl.cp.toolong: Aggregate aggr_name experienced a long CP.
7/24/2023 18:15:22  node_name        ERROR         wafl.cp.toolong: Aggregate aggr_name experienced a long CP.
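
To gauge how often long consistency points (CPs) occur, rows like the ones above can be tallied per node and aggregate. The following is a minimal Python sketch, assuming the event log has been saved as plain text in the column layout shown. The function name `count_long_cps` and the `sample` text are hypothetical and are not part of ONTAP.

```python
import re
from collections import Counter

# Matches rows such as:
# 7/24/2023 18:33:25  node_name  ERROR  wafl.cp.toolong: Aggregate aggr_name experienced a long CP.
# The pattern is an assumption based only on the two sample rows.
ROW = re.compile(
    r"^(?P<time>\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2})\s+"
    r"(?P<node>\S+)\s+(?P<severity>\S+)\s+"
    r"wafl\.cp\.toolong: Aggregate (?P<aggr>\S+) experienced a long CP\."
)

def count_long_cps(log_text):
    """Return a Counter keyed by (node, aggregate) for wafl.cp.toolong rows."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and separator rows do not match and are skipped
            counts[(match.group("node"), match.group("aggr"))] += 1
    return counts

# Hypothetical saved log text, copied from the example rows.
sample = """\
Time         Node        Severity    Event
------------------- ---------------- ------------- ---------------------------
7/24/2023 18:33:25  node_name     ERROR      wafl.cp.toolong: Aggregate aggr_name experienced a long CP.
7/24/2023 18:15:22  node_name     ERROR      wafl.cp.toolong: Aggregate aggr_name experienced a long CP.
"""

for (node, aggr), n in count_long_cps(sample).items():
    print(f"{node} {aggr}: {n} long CP event(s)")
```

A count that is concentrated on one node is consistent with the single-node symptom described in this article, although the count alone does not identify the cause.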

  • The node reboots after a panic and generates a core dump file. Example:

"process on cpu17 hung (telnet_0) for 5001 milliseconds! in SK process telnet_0 on release 9.10.1P12 (C"

  • Correctable errors are present on the DIMM module. Example:

Number of correctable ECC since boot 60362216: Information about Correctable ECC: ECC error at DIMM-xx: CE-03-2106-18AEE039,ADDR 0x5959b3100,(Node(1), Memory controller(0), CH(0), DIMM(0), Rank(0), Bank Group(2), Bank(0x0), Row(0x52ad), Col(0x2c0))
Correctable Machine Check Error at CPU17 McBank7. SKL_IMC0 Error: STATUS<0xcc10000001010090> (...)

Number of correctable ECC since boot 60427752: Information about Correctable ECC: ECC error at DIMM-xx: CE-03-2106-18AEE039,ADDR 0x8698e9d00,(Node(1), Memory controller(0), CH(0), DIMM(0), Rank(1), Bank Group(0), Bank(0x0), Row(0x7d3f), Col(0x70))
Correctable Machine Check Error at CPU13 McBank7. SKL_IMC0 Error: STATUS<0xcc10000001010090> (...)
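
The two messages above carry a running count of correctable ECC errors since boot, plus the physical location of each error. As a rough illustration, the Python sketch below extracts those fields and computes how much the count grew between the two messages. It assumes the log lines follow exactly the format of the samples. The names `ECC`, `parse_ecc`, and `sample` are hypothetical, and real log formats may differ.

```python
import re

# Pattern derived only from the two sample lines above (an assumption).
ECC = re.compile(
    r"Number of correctable ECC since boot (?P<count>\d+):.*?"
    r"ECC error at (?P<dimm>DIMM-[^:\s]+):.*?"
    r"Rank\((?P<rank>\d+)\), Bank Group\((?P<bank_group>\d+)\), "
    r"Bank\((?P<bank>0x[0-9a-fA-F]+)\), Row\((?P<row>0x[0-9a-fA-F]+)\), "
    r"Col\((?P<col>0x[0-9a-fA-F]+)\)"
)

def parse_ecc(log_text):
    """Return one dict per correctable-ECC line, with numeric fields as ints."""
    records = []
    for line in log_text.splitlines():
        match = ECC.search(line)
        if not match:
            continue  # e.g. the Machine Check lines, which this sketch ignores
        records.append({
            "count": int(match.group("count")),
            "dimm": match.group("dimm"),
            "rank": int(match.group("rank")),
            "bank_group": int(match.group("bank_group")),
            "bank": int(match.group("bank"), 16),
            "row": int(match.group("row"), 16),
            "col": int(match.group("col"), 16),
        })
    return records

# The two ECC lines from the example, copied verbatim.
sample = (
    "Number of correctable ECC since boot 60362216: Information about Correctable ECC: "
    "ECC error at DIMM-xx: CE-03-2106-18AEE039,ADDR 0x5959b3100,(Node(1), Memory controller(0), "
    "CH(0), DIMM(0), Rank(0), Bank Group(2), Bank(0x0), Row(0x52ad), Col(0x2c0))\n"
    "Number of correctable ECC since boot 60427752: Information about Correctable ECC: "
    "ECC error at DIMM-xx: CE-03-2106-18AEE039,ADDR 0x8698e9d00,(Node(1), Memory controller(0), "
    "CH(0), DIMM(0), Rank(1), Bank Group(0), Bank(0x0), Row(0x7d3f), Col(0x70))\n"
)

records = parse_ecc(sample)
growth = records[-1]["count"] - records[0]["count"]
print(f"{records[0]['dimm']}: count grew by {growth} between the two messages")
```

For the two sample lines, the count rises from 60362216 to 60427752, a difference of 65536. The differing rank, row, and column values show that the errors are not confined to a single memory cell.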

  • A memory error alert has been triggered for the DIMM. Example:

[node_name: mgwd: callhome.hm.alert.critical:debug]: Call home for Health Monitor process nphm: CriticalCECCCountMemErrAlert[DIMM-xx].
