
Extremely long client latency and/or hangs due to running out of long-term replay cache buckets

Category:
ontap-9
Specialty:
nfs
Applies to

  • ONTAP 8.3.x and later
  • NFS

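Issue

Based on ONTAP statistics, high latency can be observed in the latency breakdown section of OPM, Grafana, or perfstat/perfarchive. Depending on the tool used to monitor performance, most of the latency comes from "CPU_NETWORK" or "cluster_interconnect".

Multiple errors/warnings can be found in various logs: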

  1. CSM timeouts observed from the requesting blade (Nblade) in the perfstat "sysctl sysvar.csm" section. The same output can be collected manually from the systemshell by running "sysctl sysvar.csm".

Example output

SpinNPSessionInt::timeout): this=0xffffff80085e1028, sessionId=(req=cluster_n01:nblade, rsp=cluster_n02:dblade, uniquifier=00053816b2747090): In last 3974071360 ms, 104 of 2168524218 Ops timed out, 2171533701 started, 0 Ops timed out unsent. 4289664640/0/0 Ops await replies, 0 segs sent, 0 await ACKs
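
When many such lines need reviewing, the timeout counts can be pulled out programmatically. The following is a minimal sketch, assuming the line layout matches the example above; the helper name `parse_csm_timeout` and the dictionary keys are hypothetical and not part of ONTAP.

```python
import re

# Hypothetical pattern, derived only from the example line in this article.
TIMEOUT_RE = re.compile(
    r"req=(?P<req>[^,]+), rsp=(?P<rsp>[^,]+),.*?"
    r"In last (?P<ms>\d+) ms, (?P<timed_out>\d+) of (?P<total>\d+) Ops timed out"
)

def parse_csm_timeout(line):
    """Extract requester, responder, and timeout counts from one CSM timeout line."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None  # line does not look like a SpinNP session timeout record
    return {
        "req": m.group("req"),
        "rsp": m.group("rsp"),
        "window_ms": int(m.group("ms")),
        "timed_out": int(m.group("timed_out")),
        "total_ops": int(m.group("total")),
    }

sample = (
    "SpinNPSessionInt::timeout): this=0xffffff80085e1028, "
    "sessionId=(req=cluster_n01:nblade, rsp=cluster_n02:dblade, "
    "uniquifier=00053816b2747090): In last 3974071360 ms, "
    "104 of 2168524218 Ops timed out, 2171533701 started, "
    "0 Ops timed out unsent. 4289664640/0/0 Ops await replies, "
    "0 segs sent, 0 await ACKs"
)
info = parse_csm_timeout(sample)
print(info["req"], "->", info["rsp"], "timed out:", info["timed_out"])
```

A non-zero `timed_out` value that keeps growing between collections for the same requester/responder pair is what this symptom describes.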

  2. CSM flow control on the receiver node (Dblade) in the perfstat "sysctl sysvar.csm" section. The same output can be collected manually from the systemshell by running "sysctl sysvar.csm".

Example output

SpinNPSessionInt::processSessionFlowcontrolQueue): sess = 0xffffff8007bdf028, sessionId = (req=c55f68b8-7cc0-11e4-84e6-098b9834504d, rsp=cluster_n02:dblade, uniquifier=00053816b2747090), iface = 1, delivered REQUEST pkt = 0xffffff05931fa271 to flow control list

  3. Nblade.nfsConnResetAndClose: "Maximum number of rewind attempts has been exceeded" can be found in the EMS log.

Example output

Nblade.nfsConnResetAndClose: Shutting down connection with the client. Vserver ID is xx; network data protocol is NFS; client IP address:port is xx.xx.xx.xx:xxx. local IP address is xx.xx.xx.xx; reason is CSM error - Maximum number of rewind attempts has been exceeded.

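  4. High SpinNP latency outliers observed in the perfstat 'stats spinnp' section. Check and confirm that the counts increment between iterations. The same output can be collected manually by running "statistics show -object spinnp -raw" from the clustershell (diag mode).

Example output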

spinnp:spinnp:latency_hist.<1s:2577819
spinnp:spinnp:latency_hist.<2s:7878237
spinnp:spinnp:latency_hist.<4s:6262884
spinnp:spinnp:latency_hist.<6s:1629240
spinnp:spinnp:latency_hist.<8s:307280
spinnp:spinnp:latency_hist.<10s:85273
spinnp:spinnp:latency_hist.<20s:145299
spinnp:spinnp:latency_hist.<30s:51447
spinnp:spinnp:latency_hist.<60s:30
spinnp:spinnp:latency_hist.<90s:10
spinnp:spinnp:latency_hist.<120s:6
spinnp:spinnp:latency_hist.>120s:50
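
To check whether the slow buckets grow between iterations, the histogram can be parsed and two collections compared. This is a minimal sketch, assuming each line has the `object:instance:latency_hist.<bucket>:<count>` layout shown above; `parse_latency_hist` and `growing_buckets` are hypothetical helper names.

```python
SAMPLE = """\
spinnp:spinnp:latency_hist.<1s:2577819
spinnp:spinnp:latency_hist.<2s:7878237
spinnp:spinnp:latency_hist.<4s:6262884
spinnp:spinnp:latency_hist.<6s:1629240
spinnp:spinnp:latency_hist.<8s:307280
spinnp:spinnp:latency_hist.<10s:85273
spinnp:spinnp:latency_hist.<20s:145299
spinnp:spinnp:latency_hist.<30s:51447
spinnp:spinnp:latency_hist.<60s:30
spinnp:spinnp:latency_hist.<90s:10
spinnp:spinnp:latency_hist.<120s:6
spinnp:spinnp:latency_hist.>120s:50
"""

def parse_latency_hist(text):
    """Return {bucket_label: count} for every latency_hist line in text."""
    hist = {}
    for line in text.splitlines():
        line = line.strip()
        if "latency_hist." not in line:
            continue
        # The count is everything after the last colon.
        name, _, count = line.rpartition(":")
        label = name.split("latency_hist.", 1)[1]
        hist[label] = int(count)
    return hist

def growing_buckets(prev, curr):
    """Return {bucket_label: increase} for buckets that grew from prev to curr."""
    return {
        label: curr[label] - prev.get(label, 0)
        for label in curr
        if curr[label] > prev.get(label, 0)
    }

hist = parse_latency_hist(SAMPLE)
print(hist[">120s"])
```

Calling `growing_buckets` on the parsed histograms from two consecutive perfstat iterations lists the buckets whose counts increased; growth in the multi-second buckets matches the symptom described in this item.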

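  5. SpinHi statistics indicate that almost all SpinHi requests are in the defer queue. They can be found in the perfstat 'spinhi_stats' section, or collected manually from the nodeshell by running "spinhi_stats" (diag mode).

Example output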

(spinhi_stats) size=39502 total_req=421874001827 cur_req=25780 max_req=26702 total_resp=421873962781 total_replay_resp=289138 defer_req=55765 cur_defer=25780 max_defer=25780 hipri=15603269 unmarshal_errs=0 marshal_errs=0 fastpath_null_resps=0 cur_nogrow_filecb_bulk=0, cur_nogrow_filecb_op=0 redo=131995, max_nogrow_filecb_bulk=0 max_nogrow_filecb_fileop=0 Access: count=44862084546 hipri=0 errs=77411717 elapsed: max=14087030.76 avg=280.45

cur_req: current number of requests in SpinHi
cur_defer: current number of requests in the SpinHi defer queue
If cur_defer == cur_req, all current requests in SpinHi are in the defer queue.
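
The cur_defer versus cur_req comparison can be automated. The following is a minimal sketch, assuming the `name=value` layout of the spinhi_stats line shown above; `spinhi_counter` and `deferred_fraction` are hypothetical helpers, and the sample string is an excerpt of that line.

```python
import re

def spinhi_counter(line, name):
    """Return the integer value of the first 'name=<digits>' field, or None."""
    m = re.search(r"\b" + re.escape(name) + r"=(\d+)", line)
    return int(m.group(1)) if m else None

def deferred_fraction(line):
    """Fraction of current SpinHi requests sitting in the defer queue."""
    cur_req = spinhi_counter(line, "cur_req")
    cur_defer = spinhi_counter(line, "cur_defer")
    if not cur_req or cur_defer is None:
        return None  # field missing, or no requests currently in SpinHi
    return cur_defer / cur_req

# Excerpt of the example output above.
sample = (
    "(spinhi_stats) size=39502 total_req=421874001827 cur_req=25780 "
    "max_req=26702 total_resp=421873962781 total_replay_resp=289138 "
    "defer_req=55765 cur_defer=25780 max_defer=25780 hipri=15603269"
)
print(deferred_fraction(sample))
```

A value at or near 1.0 means that nearly every in-flight SpinHi request is deferred, which is the condition this item describes.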
The counter "spinnp_replay_max_long_term_hit" increments across iterations in the perfstat section 'stats spinnp_replay_cache', for example:
spinnp_replay_cache:spinnp_replay_cache:spinnp_replay_max_long_term_hit:20467472
spinnp_replay_max_long_term_hit: total number of times the maximum long-term limit was hit
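
Whether this counter is still incrementing can be confirmed by reading its value from each iteration and comparing. This is a minimal sketch, assuming the four-field `object:instance:counter:value` layout of the example line; `counter_value` and `is_incrementing` are hypothetical helper names.

```python
def counter_value(text, counter):
    """Return the value from an 'object:instance:counter:value' line, or None."""
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) == 4 and parts[2] == counter and parts[3].isdigit():
            return int(parts[3])
    return None

def is_incrementing(values):
    """True when at least two values are given and each is larger than the last."""
    return len(values) >= 2 and all(b > a for a, b in zip(values, values[1:]))

sample = (
    "spinnp_replay_cache:spinnp_replay_cache:"
    "spinnp_replay_max_long_term_hit:20467472"
)
value = counter_value(sample, "spinnp_replay_max_long_term_hit")
print(value)
```

Collecting `counter_value` once per perfstat iteration and passing the resulting list to `is_incrementing` shows whether the long-term limit is still being hit.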

 
