NFSv4 file access fails due to store pool Owner/OpenState exhaustion
Applies to
- ONTAP 9
- NFSv4.0
- NFSv4.1
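Issue
- NFS clients cannot open NFSv4 files; read/write operations hang, while delete operations succeed
- NFS client applications crash
- High CPU utilization may be observed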
- Commands such as "cd", "ls", and "touch" hang
- EMS.log reports a series of Nblade.nfsV4PoolThreshold errors leading up to store pool exhaustion (Nblade.nfsV4PoolExhaust:EMERGENCY):
[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for OpenState is nearing exhaustion (80% of pool currently in use).
[node-01: kernel: ems.engine.suppressed:debug]: Event 'Nblade.nfsV4PoolThreshold' suppressed 3477 times in last 204 seconds.
[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for OpenState is nearing exhaustion (90% of pool currently in use).
[node-01: kernel: ems.engine.suppressed:debug]: Event 'Nblade.nfsV4PoolThreshold' suppressed 99337 times in last 61 seconds
[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for OpenState is nearing exhaustion (99% of pool currently in use).
[node-01: kernel: ems.engine.suppressed:debug]: Event 'Nblade.nfsV4PoolThreshold' suppressed 139821 times in last 61 seconds.
[node-01: kernel: Nblade.nfsV4PoolExhaust:EMERGENCY]: NFS Store Pool for OpenState exhausted. Associated object type is CLUSTER_NODE with UUID: 69b7a0c0-8dea-11ed-bcfe-d000eaa1111d.
Note: Other store pool resources may also be affected:
[node-01: kernel: Nblade.nfsV4PoolExhaust:EMERGENCY]: NFS Store Pool for Open exhausted. Associated object type is CLUSTER_NODE with UUID: 69b7a0c0-8dea-11ed-bcfe-d000eaa1111d.
or
node01 EMERGENCY Nblade.nfsV4PoolExhaust: NFS Store Pool for Owner exhausted. Associated object type is CLUSTER_NODE with UUID: 39865935-7a8e-11ef-8ccf-d039eaa50000.
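When reviewing a large EMS.log export, it can help to summarize which store pools raised threshold or exhaustion events. The following is a minimal sketch, not a NetApp-provided tool: the function name `summarize_pool_events` and the sample text are illustrative assumptions, and the regular expressions only cover the bracketed message formats quoted above.

```python
import re

# Matches the bracketed threshold notices quoted in this article.
THRESHOLD_RE = re.compile(
    r"Nblade\.nfsV4PoolThreshold:notice\]: NFS Store Pool for (\w+) "
    r"is nearing exhaustion \((\d+)% of pool currently in use\)"
)
# Matches the bracketed exhaustion (EMERGENCY) messages quoted in this article.
EXHAUST_RE = re.compile(
    r"Nblade\.nfsV4PoolExhaust:EMERGENCY\]: NFS Store Pool for (\w+) exhausted"
)


def summarize_pool_events(text):
    """Return {pool: {"max_pct": int, "exhausted": bool}} for events in text."""
    summary = {}
    for m in THRESHOLD_RE.finditer(text):
        entry = summary.setdefault(m.group(1), {"max_pct": 0, "exhausted": False})
        entry["max_pct"] = max(entry["max_pct"], int(m.group(2)))
    for m in EXHAUST_RE.finditer(text):
        entry = summary.setdefault(m.group(1), {"max_pct": 0, "exhausted": False})
        entry["exhausted"] = True
    return summary


# Hypothetical input assembled from the messages shown above.
sample = (
    "[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for "
    "OpenState is nearing exhaustion (80% of pool currently in use).\n"
    "[node-01: kernel: Nblade.nfsV4PoolThreshold:notice]: NFS Store Pool for "
    "OpenState is nearing exhaustion (99% of pool currently in use).\n"
    "[node-01: kernel: Nblade.nfsV4PoolExhaust:EMERGENCY]: NFS Store Pool for "
    "OpenState exhausted. Associated object type is CLUSTER_NODE with UUID: "
    "69b7a0c0-8dea-11ed-bcfe-d000eaa1111d.\n"
    "[node-01: kernel: Nblade.nfsV4PoolExhaust:EMERGENCY]: NFS Store Pool for "
    "Open exhausted. Associated object type is CLUSTER_NODE with UUID: "
    "69b7a0c0-8dea-11ed-bcfe-d000eaa1111d.\n"
)

for pool, info in summarize_pool_events(sample).items():
    print(pool, info)
```

The patterns search the whole text rather than individual lines, so entries that were concatenated during log export are still counted.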
- The following nblade.nfs4SequenceInvalid alerts, pointing to the client IP, are reported:
Wed Dec 04 12:16:30 -0800 [node-n1: kernel: nblade.nfs4SequenceInvalid:notice]: NFS client (IP: 172.23.xxx.xvc) sent sequence# 21, but server expected sequence# 20. Server error: BAD_SEQID.
Wed Dec 04 12:19:48 -0800 [node-n1: kernel: nblade.nfs4SequenceInvalid:notice]: NFS client (IP: 172.23.xxx.xvc) sent sequence# 25, but server expected sequence# 24. Server error: BAD_SEQID.
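To see which clients generate the most BAD_SEQID alerts, the messages can be tallied per client IP. This is a rough sketch under the assumption that the log text follows the format of the two lines above; `count_sequence_errors` is a hypothetical helper name, not an ONTAP command.

```python
import re
from collections import Counter

# Captures the client IP field from nfs4SequenceInvalid notices as quoted above.
SEQ_RE = re.compile(
    r"nblade\.nfs4SequenceInvalid:notice\]: NFS client \(IP: ([^)]+)\) "
    r"sent sequence# (\d+), but server expected sequence# (\d+)"
)


def count_sequence_errors(text):
    """Count nfs4SequenceInvalid notices per client IP found in text."""
    return Counter(m.group(1) for m in SEQ_RE.finditer(text))


# The two example messages from this article (IP masked as in the source).
sample_seq = (
    "Wed Dec 04 12:16:30 -0800 [node-n1: kernel: nblade.nfs4SequenceInvalid:notice]: "
    "NFS client (IP: 172.23.xxx.xvc) sent sequence# 21, but server expected "
    "sequence# 20. Server error: BAD_SEQID.\n"
    "Wed Dec 04 12:19:48 -0800 [node-n1: kernel: nblade.nfs4SequenceInvalid:notice]: "
    "NFS client (IP: 172.23.xxx.xvc) sent sequence# 25, but server expected "
    "sequence# 24. Server error: BAD_SEQID.\n"
)

# List clients from most to fewest alerts.
for ip, count in count_sequence_errors(sample_seq).most_common():
    print(ip, count)
```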
- The top consumer IPs can be identified by checking the PerClientStorePoolThreshold event or by collecting additional data:
::> event log show -event PerClientStorePoolThreshold
- Identify the data interface (LIF) on which the top consumer is mounted:
::> network connections active show-clients -remote-address <Topconsumer>