Active IQ Unified Manager web interface is unresponsive with "Too many open files" errors due to a file descriptor leak
Applies to
Active IQ Unified Manager (AIQUM) 9.9 or earlier
Issue
- AIQUM WebUI is unresponsive
- Triggered API calls fail with Error 500 - Internal Server Error
- Grafana cannot create AIQUM reports
- server.log shows Too many open files errors:
ERROR [org.jboss.as.server.deployment.scanner] (DeploymentScanner-threads - 1) WFLYDS0012: Scan of /opt/netapp/essentials/jboss/standalone/deployments threw Exception: java.lang.RuntimeException: WFLYDS0032: Failed to list files in directory /opt/netapp/essentials/jboss/standalone/deployments. Check that the contents of the directory are readable.
...
Caused by: java.nio.file.FileSystemException: /opt/netapp/essentials/jboss/standalone/deployments: Too many open files
Exception handling request to /apis/XMLrequest: java.lang.RuntimeException: java.io.IOException: Cannot run program "/opt/netapp/essentials/jboss/bin/native/lib64/authenticate": error=24, Too many open files
- ocumserver.log shows Got IO exception while processing access_log, with Too many open files reported for the Access Log Task:
INFO [oncommand] [Access Log Task] [com.netapp.ipc.util.AccessLogTask] <YEAR>-<MONTH>-<DATE> is older than 30 days
ERROR [oncommand] [Access Log Task] [com.netapp.ipc.util.AccessLogTask] Got IO exception while processing access_log
java.nio.file.FileSystemException: /var/log/ocie/<YEAR>-<MONTH>-<DATE>: Too many open files
- The command
lsof -p `cat /var/run/ocie.pid` | awk '{print $9}' | sort | grep "/var/log/ocie/20" | uniq -c
shows a large number of open file descriptors for /var/log/ocie/<YEAR>-<MONTH>-<DATE> directories older than 30 days:
174 /var/log/ocie/2019-08-26
172 /var/log/ocie/2019-08-27
:
34 /var/log/ocie/2021-07-13
98 /var/log/ocie/2021-07-14
98 /var/log/ocie/2021-07-15
40 /var/log/ocie/2021-07-16
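
To make the stale directories easier to spot in long output like the above, the counts can be filtered by date. The following is a minimal sketch, not part of the product or of the original procedure: the function name stale_fd_dirs and its cutoff argument are hypothetical, and it assumes the directory names are ISO dates (YYYY-MM-DD) as shown in the sample output, so that plain string comparison orders them correctly.

```shell
# Hypothetical helper (not shipped with AIQUM).
# Reads "COUNT PATH" lines on stdin, as produced by `uniq -c`, and prints
# only those whose last path component sorts before the cutoff date.
# Usage: stale_fd_dirs YYYY-MM-DD
stale_fd_dirs() {
    awk -v cutoff="$1" '{
        n = split($2, parts, "/")          # last component is the date directory
        # Appending "" forces a string comparison; ISO dates sort lexically.
        if ((parts[n] "") < (cutoff "")) print $1, $2
    }'
}

# Example, assuming GNU date is available to compute the 30-day cutoff:
#   lsof -p `cat /var/run/ocie.pid` | awk '{print $9}' | sort \
#     | grep "/var/log/ocie/20" | uniq -c \
#     | stale_fd_dirs "$(date -d '30 days ago' +%F)"
```

Any line this prints is a directory older than the cutoff that the process still holds descriptors for, which matches the symptom described above.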