Mellanox switch hangs due to too many open files
Applies to
- Mellanox SN2010
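- Onyx version 3.9.3220
Issue
- The switch hangs and stays unresponsive until it is rebooted.
- Logging in to the command-line interface succeeds, but no prompt is returned, so commands cannot be run.
- Logging in to the WebUI succeeds, but the switch cannot be managed.
- Example system dump logs:
SNMP repeatedly tries to send traps and fails: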
Line 68934: Jul 16 16:13:28 DC-ENCOA-FL5-SN2010-21 snmpd[5488]: [snmpd.ERR]: snmpd: send_trap: Failure in sendto (No route to host)
Line 68935: Jul 16 16:13:28 DC-ENCOA-FL5-SN2010-21 snmpd[5488]: message repeated 8 times: [ [snmpd.ERR]: snmpd: send_trap: Failure in sendto (No route to host)]
Line 68936: Jul 16 16:13:28 DC-ENCOA-FL5-SN2010-21 snmpd[5488]: [snmpd.ERR]: snmpd: send_trap: Failure in sendto (No route to host)
Line 68937: Jul 16 16:13:28 DC-ENCOA-FL5-SN2010-21 snmpd[5488]: message repeated 8 times: [ [snmpd.ERR]: snmpd: send_trap: Failure in sendto (No route to host)]
Line 68938: Jul 16 16:13:28 DC-ENCOA-FL5-SN2010-21 snmpd[5488]: [snmpd.ERR]: snmpd: send_trap: Failure in sendto (No route to host)
As a result, the following log entry indicates a large number of open files:
Line 68918: Jul 16 16:13:23 DC-ENCOA-FL5-SN2010-21 mgmtd[6612]: [mgmtd.ERR]: lc_launch_pre_fork(), proc_utils.c:726, build 1: Too many open files: Making temp file with base name /vtmp/proc-output
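To check whether a system dump log shows the same pattern, the two signatures above can be counted with a short script. This is a minimal sketch and not a vendor tool: the function name count_signatures and the signature labels are hypothetical, and the patterns are taken only from the log excerpts shown in this article.

```python
import re
from collections import Counter

# Patterns copied from the log excerpts in this article (assumption:
# other Onyx versions may word these messages differently).
SIGNATURES = {
    "snmp_trap_failure": re.compile(r"snmpd\[\d+\]: .*send_trap: Failure in sendto"),
    "too_many_open_files": re.compile(r"Too many open files"),
}

# syslog collapses duplicates into "message repeated N times: [ ... ]".
REPEAT = re.compile(r"message repeated (\d+) times")


def count_signatures(lines):
    """Count how often each signature appears in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                repeat = REPEAT.search(line)
                # A collapsed line counts as N occurrences, a plain line as 1.
                counts[name] += int(repeat.group(1)) if repeat else 1
    return counts
```

For example, count_signatures(open(path, errors="replace")) can be run against the messages file extracted from a system dump, where path is wherever that file was saved. A high snmp_trap_failure count together with any too_many_open_files hit matches the symptoms described here.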