Environment sensor output fails with an operation timeout
Applies to
- ONTAP 9
- AFF A800
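- Service Processor (SP)
- Baseboard Management Controller (BMC)
Issue
- The following error is reported when attempting to retrieve environment sensor output from a node: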
::> system environment sensors show -node Node-01
Node Sensor State Value/Units Crit-Low Warn-Low Warn-Hi Crit-Hi
---- --------------------- ------ ----------- -------- -------- ------- -------
Node-01
PSU2 FRU normal
GOOD
PSU1 FRU normal
GOOD
NVDIMM0 Health normal
GOOD
NVDIMM1 Health normal
GOOD
SP Status normal
IPMI_HB_OK
Warning: Unable to list entries on node Node-01. Timeout: Operation "bmc_client_bmc_get_sensor_info_iterator::modify_imp()" took longer than 25 seconds to complete [from mgwd on node "Node-01" (VSID: -1) to servprocd at 127.0.0.1]
Error: show failed: Timeout: Operation "bmc_client_bmc_get_sensor_info_iterator::modify_imp()" took longer than 25 seconds to complete [from mgwd on node "Node-01" (VSID: -1) to servprocd at 127.0.0.1]
5 entries were displayed.
- The system node power show command also fails with a timeout error:
::> system node power show
Node Status
---------------- -----------
Error: Timeout: Operation "bmc_client_system_power_status_iterator::modify_imp()" took longer than 25 seconds to complete [from mgwd on node "Node-01" (VSID: -1) to servprocd at 127.0.0.1]
Node-01 -
Node-02 -
2 entries were displayed.
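When the same timeout shows up across several commands or nodes, it can help to extract the failing operation, the timeout value, and the node from captured CLI output. The following is a minimal sketch in Python, assuming the message layout matches the two errors shown above. The pattern is inferred only from those messages, not from a documented ONTAP format, and the name `parse_timeouts` is hypothetical.

```python
import re

# Pattern inferred from the timeout messages in this article (an assumption,
# not an official ONTAP message specification).
TIMEOUT_RE = re.compile(
    r'Timeout: Operation "(?P<operation>[^"]+)" took longer than '
    r'(?P<seconds>\d+) seconds to complete '
    r'\[from (?P<source>\S+) on node "(?P<node>[^"]+)" \(VSID: (?P<vsid>-?\d+)\) '
    r'to (?P<target>\S+) at (?P<address>[^\]]+)\]'
)


def parse_timeouts(cli_output):
    """Return one dict per timeout message found in captured CLI output."""
    results = []
    for match in TIMEOUT_RE.finditer(cli_output):
        entry = match.groupdict()
        entry["seconds"] = int(entry["seconds"])  # numeric for comparisons
        results.append(entry)
    return results


# Sample text copied from the "system node power show" error above.
sample = (
    'Error: Timeout: Operation "bmc_client_system_power_status_iterator::modify_imp()" '
    'took longer than 25 seconds to complete [from mgwd on node "Node-01" '
    '(VSID: -1) to servprocd at 127.0.0.1]'
)

for entry in parse_timeouts(sample):
    print(entry["node"], entry["operation"], entry["seconds"], entry["target"])
```

In both errors above the request goes from mgwd to servprocd on the local node, so the extracted `source` and `target` fields are the ones to compare when checking whether other commands fail along the same path.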
- The following events are seen in the SP-MGMT-MLOG-TXT.GZ section from the affected node:
[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: NOTICE: Servprocd::CLI: get_spcs_port : spcs port value in sp_api_service mdb is 50000
[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: ERR: Servprocd::CLI: sp_get_sensor_info_worker : TTX ERROR: 2 (THRIFT_POLL (timed out))
[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: ERR: Servprocd::CLI: sp_get_sensor_info_worker : EXCEPTION Occured in closing the spcc transport
[kern_servprocd:info:55561] Thrift: Fri Jul 4 00:14:51 2025 SSL_shutdown: shutdown while in init (SSL_error_code = 1)
[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: ERR: Servprocd::bmc_client: bmc_get_sensor_info_imp: target_node=Node-01, err=The operation timed out.
- The SP API service internal certificates were renewed according to this KB: Active IQ Wellness Risk: ONTAP reports errors when communicating with SP/BMC
- Rebooting the BMC and restarting the service processor daemon made no difference.
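To check whether a collected log section contains the same signature, the servprocd ERR entries that mention a timeout can be filtered out of the surrounding lines. This is a rough sketch in Python, assuming the line layout seen in the excerpt earlier in this article (`[facility:level:pid] ... ERR: Component: function : message`). The layout is inferred from those few lines only, and `find_timeout_errors` is a hypothetical helper name.

```python
import re

# Layout assumed from the servprocd lines quoted in this article.
ERR_RE = re.compile(
    r'^\[(?P<facility>[^:\]]+):(?P<level>[^:\]]+):(?P<pid>\d+)\]\s+'
    r'.*?\bERR:\s+(?P<component>\S+):\s*(?P<function>\w+)\s*:\s*(?P<message>.*)$'
)


def find_timeout_errors(lines):
    """Return (component, function, message) for ERR lines that mention a timeout."""
    hits = []
    for line in lines:
        match = ERR_RE.match(line.strip())
        if match and "timed out" in match.group("message"):
            hits.append(
                (match.group("component"), match.group("function"), match.group("message"))
            )
    return hits


# Sample lines copied from the log excerpt in this article.
log_lines = [
    '[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: NOTICE: Servprocd::CLI: '
    'get_spcs_port : spcs port value in sp_api_service mdb is 50000',
    '[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: ERR: Servprocd::CLI: '
    'sp_get_sensor_info_worker : TTX ERROR: 2 (THRIFT_POLL (timed out))',
    '[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: ERR: Servprocd::CLI: '
    'sp_get_sensor_info_worker : EXCEPTION Occured in closing the spcc transport',
    '[kern_servprocd:info:55561] 0x809824c00: 8503e800016b9888: ERR: Servprocd::bmc_client: '
    'bmc_get_sensor_info_imp: target_node=Node-01, err=The operation timed out.',
]

for component, function, message in find_timeout_errors(log_lines):
    print(component, function, message)
```

The filter deliberately skips the NOTICE line and the transport-close exception, so only the entries that record the timeout itself are reported.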