SnapCenter jobs fail to trigger and long-running jobs terminate
Applies to
SnapCenter Server 4.1.1x and later
Issue
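- Scheduled default, backup, or clone jobs fail to trigger during a 20-minute window.
- Default, backup, or clone jobs that run on a schedule might terminate prematurely but still show as running in the WebUI.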
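- When starting a job, the job initiator also uses the SMCore service to launch it through the SnapCenter web application. As a result, backups cannot start between the two IIS events, when IIS no longer processes incoming requests.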
- Error messages:
DEBUG SMCore_28089 PID=[7152] TID=[49] RemoteUrl: https://<SNAPCENTER_SERVER>:8146/JobStatusService.svc/UpdateJobStatus
ERROR SMCore_28089 PID=[7152] TID=[49] WebException in method: Invoke.
System.Net.WebException: The remote server returned an error: (500) Internal Server Error.
- The System event log shows an informational IIS event with ID 5074 (SnapCenter) or 5186 (DefaultAppPool):
"A worker process with process id of '<PID>' serving application pool 'SnapCenter' has requested
a recycle because the worker process reached its allowed processing time limit."
- After 30 minutes (the default), the System event log might show a warning event with ID 5013:
"A process serving application pool 'SnapCenter' exceeded time limits during shut down. The process id was '<PID>'."
- Even though SMCore has completed the operation (backup, clone, and so on), SnapCenter might also report the following job error:
We noticed an IIS application pool recycle event when this job was in progress. The final state of this job is unkown.
Consider restarting the job if required.
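
To check whether a log shows the SMCore error pattern quoted above (an UpdateJobStatus call followed by a WebException with HTTP 500 on the same thread), a small script can help. The following is a minimal sketch, not a NetApp tool: the function name find_failed_status_updates is hypothetical, and the line layout (LEVEL, source, PID=[n], TID=[n], message) is assumed from the excerpt in this article only. Real SMCore logs may carry additional fields such as timestamps, in which case the pattern would need adjusting.

```python
import re

# Assumed line layout, taken from the excerpt in this article:
#   LEVEL SOURCE PID=[n] TID=[n] message
LINE_RE = re.compile(
    r"^(?P<level>DEBUG|ERROR)\s+(?P<source>\S+)\s+"
    r"PID=\[(?P<pid>\d+)\]\s+TID=\[(?P<tid>\d+)\]\s+(?P<msg>.*)$"
)


def find_failed_status_updates(lines):
    """Return (pid, tid, url) for each UpdateJobStatus call answered with HTTP 500."""
    pending = {}    # (pid, tid) -> URL of the last UpdateJobStatus call on that thread
    failing = None  # thread whose WebException details are currently being read
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            key = (int(m["pid"]), int(m["tid"]))
            msg = m["msg"]
            failing = None
            if (m["level"] == "DEBUG" and msg.startswith("RemoteUrl:")
                    and "UpdateJobStatus" in msg):
                pending[key] = msg.split("RemoteUrl:", 1)[1].strip()
            elif m["level"] == "ERROR" and "WebException" in msg and key in pending:
                failing = key
        elif failing is not None and "(500) Internal Server Error" in line:
            # Continuation line of the exception: confirms the 500 response.
            hits.append((*failing, pending.pop(failing)))
            failing = None
    return hits


if __name__ == "__main__":
    sample = [
        "DEBUG SMCore_28089 PID=[7152] TID=[49] RemoteUrl: "
        "https://<SNAPCENTER_SERVER>:8146/JobStatusService.svc/UpdateJobStatus",
        "ERROR SMCore_28089 PID=[7152] TID=[49] WebException in method: Invoke.",
        "System.Net.WebException: The remote server returned an error: "
        "(500) Internal Server Error.",
    ]
    for pid, tid, url in find_failed_status_updates(sample):
        print(f"PID {pid} TID {tid}: status update to {url} failed with HTTP 500")
```

Matches that fall close in time to the 5074, 5186, or 5013 events in the System event log would be consistent with the application pool recycle described in this article, although the script itself does not read the event log.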