SnapCenter jobs fail to trigger and long-running jobs terminate
Applies to
SnapCenter Server (SC) 5.x and later
Issue
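- Scheduled default jobs, backup jobs, or clone jobs fail to trigger for 20 minutes.
- Scheduled default jobs, backups, or clones may terminate prematurely while still appearing to be running in the WebUI.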
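- When the job launcher starts a job, it goes through the SMCore service to launch it via the SnapCenter web application. The web application cannot start a backup between the two IIS events listed below, because IIS no longer processes incoming requests during that window.
- SMCore error message: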
DEBUG SMCore_28089 PID=[7152] TID=[49] RemoteUrl: https://<SNAPCENTER_SERVER>:8146/JobStatusService.svc/UpdateJobStatus
ERROR SMCore_28089 PID=[7152] TID=[49] WebException in method: Invoke.
System.Net.WebException: The remote server returned an error: (500) Internal Server Error.
- The System event log shows an informational IIS event with ID 5074 (SnapCenter) or 5186 (DefaultAppPool):
"A worker process with process id of '<PID>' serving application pool 'SnapCenter' has requested a recycle because the worker process reached its allowed processing time limit."
- After 30 minutes (by default), the System event log may show a warning event with ID 5013:
"A process serving application pool 'SnapCenter' exceeded time limits during shut down. The process id was '<PID>'."
- Even though SMCore has completed the operation (backup, clone, and so on), SnapCenter may also report the following job error:
We noticed an IIS application pool recycle event when this job was in progress. The final state of this job is unknown.
Consider restarting the job if required.
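
The symptoms above can be checked against collected logs with a small script. The following is only a minimal Python sketch, not a NetApp tool: the function names `find_smcore_500` and `describe_events` are hypothetical, and it assumes that the SMCore log lines and the System event IDs have already been exported to plain Python lists by some other means. The event IDs and their descriptions are taken only from this article.

```python
import re

# Event IDs as described in this article (severity, context).
IIS_EVENTS = {
    5074: "informational, application pool 'SnapCenter'",
    5186: "informational, application pool 'DefaultAppPool'",
    5013: "warning, process exceeded time limits during shut down",
}

# Matches the SMCore WebException line quoted in this article.
SMCORE_500 = re.compile(
    r"System\.Net\.WebException: .*\(500\) Internal Server Error"
)


def find_smcore_500(lines):
    """Return the log lines that report a (500) WebException."""
    return [line for line in lines if SMCORE_500.search(line)]


def describe_events(event_ids):
    """Keep only the event IDs listed in this article, with their description."""
    return [(eid, IIS_EVENTS[eid]) for eid in event_ids if eid in IIS_EVENTS]


if __name__ == "__main__":
    # Hypothetical sample input; real data would come from exported logs.
    sample_log = [
        "ERROR SMCore_28089 PID=[7152] TID=[49] WebException in method: Invoke.",
        "System.Net.WebException: The remote server returned an error: "
        "(500) Internal Server Error.",
    ]
    for line in find_smcore_500(sample_log):
        print("SMCore 500 error:", line)
    for eid, meaning in describe_events([5074, 5013]):
        print("IIS event", eid, "-", meaning)
```

If the SMCore error and event 5074 or 5186 both appear around the time a job failed to trigger or ended early, the job history matches the pattern described in this article.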