Handling frequent occurrences of the Robot error "Job UpdateMilestoneVariances has failed"
Who is this article for? Server administrators encountering this error in their log files.
Administrator permissions are required to resolve the issue.
When reviewing the Robot log, you might see frequent occurrences of the message "Job UpdateMilestoneVariances has failed". This article explains what causes the issue and how to resolve it.
1. Issue
When looking through the Robot error log you might see entries that look like this:
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 19, 09/18/2024 23:02:00, Entering: SendRejectionEmails.ActionUpdate
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:00, Calculation.UpdateMilestoneVariances executing at 09/18/2024 23:02:00, next fire: 08/20/2024 23:02:00 +00:00
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:00, entering: JobFactory::MakeJob
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:00, entering: JobFactory::CreateJob: ExecuteProcedure
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:00, Entering: UpdateMilestoneVariances
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 19, 09/18/2024 23:02:01, Exiting: SendRejectionEmails.ActionUpdate
ERROR: , Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:01, *****An exception occurred at Monday, August 19, 2024 11:02:00 PM using version 6.0.1.7
1. Exception of type: MessageSecurityException - Message: An unsecured or incorrectly secured fault was received from the other party. See the inner FaultException for the fault code and detail.
Server stack trace:
at System.ServiceModel.Security.SecuritySessionClientSettings`1.SecurityRequestSessionChannel.ProcessReply(Message reply, TimeSpan timeout, SecurityProtocolCorrelationState correlationState)
at System.ServiceModel.Security.SecuritySessionClientSettings`1.SecurityRequestSessionChannel.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Dispatcher.RequestChannelBinder.Request(Message message, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Pentana.Tng.Messaging.ITngService.ExecuteProcedures(RequestMessage`1 request)
at Pentana.Tng.Agent.TngServiceProxy.ExecuteProcedures(RequestMessage`1 request)
at Pentana.Tng.Agent.WcfProxyManager`2.ExecuteProcedures(RequestMessage`1 request)
at Pentana.Tng.Agent.AgentHelperInstance.RemoteCall[TRequest,TResponse](IList`1 payload, ServiceRequestMethod`2 execute, Boolean readOnly)
at Pentana.Tng.Agent.AgentHelperInstance.ExecuteProcedures(IList`1 calls, Boolean readOnly)
at Pentana.Tng.Agent.AgentHelperInstance.ExecuteProcedure(ProcedureCall call, Boolean readOnly)
at Pentana.Tng.ServerRobot.ExecuteProcedure.DoWork()
at Pentana.Tng.ServerRobot.JobBase.Invoke()
1.1. Exception of type: FaultException - Message: The message could not be processed. This is most likely because the action 'www.pentana.com/Tng/ServiceInterface/TngService/ExecuteProcedures' is incorrect or because the message contains an invalid or expired security context token or because there is a mismatch between bindings. The security context token would be invalid if the service aborted the channel due to inactivity. To prevent the service from aborting idle sessions prematurely increase the Receive timeout on the service endpoint's binding.
**************************************************
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:02, CREATING EMAIL
ERROR: , Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:02, >>> Failed to send email because no recipients were defined
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:02, entering: RetryAgent
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:03, entering: RetryAgent Successful
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:03, entering: AgentStatus: True
ERROR: , Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:04, Job UpdateMilestoneVariances has failed. ExecutionTime: 4.406003
VERBOSE:, Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:08, Exiting: UpdateMilestoneVariances
ERROR: , Pentana.Tng.ServerRobot.RobotService, 10, 09/18/2024 23:02:08, An error occured when running the job Calculation.UpdateMilestoneVariances
The pattern to note is:
- The Robot enters two jobs at the same time:
  - SendRejectionEmails.ActionUpdate, at 23:02:00
  - Calculation.UpdateMilestoneVariances, also at 23:02:00
- The latter job fails, with the message "An error occured when running the job Calculation.UpdateMilestoneVariances".
This happens because, when both jobs start in the same second, the second job can hang or time out, particularly if the server is already busy.
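To check whether a log shows this pattern, the entries can be scanned for timestamps at which both jobs were entered. The following is a minimal sketch, not part of the product: the line layout (level, source, thread ID, timestamp, message) is inferred from the excerpt above, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the log excerpt:
# LEVEL:, source, thread id, MM/DD/YYYY HH:MM:SS, message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+):\s*,\s*(?P<source>[^,]+),\s*(?P<thread>\d+),\s*"
    r"(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}),\s*(?P<message>.*)$"
)


def parse(line):
    """Return (timestamp, level, message), or None for non-matching lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # e.g. stack trace lines
    stamp = datetime.strptime(match["stamp"], "%m/%d/%Y %H:%M:%S")
    return stamp, match["level"], match["message"]


def find_collisions(lines):
    """Return (timestamps where both jobs were entered, failure timestamps)."""
    email_starts = set()
    variance_starts = set()
    failures = []
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        stamp, level, message = parsed
        if message == "Entering: SendRejectionEmails.ActionUpdate":
            email_starts.add(stamp)
        elif message == "Entering: UpdateMilestoneVariances":
            variance_starts.add(stamp)
        elif level == "ERROR" and message.startswith(
            "Job UpdateMilestoneVariances has failed"
        ):
            failures.append(stamp)
    # A collision is a second in which both jobs were entered.
    return sorted(email_starts & variance_starts), failures
```

Passing an open log file to find_collisions returns the colliding start times and the times at which the failure was logged. If failures appear without any collision, the cause is more likely to be server load than overlapping jobs.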
2. Solution
The solution is to change the start time of the SendRejectionEmails job. Normally it will run each minute at zero seconds past the minute, but we can change this to 35 seconds past the minute, using the cron value:
35/50 * * * * ?
This ensures that the two jobs start at different times and minimises the likelihood of a lock. To implement the change:
- At the application server, open App Manager and select the correct application instance.
- Go to the Robot tab, and click Robot Service, then Connect.
- Click into the Cron field and change the value to:
35/50 * * * * ?
- Click Save.
- Finally, go to Services in Control Panel, and restart the Robot service (normally called ServerRobotPentanaPRD).
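To see why this cron value fires at 35 seconds past each minute, the seconds field can be expanded by hand. The sketch below assumes the Robot uses Quartz-style cron expressions, in which the first field is seconds and "start/step" means "from start, every step seconds, up to 59". The function names are hypothetical, and "0 * * * * ?" is only an assumed example of an expression that fires at zero seconds past the minute.

```python
def expand_seconds_field(field):
    """Return the seconds (0-59) at which a Quartz-style seconds field fires.

    Handles "*", "start/step" and comma-separated values only; ranges such
    as "10-20" are not covered by this sketch.
    """
    if field == "*":
        return list(range(60))
    if "/" in field:
        start_text, step_text = field.split("/", 1)
        start = 0 if start_text == "*" else int(start_text)
        step = int(step_text)
        # Values past 59 are discarded, so a large step yields one value.
        return list(range(start, 60, step))
    return [int(part) for part in field.split(",")]


def seconds_of(expression):
    """Expand the first (seconds) field of a cron expression."""
    return expand_seconds_field(expression.split()[0])
```

Under these assumptions, seconds_of("35/50 * * * * ?") contains only second 35, because the next step (85) falls outside the 0-59 range, while seconds_of("0 * * * * ?") contains only second 0. The two jobs therefore no longer start in the same second.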
3. Other causes
The UpdateMilestoneVariances job may also time out if the server is under load, so this error may be seen when, for example, Windows Updates are installing. As the job runs daily, the issue will not normally recur, and the job can be run manually via App Manager if necessary.
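To tell a one-off failure from a recurring one, the failures can be counted per day. This is a rough sketch that assumes the same log line layout as the excerpt in section 1; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the failure entry and captures its date (MM/DD/YYYY).
FAILURE_RE = re.compile(
    r"^ERROR:\s*,[^,]+,\s*\d+,\s*(?P<date>\d{2}/\d{2}/\d{4}) \d{2}:\d{2}:\d{2},\s*"
    r"Job UpdateMilestoneVariances has failed"
)


def failures_per_day(lines):
    """Return a dict mapping each date to the number of failures logged."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.match(line.strip())
        if match:
            counts[match["date"]] += 1
    return dict(counts)
```

A single date in the result suggests a transient cause such as server load. Failures on most days point to the scheduling overlap described in section 1.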