Orchestrator 2012: Too many queued policy instances caused Orchestrator to slow down dramatically

Recently I had a situation in my System Center Orchestrator 2012 SP1 environment where the Runbook Designer behaved strangely. When I started a runbook, the log was not updated while it ran; only the log history was updated, and only after the runbook had finished. Runbooks also seemed to take longer than normal to finish.
I started to check some things in my environment:

  • I checked the size of my database: at 2 GB it was not too big.
  • I checked the performance of my Management and Runbook servers. All looked normal.
  • I restarted the services. That did not help.
  • I cleaned up some things in the DB: removed orphaned log entries from runbooks, deleted some old runbooks that were no longer required, and purged the logs.
  • Then I checked the logging settings for all runbooks. That is how I found one runbook that had logging enabled and was currently running. But I could not stop it! It gave me an error like “Unable to un-deploy the runbook”. (Sorry, I forgot to take a screenshot of it 😉) I saw that its job history kept showing current entries and new ones were created all the time. This runbook was invoked by another one, and this invocation filled up the queue.

I searched around and found some SQL queries I could use to investigate more. So I logged on to the SQL server with the Orchestrator instance on it and ran the following query:

SELECT * FROM POLICY_PUBLISH_QUEUE

This gave me all policy instances that were currently queued. And there were 350,000 of them! That was the problem. I looked through the results and saw that most entries came from one policy/runbook. So I used this query to find more details about it:
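With that many rows, scrolling through the result set to spot the offending policy is tedious. A grouped count is one way to see it at a glance. This is only a sketch: it assumes nothing beyond the PolicyID column of POLICY_PUBLISH_QUEUE that the queries in this post already use, and the alias QueuedInstances is my own name.

```sql
-- Count queued instances per policy, largest first.
-- QueuedInstances is just an illustrative alias.
SELECT PolicyID,
       COUNT(*) AS QueuedInstances
FROM POLICY_PUBLISH_QUEUE
GROUP BY PolicyID
ORDER BY QueuedInstances DESC
```

The PolicyID in the top row is the value to plug into the detail query.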

SELECT POLICYINSTANCES.PolicyID,
       POLICYINSTANCES.TimeStarted,
       POLICYINSTANCES.TimeEnded,
       POLICYINSTANCES.ProcessID,
       POLICYINSTANCES.SeqNumber,
       POLICIES.Name
FROM POLICYINSTANCES
INNER JOIN POLICIES ON POLICYINSTANCES.PolicyID = POLICIES.UniqueID
WHERE POLICYINSTANCES.PolicyID = 'PolicyID'

With that I could verify that it was the runbook that would not stop. So I used the next query to delete this policy's entries from the queue:

DELETE FROM [POLICY_PUBLISH_QUEUE] WHERE [PolicyID] = 'PolicyID'

Now the queue only had 10 entries left in it :-).
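Deleting directly from the Orchestrator database is risky, so a more cautious variant of the DELETE above is to wrap it in a transaction and compare row counts before committing. This is a sketch of that idea, not what I actually ran; 'PolicyID' is the same placeholder as above, and RowsToDelete and RowsRemaining are aliases I made up. Keep in mind that the open transaction holds locks on the queue table until you commit or roll back.

```sql
BEGIN TRANSACTION;

-- How many rows will the DELETE hit?
SELECT COUNT(*) AS RowsToDelete
FROM [POLICY_PUBLISH_QUEUE]
WHERE [PolicyID] = 'PolicyID';   -- placeholder, replace with the real ID

DELETE FROM [POLICY_PUBLISH_QUEUE]
WHERE [PolicyID] = 'PolicyID';   -- same placeholder

-- What is left in the queue afterwards?
SELECT COUNT(*) AS RowsRemaining
FROM [POLICY_PUBLISH_QUEUE];

-- Run exactly one of these once the numbers have been checked:
-- COMMIT TRANSACTION;
-- ROLLBACK TRANSACTION;
```

Taking a database backup first is a sensible extra precaution.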

I shrank the database and checked the Orchestrator performance again, and it was back to normal.

Wonderful!


Comments

  • Lee Berg (@LeeAlanBerg)  On January 20, 2015 at 10:32 pm

    FYI, we have seen similar behavior where one runbook will go crazy and create a HUGE number of instances, we have seen this happen when running log purges while runbooks are running. My suspicion is that when purging logs or clearing orphaned runbooks it can cause a rare bug that can cause this to happen. Might not be the issue in your case but something to watch out for! Great Article!
