Recently I was working one of my BizTalk developer machines and I started getting the below error in the event log which I hadnt noticed before. It had been happening for around 30 minutes.
SQL Server Scheduled Job ‘MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb’ (0xF6C545D46E4F2F489B99744A4B73E1DD) – Status: Failed – Invoked on: 2015-11-23 23:43:00 – Message: The job failed. The Job was invoked by Schedule 13 (Schedule). The last step to run was step 1 (Purge).
Checking the SQL activity view revealed the below additional error information.
Date 11/23/2015 11:43:00 PM
Log Job History (MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb)
Step ID 1
Server HK-T-BZ-S-1
Job Name MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb
Step Name Purge
Duration 00:00:00
Sql Severity 16
Sql Message ID 2812
Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0
Message
Executed as user: <Domain>\<User>. Could not find stored procedure ‘dbo.int_ProcessCleanup_<HostName>’. [SQLSTATE 42000] (Error 2812). The step failed.
I knew I had recently been making some changes so I investigated this a little further. I checked the stored proc which is ran by the MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb job and it checks a table called [BizTalkMsgBoxDb].[dbo].[ProcessHeartbeats]. In this table it has a list of things which I assume are used to manage the internal queues around BizTalk hosts. It looks like a bunch of stored procedures are created for each BizTalk host and one of these is the dbo.int_ProcessCleanup_<HostName> which could no longer be found. I could see the stored procedures for other hosts but not the one in question.
Now at this point I knew I had made some changes and in particular I knew I had deleted this particular host which had been successfully deleted. It looked however that although the host had been deleted, this host was still listed in the process heart beats table so therefore the MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb executes a stored procedure which uses that table and then from the rows in that table is looking to dynamically execute other stored procedures based on the name in the “Application Name” column in the table (side note here that the column application name seems to be badly named and actually means host name”.
Anyway so armed with a plausible reason for why this wasnt working the next question is how do I fix it. It seems that I need to get my host out of this table and that somehow it wasnt removed when I originally deleted the host.
I didnt want to take the chance of changing the table directly but fortunately it seemed that if I created a new host with the same name and then deleted that host again that the row from the ProcessHeartbeats table was removed and the SQL Job stopped getting an error the next time it ran.
I guess this is a rare scenario as I cant see anything online about it, but sharing in case anyone else gets this.