Removing DTA orphans in BizTalk

Table of Content

What is this?

This is a re-release of an old article about BizTalk. I wrote the original in 2014 and removed it in a blog migration. However, a colleague asked about it so I am re-writing it.

What are Orphans?

When tracking is performed in BizTalk, it takes several different steps. The first thing is to create a row in the dta_ServiceInstances table in the DTA database. There is a bug in BizTalk that prevents the line from being removed by the DTA Purge and archive cleaning job. This creates an Orphan and if there are many, the table is filled with unnecessary junk.

Why?

When tracking starts the field dtStartTIme is set to the current UTC time. In the same row the field dtEndTime will be set once the flow is completed. Sometimes it does not. This seems to be related to a high number of exceptions or the use of request response with a SendPort group (do not do that!).

The row should be removed by the clean up job, but that job only removes completed rows, i.e. rows with a dtEndTime that is not null.

How many Orphans?

I think about 1000 is too many. To find out the number of orphans in your BizTalk installation run the script below. It should be run in BizTalkDTADb.

select
    count(*) as 'NoOfOrphans'
from 
    [BizTalkDTAdb].[dbo].[dta_ServiceInstances]
where 
    dtEndTime is NULL and [uidServiceInstanceId] NOT IN 
    (
    SELECT 
        [uidInstanceID]
    FROM 
        [BizTalkMsgBoxDb].[dbo].[Instances] WITH (NOLOCK)
    UNION
    SELECT 
        [StreamID]
    FROM 
        [BizTalkMsgBoxDb].[dbo].[TrackingData] with (NOLOCK)
    )

The query clearly shows that you need to not only look in the DTA-database, but also count the number of instances still in the MessageBox, meaning active messages. If you want more information on the instances just do a SELECT * instead of SELECT count(*). You can clearly see that the data has a dtStartTime but not a dtEndTime.

file

How to remove them

The BizTalk Terminator tool

I do not know if this exists or is supported anymore. Be advised! The below text is form 2014.

This is the supported way of removing orphans. It is, however, important to point out that the entire BizTalk environment needs to be stopped for it to work, and if that is not a useful alternative, you can run the actual script that the Terminator tool runs.

The T-SQL Script from the terminator tool

This simple script updates the orphaned rows and sets dtEndTime to the current UTC time, making it possible for the cleanup job to remove the row.

BEGIN TRAN
USE [biztalkDTADb]

UPDATE
    [dbo].[dta_ServiceInstances]
SET 
    [dtEndTime] = GetUTCDate()
WHERE
    dtEndTime is NULL 
    AND 
    [uidServiceInstanceId] NOT IN
    (
    SELECT 
        [uidInstanceID]
    FROM
        BizTalkMsgBoxDb.[dbo].[Instances] WITH (NOLOCK)
    UNION 
    SELECT 
        [StreamID]
    FROM
        BizTalkMsgBoxDb.[dbo].[TrackingData] WITH (NOLOCK)
    )
-- If it works, uncomment the following row and run it. It commits the query to the database
-- Commit tran
-- If it DOES NOT work: uncomment the following row and run it. It undoes the query.
-- Rollback tran

Note that the script has the same query for finding orphans as the above that only counts the number of orphans.
In order to perform this update by the book, flow these steps:

  1. Run the first query to find the number of orphans.
  2. Run the second script and note the number of rows that was updated.
  3. If the number of rows matches you can uncomment and run the Commit Tran row. Done.
  4. If the number of rows does not match, something is wrong and you must run the Rollback Tran row. Not done.

NOTE! You must run both the query and the Rollback or Commit in the same query window.

If you go to number 4, check the scripts and make sure you are running the scripts on the same database. If you have a constantly increasing number of orphans in your DB, that might also be the reason.

How are the orphans removed

The rows can now be deleted by the DTA Purge and archive job, in accordance with the job settings. This might take a while as your settings might let older tracking data be in the database for several days. Also, the job runs once every minute so you just might need to wait for 60 seconds.