One key Service Manager performance issue that the product group keeps hearing about involves connectors, mainly the System Center Configuration Manager (ConfigMgr) and Active Directory (AD) connectors, both of which are based on the Linking Framework (LFX). Currently, the initial sync for these connectors can take weeks to complete, depending on the number of configuration items that need to be synced.
In a typical Configuration Manager deployment, of all the data synced by the ConfigMgr 2012 connector, the tables DeviceHasSoftwareItemInstalled and DeviceHasSoftwareUpdate take the most time. The DeviceHasSoftwareItemInstalled table contains information on the software installed on computers, and the DeviceHasSoftwareUpdate table contains the updates installed on computers. The number of entries in these tables can be in the range of 50 to 200 times the number of computers that ConfigMgr is monitoring. For example, for a deployment of 1,500 computers, the DeviceHasSoftwareItemInstalled table might contain up to 80,000 entries and the DeviceHasSoftwareUpdate table might contain up to around 150,000 entries. Because of their size, these two tables account for 80-90% of the sync time, depending on the size of the deployment.
Why does sync take so much time?
The main reason this takes so long is that data insertion is serialized. The LFX connector submits data in batches (typically 500-2000 entries), and those batches are processed one after another. The data inserted through the connector needs to go into the ECL so that other workflows can act on it. This means that data has to be inserted into the SQL DB serially, so that a workflow can process the information for one batch in one go.
In addition, the DeviceHasSoftwareItemInstalled table can take far more time to insert data per batch than the DeviceHasSoftwareUpdate table does for the same amount of data. This is because the DeviceHasSoftwareItemInstalled table has an additional check to make sure that duplicate data is not entered, which might otherwise occur in cases where software is repaired or reinstalled on the system. This additional step needs to be done for every entry in a batch, which takes an additional toll on each sync batch.
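To get a rough feel for what serialized batching means in practice, here is a small back-of-the-envelope sketch. It only uses the numbers quoted above (the example table sizes for a 1,500-computer deployment and the typical batch size of 500-2000 entries). The helper name batch_count is ours and is not part of LFX.

```python
import math


def batch_count(entries, batch_size):
    """Number of batches needed to submit `entries` rows, `batch_size` at a time."""
    return math.ceil(entries / batch_size)


# Example table sizes quoted above for a 1,500-computer deployment.
example_tables = {
    "DeviceHasSoftwareItemInstalled": 80_000,
    "DeviceHasSoftwareUpdate": 150_000,
}

for table, entries in example_tables.items():
    fewest = batch_count(entries, 2000)  # largest typical batch size
    most = batch_count(entries, 500)     # smallest typical batch size
    print(f"{table}: {fewest} to {most} serialized batches")
```

For the example deployment this works out to 40 to 160 batches for DeviceHasSoftwareItemInstalled and 75 to 300 batches for DeviceHasSoftwareUpdate, each of which has to wait for the previous one to finish.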
We are happy to announce that we have fixed these scenarios in the following way:
1. Data insertion is serialized: We made the insertion operate in parallel for multiple batches in an LFX session. Please note that although we have made this insertion parallel in code, insertion into the ECL (SQL DB) is still serialized.
2. DeviceHasSoftwareItemInstalled table: The SQL query was optimized to find duplicate software entries for all entries in a batch in a single query. This now runs much faster, and we have seen improvements of up to 5x for this specific table.
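The idea behind the second fix can be illustrated with a minimal sketch. This is not the actual Service Manager query: the table and column names (installed, staging, device, software) are made up for the example, and it uses an in-memory SQLite database purely so that it is easy to run. It contrasts checking for duplicates one entry at a time with checking the whole batch in one query.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'installed' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE installed (device TEXT, software TEXT, "
        "PRIMARY KEY (device, software))"
    )
    return conn


def find_duplicates_per_entry(conn, batch):
    """Slow pattern: one query per entry in the batch."""
    duplicates = set()
    for device, software in batch:
        row = conn.execute(
            "SELECT 1 FROM installed WHERE device = ? AND software = ?",
            (device, software),
        ).fetchone()
        if row is not None:
            duplicates.add((device, software))
    return duplicates


def find_duplicates_single_query(conn, batch):
    """Faster pattern: stage the batch, then find all duplicates with one join."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS staging (device TEXT, software TEXT)"
    )
    conn.execute("DELETE FROM staging")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", batch)
    rows = conn.execute(
        "SELECT s.device, s.software FROM staging AS s "
        "JOIN installed AS i "
        "ON i.device = s.device AND i.software = s.software"
    ).fetchall()
    return set(rows)


conn = make_db()
conn.executemany(
    "INSERT INTO installed VALUES (?, ?)",
    [("pc1", "office"), ("pc2", "reader")],
)
# 'office' on pc1 is already recorded, e.g. after a repair or reinstall.
batch = [("pc1", "office"), ("pc1", "reader"), ("pc3", "office")]
print(find_duplicates_per_entry(conn, batch))
print(find_duplicates_single_query(conn, batch))
```

Both functions return the same set of duplicates. The difference is that the first issues one query per entry, while the second issues a fixed number of statements regardless of the batch size.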
With these fixes in place we have observed overall improvements of up to 3x. One point to note is that as deployments get bigger, the improvements might initially diminish. The reason for this is that the ECL table becomes very large while a connector is running, which in turn makes insertion per batch slower. Having said that, subsequent delta syncs will see greater improvements, as the ECL will be in a much cleaner state during subsequent runs.
As mentioned earlier, the LFX framework is shared by both the ConfigMgr connector and the AD connector, so with the first fix (data insertion in parallel) we have observed that the AD connector takes around 35% less time than it did with the UR5 payload.
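The first fix (parallel in code, still serialized at the database) can be pictured with the following sketch. It is only one possible reading of that description and not the LFX implementation: SerializedStore, prepare, and sync are hypothetical names, and the "database" is just a Python list guarded by a lock. Several workers prepare batches at the same time, but only one batch is written at any moment.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def chunk(entries, batch_size):
    """Split the entries into consecutive batches of at most batch_size."""
    return [entries[i:i + batch_size] for i in range(0, len(entries), batch_size)]


class SerializedStore:
    """Stand-in for the database: only one batch can be written at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.rows = []
        self.batches_written = 0

    def insert_batch(self, batch):
        with self._lock:  # the write itself stays serialized
            self.rows.extend(batch)
            self.batches_written += 1


def prepare(batch):
    """Hypothetical per-batch work that can safely run in parallel."""
    return [(device.lower(), software.strip()) for device, software in batch]


def sync(entries, store, batch_size=500, workers=4):
    """Prepare batches on several threads, writing each one under the lock."""

    def process(batch):
        store.insert_batch(prepare(batch))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all batches and re-raises any worker exception.
        list(pool.map(process, chunk(entries, batch_size)))


entries = [(f"PC{i}", " app ") for i in range(1200)]
store = SerializedStore()
sync(entries, store, batch_size=500)
print(store.batches_written, len(store.rows))
```

In a setup like this, the gain comes from overlapping the per-batch work that happens outside the lock, which is also why the benefit shrinks when the serialized write itself becomes the slow part.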
This fix is targeted for release in Update Rollup 6 (UR6). However, if you want to get your hands dirty with this fix before then, you can get it right now as part of the TAP 17 Drop. For more details on TAP and how to join, please see the following:
Become a Member of the System Center Service Manager Agile TAP!