Apps/Packages stuck in “In progress” state when distributing to DPs

I spent almost a week troubleshooting a PIA app and a package that got stuck in the “In progress” state while distributing to the DPs. The other 1,000 or so apps and packages we have are fine; only these two misbehaved. Oh, and Google-fu and Tae-BING searches weren’t helpful enough to fix this issue! :) Of course, recreating them would have been the easier way. But why are they in that state, and how can we get this fixed? I just want a way to reset these objects so I can deploy them to the DPs successfully, and to know what to do when/if it happens again.

So I looked at all the basic app/package properties to see if everything was set up properly: the source path, DT settings, distribution settings, content settings, etc. I even tried changing the source path to one I know for sure is valid and reachable by the CAS. The CAS is able to grab the content from the source location, pack it into a PCK, replicate the app/package settings to the child primary servers (via DRS), and send the content to them via the sender. But once it got to the primary servers, the despooler was blowing chunkies… It wouldn’t process the .sni file properly, even though I could see the TRY file that came with it in the despooler inbox. It kept coming up with Error=12! I tried removing the DPs from the app, resetting its pkgstatus (SourceVersion=0 and Status=2), then re-adding the DPs; no dice. The app just kept falling into the Retry state! Ugh!

Despooler.log on the primary servers

 

Received package CAS008DB version 6. Compressed file -  D:\SMSPKG\CAS008DB.PCK.6 as D:\SMS\inboxes\despoolr.box\receive\PKGfooh3.TRY

Instruction D:\SMS\inboxes\despoolr.box\receive\ds_2ivq1.sni won't be processed till 6/20/2014 1:42:00 PM Central Daylight Time

Instruction D:\SMS\inboxes\despoolr.box\receive\ds_ijdl4.sni won't be processed till 6/20/2014 1:13:50 PM Central Daylight Time

Instruction D:\SMS\inboxes\despoolr.box\receive\ds_oko21.sni won't be processed till 6/20/2014 1:31:10 PM Central Daylight Time

Instruction D:\SMS\inboxes\despoolr.box\receive\ds_vy0p3.sni won't be processed till 6/20/2014 1:36:40 PM Central Daylight Time

Instruction D:\SMS\inboxes\despoolr.box\receive\ds_xzqrp.sni won't be processed till 6/20/2014 2:08:40 PM Central Daylight Time

Waiting for ready instruction file....

Old package storedUNC path is .

This package[CAS008DB]'s information hasn't arrived yet for this version [6]. Retry later ...

Created retry instruction for job 00005921

Despooler failed to execute the instruction, error code = 12
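
To spot this retry loop without reading the log by eye, a small script can pull out the package IDs the despooler keeps deferring. The following is a minimal Python sketch, assuming the log contains the same message text as the excerpt above; the function name and the regular expression are my own and are not part of ConfigMgr.

```python
import re
from collections import Counter

# Matches the despooler message shown in the excerpt above, e.g.
# "This package[CAS008DB]'s information hasn't arrived yet for this version [6]."
# The pattern is an assumption based only on that sample line.
WAITING = re.compile(
    r"This package\[(?P<pkg>\w+)\]'s information hasn't arrived yet "
    r"for this version \[(?P<ver>\d+)\]"
)


def find_waiting_packages(lines):
    """Count 'information hasn't arrived yet' messages per (package ID, version)."""
    counts = Counter()
    for line in lines:
        match = WAITING.search(line)
        if match:
            counts[(match.group("pkg"), int(match.group("ver")))] += 1
    return counts
```

Passing it an open log file, for example `find_waiting_packages(open(path, errors="replace"))`, returns a Counter keyed by (package ID, version). A package whose count keeps growing between runs is a likely candidate for the pkgstatus check described next.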

So I started digging and compared a successful app against this bad one, and was surprised by what I found in the CAS’s pkgstatus SQL view. An app that deployed to its DPs successfully has only one row per primary, one for the CAS, and one for each DP it’s deployed to (those start with “["Display=\\DP1.jeff.com\"]MSWNET:…”). This bad application happens to have extra rows per primary server with the CAS’s FQDN in the PkgServer column! And if you look closely below, their update times are old, with different or older PKIDs (I assume PKID increments).

[Screenshot: StuckApps]

Time to try to fix this!

  1. I made certain all the DPs were removed from this bad application, “CAS008DB”.
  2. I then deleted these extra rows by executing the statement below against the DB on the CAS and on the primary servers. NOTE: MS doesn’t support modifying the DB directly, so be careful and make sure you have a valid backup before doing so! :)

DELETE FROM pkgstatus
WHERE id = 'CAS008DB' AND PkgServer = 'CASSERVER.jeff.com' AND sitecode <> 'CAS'
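
Before running a DELETE like this against a production DB, it helps to run the same WHERE clause as a SELECT and confirm that only the extra rows come back. The sketch below rehearses that against a throwaway in-memory SQLite table from Python. The three-column table and its sample rows are made up for illustration and are not the real pkgstatus schema; on the site DB you would run just the SELECT in SQL Server Management Studio.

```python
import sqlite3

# Stand-in for the real pkgstatus table: only columns mentioned in this post,
# filled with made-up sample rows. This is NOT the real ConfigMgr schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pkgstatus (id TEXT, SiteCode TEXT, PkgServer TEXT)")
conn.executemany(
    "INSERT INTO pkgstatus VALUES (?, ?, ?)",
    [
        ("CAS008DB", "CAS", "CASSERVER.jeff.com"),  # expected CAS row, must stay
        ("CAS008DB", "P41", "PR41.jeff.com"),       # expected primary row, must stay
        ("CAS008DB", "P41", "CASSERVER.jeff.com"),  # extra row: CAS FQDN under a primary
    ],
)

# Same WHERE clause as the DELETE, run as a SELECT so nothing is modified.
PREVIEW = """
    SELECT id, SiteCode, PkgServer
    FROM pkgstatus
    WHERE id = 'CAS008DB'
      AND PkgServer = 'CASSERVER.jeff.com'
      AND SiteCode <> 'CAS'
"""
rows_to_delete = conn.execute(PREVIEW).fetchall()
for row in rows_to_delete:
    print(row)
```

With these sample rows, only the row that pairs the CAS’s FQDN with a primary site code is returned; the legitimate CAS and primary rows are left alone.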

  3. Then I reset the pkgstatus of this application. I executed this on the CAS only, targeting just one of the primary servers, PR41.jeff.com, to see whether we could get past the despooler so the content could be copied to its DPs.

update pkgstatus set Status = 2 where id = 'CAS008DB' and pkgServer = 'PR41.jeff.com'

update pkgstatus set SourceVersion = 0 where id = 'CAS008DB' and pkgServer = 'PR41.jeff.com'

  4. Then I deployed the app to the DPs that are on the PR41.jeff.com primary server.
  5. Voilà! The bad application got processed by the despooler and deployed to the targeted DPs successfully!

Verifying signature for instruction D:\SMS\inboxes\despoolr.box\receive\ds_c50g8.nil of type MICROSOFT|SMS|MINIJOBINSTRUCTION|PACKAGE

Signature checked out OK for instruction coming from site CAS, proceed with the instruction execution.

Executing instruction of type MICROSOFT|SMS|MINIJOBINSTRUCTION|PACKAGE

Package CAS008DB is currently being processed, sleep for 10 seconds

Waiting for the next instruction....

Waiting for ready instruction file....

Old package storedUNC path is .

Use drive D for storing the compressed package.

No branch cache registry entries found.

Uncompressing D:\SMSPKG\CAS008DB.PCK to D:\SMSPKG\CAS008DB.PCK.temp

Content Library: O:\SCCMContentLib

Extracting from D:\SMSPKG\CAS008DB.PCK.temp

Extracting package CAS008DB

Extracting content Content_122fcdf2-f4c6-43d0-a0fe-61caf6f67a23.1

Package CAS008DB (version 0) exists in the distribution source, save the newer version (version 7).

Stored Package CAS008DB. Stored Package Version = 7

STATMSG: ID=4400 SEV=I LEV=M SOURCE="SMS Server" COMP="SMS_DESPOOLER" SYS=PR41.jeff.com SITE=P41 PID=2924 TID=8196 GMTDATE=Wed Jun 25 21:24:37.393 2014 ISTR0="CAS008DB" ISTR1="\\PR41.jeff.com\D$\SMSPKG\CAS008DB.PCK" ISTR2="" ISTR3="" ISTR4="" ISTR5="" ISTR6="" ISTR7="" ISTR8="" ISTR9="" NUMATTRS=1 AID0=400 AVAL0="CAS008DB"

Despooler successfully executed one instruction.

How did this happen? I still have no clue at the moment; I’m still digging for the root cause. I can only assume we had a SAN glitch or a CM crash at the same time this app was being created or processed. The best part is, now I know why it wouldn’t deploy to the DPs, and I know what to look for and what to do when/if this happens again :)

CM12

Copyright © 2018 - The Minnesota System Center User Group