Note to Readers

This post also gets picked up by myITforum and published there. It started off with CM12 lab building issues and has morphed into covering CM12 in general.

Another App Catalog Fix

We installed the App Catalog today in production and got the dreaded "Cannot connect to the application server" error when opening the catalog. There are a number of reasons this can happen, and Microsoft has listed many of them here. We came up with a new reason on our own: TLS 1.0. Knowing that SSL 3.0 and TLS 1.0 are no longer considered secure protocols, we disabled them long ago. There are a number of places in CM where disabling those can break things; well, you can add the App Catalog to the list. It needs TLS 1.0 enabled (both client and server). Enabling it, without even restarting any services, cleared the error up immediately.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Server]
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Client]
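
If you'd rather script the fix than edit the registry by hand, here's a minimal PowerShell sketch. The key paths are the ones above; Enabled and DisabledByDefault are the standard SCHANNEL value names, but verify the values against your own hardening baseline before running this.

$base = 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0'
foreach ($side in 'Server', 'Client') {
    $key = Join-Path -Path $base -ChildPath $side
    # Create the key if the protocol was never explicitly configured on this box
    New-Item -Path $key -Force | Out-Null
    # Enabled = 1 turns the protocol on; DisabledByDefault = 0 allows it to be negotiated
    New-ItemProperty -Path $key -Name 'Enabled' -Value 1 -PropertyType DWord -Force | Out-Null
    New-ItemProperty -Path $key -Name 'DisabledByDefault' -Value 0 -PropertyType DWord -Force | Out-Null
}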

I found no blog posts or forum questions mentioning this setting, but I assume that as more people move toward removing old protocols, it will pop up more often. And I do expect Microsoft to address this at some point so I can go back and disable TLS 1.0 again (and hopefully 1.1 as well).

CM NAA Peer Perms

Direct from the developer at Microsoft regarding the CM network access account: it does NOT need full permissions to the CM cache folder for Peer Cache. I had thought full permissions were required, and evidently documentation will be coming soon to clear this up. Requiring full permissions seemed so insane to me that I didn't bother testing it and simply assumed Peer Cache to be a beta product not worthy of looking at. I happily stand corrected! Now all they need is some throttling. :-)

Distributed Views vs. the R2 SP1 Upgrade

If you have a CAS (you shouldn't), and if you have enabled distributed views, you might want to hold off on upgrading to SP1 for CM12 R2.

It sounds like we'll need a hotfix so that certain tables and views are checked for during the upgrade. When you enable distributed views, views are created on the CAS to show data that stays on the primary sites instead of replicating up. The upgrade isn't expecting those, so when it goes to recreate tables, the names are already in use (as are some keys in indexes) and the upgrade fails. The ugly part for me is that it failed so far into the process that running a recovery from my SQL backup wouldn't work.

So will this fail for you too? I'm not sure. It might just be my particular layout that wasn't tested, so let me describe it: a CAS with two primary sites, PR1 and PR2, beneath it.

PR1 has just the hardware inventory link enabled for distributed views.

PR2 has all three links enabled.

How much you extend your inventory affects the total number of distributed views created; in my case, I had 321 of them. But it's just the PR2 tables and views that the upgrade got upset over: the ones from the site that had all links enabled for DV. What if PR1 had all three links enabled too? What if PR2 had only the hardware inventory link enabled? Would I have had the problem either way? I don't know. Will you have the problem? I wouldn't take the chance.
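
If you want a rough sense of your own exposure before upgrading, a quick count of the tables distributed views leave behind may help. This is a hedged sketch using Invoke-Sqlcmd from the SqlServer module; 'CM_CAS' is a placeholder for your CAS database name, and the pattern simply matches the _RCM-suffixed table names that tripped my upgrade (listed below).

Import-Module SqlServer
# List the *_RCM tables on the CAS; these are the names my upgrade collided with
Invoke-Sqlcmd -ServerInstance '.' -Database 'CM_CAS' -Query @"
SELECT name FROM sys.tables WHERE name LIKE '%[_]RCM' ORDER BY name;
"@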

To get past this issue, I nuked some tables and views:

DROP TABLE [dbo].[CollectedFiles_RCM]
DROP TABLE [dbo].[FileUsageSummary_RCM]
DROP TABLE [dbo].[FileUsageSummaryIntervals_RCM]
DROP TABLE [dbo].[MonthlyUsageSummary_RCM]
DROP TABLE [dbo].[SoftwareFile_RCM]
DROP TABLE [dbo].[SoftwareFilePath_RCM]
DROP TABLE [dbo].[SoftwareInventory_RCM]
DROP TABLE [dbo].[SoftwareInventoryStatus_RCM]
DROP TABLE [dbo].[SoftwareProduct_RCM]
DROP TABLE [dbo].[SoftwareProductMap_RCM]
DROP TABLE [dbo].[SummarizationInterval_RCM]
DROP VIEW [dbo].[CollectedFiles]
DROP VIEW [dbo].[FileUsageSummary]
DROP VIEW [dbo].[FileUsageSummaryIntervals]
DROP VIEW [dbo].[MonthlyUsageSummary]
DROP VIEW [dbo].[SoftwareFile]
DROP VIEW [dbo].[SoftwareFilePath]
DROP VIEW [dbo].[SoftwareInventory]
DROP VIEW [dbo].[SoftwareInventoryStatus]
DROP VIEW [dbo].[SoftwareProduct]
DROP VIEW [dbo].[SoftwareProductMap]
DROP VIEW [dbo].[SummarizationInterval]
DROP VIEW [_sde].[v_GeneralInfo]
DROP VIEW [_sde].[v_GeneralInfoEx]
DROP VIEW [_sde].[v_GS_AppInstalls]
DROP VIEW [_sde].[v_HR_NSV]
DROP VIEW [_sde].[v_MachineUsage]

So after restoring the CM database from backup, dropping the tables and views above, and then running the upgrade, it finally took. The CAS is now at SP1 and replication is looking good. The only reason I'm posting the list above is in case someone else has already gotten into trouble; I wouldn't do this unless it's already too late. Those last five views are ones we created in our own [_sde] schema, but the upgrade still doesn't like them, so if you have any views of your own, you might want to make copies, blast them, and put them back after the upgrade.
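
For the "make copies" step, here's a minimal sketch assuming the SqlServer module; '_sde' is just our own schema name (substitute yours) and 'CM_CAS' is a placeholder. It saves each view's definition to a .sql file you can replay after the upgrade.

Import-Module SqlServer
$views = Invoke-Sqlcmd -ServerInstance '.' -Database 'CM_CAS' -Query @"
SELECT s.name AS SchemaName, v.name AS ViewName, m.definition AS Definition
FROM sys.views v
JOIN sys.schemas s ON s.schema_id = v.schema_id
JOIN sys.sql_modules m ON m.object_id = v.object_id
WHERE s.name = '_sde'
"@
foreach ($v in $views) {
    # One .sql file per view; run them back in once the upgrade completes
    $v.Definition | Out-File -FilePath ("{0}.{1}.sql" -f $v.SchemaName, $v.ViewName)
}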

Long story short, if you're using distributed views, I'd recommend you wait on SP1 until we hear from Microsoft.

Update: Notice that the tables and views above are all related to the software inventory and software metering link. As I mentioned, in the lab one of my sites had all three links set for DV and the other primary was marked for DV for just hardware inventory. In production we have only the hardware inventory link enabled for DV, so we decided to move forward with SP1 there, and it worked fine. So if there is an issue, it would only be with the software inventory and software metering link. Now, is there an issue? I sent our database off to Microsoft but never heard back.

Ignore Ignite?

At our last user group meeting we discussed the inevitability of the cloud in IT and what that would mean for the future of IT Pros. One thing we all agreed on was that knowing PowerShell is probably the best investment of time right now for anyone hoping to have a meaningful job down the road (and heck, really today). It was also rather clear that most of us attendees still have a hard time just doing today's job and continue to look for help via our user group and conferences. Microsoft Ignite came up, and few seemed interested in attending. Why not?

Ignite is seen more as a giant marketing show: a search turns up 91 sessions listed for System Center (though that is a cloud-heavy list), plus crowded hotels and daily busing just to get in. But we have so many better options today:

Each conference is on track to repeat around the same time and location each year, so attendees can plan on making at least one and budget for it in advance.

These smaller conferences give attendees a better chance to network with others. With SCU, you can attend at a user group broadcasting it in your area, talk about the sessions you just saw with the rest of the group, and go over common issues and ideas. And SCU is free, so you have no excuse not to go; even if you have no local group, you can watch from home or work. The speakers are all top names in the field. MNSCUG plans to simulcast SCU next week.

I went to my first Connections conference last year in Vegas and was surprised how well it went: smaller rooms and a crowd not spread too far from System Center. In fact, many of the CM sessions bored me simply because the product hasn't changed much over the past few years, so I found myself drifting into SQL sessions (SQL being something all System Center products rely on). They were great. There should be a good 80-90 System Center sessions this year. And the Aria is just a gorgeous hotel!

And then there's my favorite: MMS. It's right here at the MoA. It's just 3 days, but very long days: early risers can start with birds-of-a-feather sessions, and sessions can start as late as 6pm (some with beer served!). Small rooms and many great speakers mean "attendees don't get lost in the crowd." Feedback for the 1st year was overwhelmingly positive. An evening party, plus the mall and a great bar right at the hotel, makes after-hours mingling with others easy and fun. No busing to a convention center, no long lines, no crappy catered food. We've also revised Henry Wilson's old comparison doc, as it might help get you funding. And MMS sessions from 2014 are still up to give an idea of what 2015 sessions should look like. And we just got word that our dates should be Nov 9-10-11 this year.

ADR RBA YES

When will Microsoft ever get Role Based Access (RBA) working for Automatic Deployment Rules (ADRs)? I need to know that a server admin can make use of an ADR to set up his patches and that a workstation admin can't go in and edit the server ADRs. And vice versa.

Well, RBA is there. Already. Right now. At least in CM12 R2 it is. Was it always there? I could swear that when RTM came out this wasn't possible, but I verified it works yesterday. What isn't there is the option to right-click an ADR and assign a security scope, but that's really not important.

The server admin can see the workstation admin's ADRs, but all the properties are grayed out and no changes can be made. The guts of this (as with all RBA) revolve around the collections each admin has access to: when a server admin creates an ADR targeting a collection that the workstation admin doesn't have access to, RBA kicks in and protects it.

So what's not to like about ADRs now?

Well, other than wishing they'd use saved searches instead of filters (another DCR submitted long ago), not much. I have just one thing driving me nuts before I let the admins know that they can start using ADRs now. Packages.

You can't make an ADR without filling out the package prompts in the wizard, so I'd have to let these admins also make patch packages on their own. I can even grant that specific permission in our SUM role. So why could this be bad, especially when the single-instance store in the Content Library is saving us space?

Well, for one, it isn't saving us space on the source files (and for that I really need to move that share to a dedupe volume). The other concern is that one admin could download a patch everyone is using and later just go delete it, breaking a lot of deployments. Sure, I could fix that by quickly downloading the patch again myself, but that could leave clients sitting around for a day before they retry. Maybe I'm overthinking this?

How to melt a SUP

We have 3 primary sites under a CAS (bad, but we have no choice with so many clients). Because we also have Nomad, we don't care where clients get assigned; we care only that each site has roughly the same client count as the others. But one site drifted about 30K clients over the others, so we simply made use of CM12 R2's ability to reassign clients to another site and moved enough to level out the counts.

The downside, and we knew this, was that each moved client would have to do a full inventory and SUP scan. That's a lot of traffic, but we've done this before without issue. This time, though, we melted the SUPs with all those full scans, and the wonderful Rapid-Fail Protection built into IIS decided to protect us by stopping our WSUS app pool. Late at night.

Now in CM12 post-SP1 (we're on R2), clients make use of the SUP list, a list of all possible SUPs available. Clients pick one SUP off that list and stick to it; they never change unless they can't reach their SUP after 4 attempts, 30 minutes apart (the 5th attempt goes to a new SUP). Well, with the app pool off, all clients trying to scan would fail and start looking for new SUPs. A new SUP means a full scan, and a full scan from 110K clients is far worse than from just 10K when we're moving things. Needless to say, our SUPs were working very hard the next morning to serve clients. On a normal day the NIC on one of our SUPs shows about 1Mbps of traffic, but after starting the WSUS app pool back up we were at over 850Mbps outbound per SUP.
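
If you're curious which SUP a given client is stuck to, one quick way to peek (a hedged aside, not something from the meltdown itself) is the local policy value the CM client writes for the Windows Update Agent:

# The CM client points the Windows Update Agent at its chosen SUP via local policy
Get-ItemProperty -Path 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate' |
    Select-Object -Property WUServer, WUStatusServer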

Disabling Rapid-Fail Protection is one nice fix to help keep that app pool from stopping, but we also raised the app pool's Private Memory Limit from the default of 5GB to 20GB (the SUPs have 24GB of RAM, so we were clearly wasting most of it). I know of another company with 85K clients on 2 SUPs who boosted their RAM from 24GB to 48GB to help IIS serve clients. Another option is to add more SUPs, but RAM is probably cheaper than another VM. For those of us weirdos with lots of RAM, it makes sense to crank the limit up if you can. We actually did this long ago, but we're thinking the in-place upgrade from Server 2012 to Server 2012 R2 wiped our settings out.
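
Here's a minimal sketch of both tweaks using the WebAdministration module. 'WsusPool' is the default WSUS application pool name, so adjust if yours differs, and note that the Private Memory Limit is specified in KB.

Import-Module WebAdministration
# Stop IIS from killing the pool under load
Set-ItemProperty -Path 'IIS:\AppPools\WsusPool' -Name failure.rapidFailProtection -Value $false
# Raise the Private Memory Limit to 20GB (value is in KB; 20GB = 20971520 KB)
Set-ItemProperty -Path 'IIS:\AppPools\WsusPool' -Name recycling.periodicRestart.privateMemory -Value 20971520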

By the way, the obvious 'treatment' during such a meltdown is to throttle IIS. We set our servers down to 50Mbps and the network team was happy; your setting will vary based on client count and bandwidth. Our long-term insurance here will be QoS. UPDATE: Jeff Carreon just posted a tidbit on how to throttle quickly in an emergency using PowerShell.
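
Not Jeff's script, but a minimal sketch of the same idea: IIS bandwidth throttling at the site level via maxBandwidth, which takes bytes per second. 'WSUS Administration' is the default WSUS site name; adjust for your environment.

Import-Module WebAdministration
# 50 Mbps is roughly 6,250,000 bytes per second
Set-ItemProperty -Path 'IIS:\Sites\WSUS Administration' -Name limits.maxBandwidth -Value 6250000
# To lift the throttle later, set it back to the unthrottled default (4294967295)
# Set-ItemProperty -Path 'IIS:\Sites\WSUS Administration' -Name limits.maxBandwidth -Value 4294967295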

So how do we keep our settings from being wiped out again? We ask Sherry, who knows DCM! Read more on her CIs to enforce these settings here.