The case began when an advanced user accidentally entered his own account as a Run As account in the SCSM 2010 SP1 console. Later he deleted it from Run As accounts. At the time SCSM 2010 SP1 was not yet in production, so this was not an issue. When the System Center Service Manager environment went into production, alerts like this one started to appear from SCOM 2007 R2 monitoring:
The Health Service could not log on the RunAs account <ACCOUNT NAME> for management group <MANAGEMENT GROUP NAME>. The error is Logon failure: unknown user name or bad password.(1326L). This will prevent the health service from monitoring or performing actions using this RunAs account.
The alert shows which account is causing the problem. So I planned to open the SCSM console, go to the Administration pane, open Run As accounts and delete the account from there. To my surprise, no user account was present there, only service accounts that were working normally. The next step was to verify in the event logs on the SCSM server that this error was actually being logged:
Log Name: Operations Manager
Source: HealthService
Event ID: 7000
Task Category: Health Service
Level: Error
Description:
The Health Service could not log on the RunAs account <ACCOUNT NAME> for management group <MANAGEMENT GROUP NAME>. The error is Logon failure: unknown user name or bad password.(1326L). This will prevent the health service from monitoring or performing actions using this RunAs account.
When I looked at the logs I found that this error was logged almost every hour. I checked the account in question in Active Directory and it was locked out. This led me to think that the account was stored somewhere with an old password and was still being used by Service Manager. Because the architecture of SCSM is similar to the SCOM architecture, I figured that Run As accounts are saved in the ServiceManager DB and that this user account might still be stuck in the database because it had somehow not been deleted properly.
I did some digging on the Internet and found this article: Best Practices: Service Manager 2010 Management Pack for Operations Manager 2007 R2. Point 6 describes the same issue and proposes a workaround:
I Have Previously Deleted Run As accounts from the UI: If you have deleted Run As accounts from the UI, the symptom will be that you get an alert which tells you that a Run As account is invalid, and when you look at the credentials of the Run As account, you notice that it is not shown in the Run As account view in the Service Manager console.
You can either ignore the alert (if you close it, it will right back), or you can disable the monitor. We are currently looking into how we can help you get out of this state and will hopefully have a solution for SP1. I will make sure to update this post once we have a definitive plan.
Best Practice to Avoid this Issue: The best way to avoid this issue is to never delete Run As accounts from the UI. You can reuse existing Run As accounts by changing their name and/or credentials. If you would like to stop using a run as account, you can change its credentials to Local System and change the name to something easy to remember such as “Inactive.”
This way, you will not end up with stale Run As accounts which cause events to be placed in the Operations Manager event log.
As you can see, the issue exists in SCSM 2010 and SCSM 2010 SP1 CU3. After reading this workaround I contacted the user to verify that he had entered his account in SCSM and later deleted it. He confirmed this was the case. I decided to implement the workaround: I entered the user's credentials in Run As accounts again and later changed the account to System. The issue persisted, and now I was also receiving errors from the Health Service that the user account could not log on locally on the SCSM server. I then decided to use my own account as a dummy account and replaced the user's account with mine in the SCSM console. The result was that the Health Service continued to use the user's account, and after I changed the password for my account, logon failure alerts were logged for my account as well. Using my own account as the dummy account was not a smart move. You could call it a dummy move.
So we now had two user accounts entered in the SCSM database that were generating alerts. Clearly the workaround was not working in our case, and clearly this was a bug in SCSM 2010 SP1. I could have tried to delete the accounts from the database directly with a SQL query, but as SQL is not my strong side and this was a production service, I decided to contact Microsoft Support for a resolution. So a case was logged with Microsoft. After several e-mails exchanging information with a Premier Field Engineer (PFE), the issue was identified as a bug and the PFE contacted the SCSM support group. The SCSM support group confirmed it was a bug. They also said that no hotfix was planned for this issue, but that they would provide us with a workaround. The good part is that this issue is fixed in SCSM 2012: accounts deleted from the SCSM 2012 console are also deleted from the database. I verified this in my home test lab as well. After several days we received the workaround in the form of a SQL query that deletes the unneeded accounts. While waiting for the solution, I entered both user accounts with their current passwords in the SCSM console so that the Health Service would not lock them out by using old passwords.
Here is the SQL query that was provided (you should execute the queries against the ServiceManager DB):
DECLARE @SecureStorageElementId uniqueidentifier;
-- change "GUID" to the SecureStorageElementID
-- of the invalid runas account
SET @SecureStorageElementId = 'GUID';
BEGIN TRANSACTION
EXEC dbo.p_CredentialManagerStorageDelete @SecureStorageElementId;
COMMIT TRANSACTION
You can find the GUIDs of the problematic accounts by listing all accounts and their SecureStorageElementID:
select * from CredentialManagerSecureStorage
In our case we found that every time we deleted an account and entered it again, a new entry was created in the ServiceManager DB with a different SecureStorageElementID.
We tested the query first in a test environment, then implemented the solution in our production environment. All went smoothly and both user accounts were deleted from the SCSM database. The errors were no longer logged in the SCSM event log or in SCOM. Before executing the queries against the ServiceManager DB, make sure you have first deleted the accounts in question in the SCSM console as well.
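To confirm that a given entry is really gone after the delete, a check along these lines can be run against the ServiceManager DB. This is only a sketch and was not part of what Microsoft Support provided. It assumes the GUID column in CredentialManagerSecureStorage is named SecureStorageElementId, like the parameter of the delete procedure, so compare it with the output of the select * query first.

```sql
-- Sketch only: check whether the stale Run As entry still exists.
-- Assumption: the GUID column is named SecureStorageElementId;
-- verify the column name in the "select *" output before running.
DECLARE @SecureStorageElementId uniqueidentifier;
-- change "GUID" to the SecureStorageElementID you deleted
SET @SecureStorageElementId = 'GUID';

SELECT COUNT(*) AS RemainingRows
FROM CredentialManagerSecureStorage
WHERE SecureStorageElementId = @SecureStorageElementId;
```

A result of 0 means no row with that GUID is left in the table. It is still worth watching the Operations Manager event log for a few hours, since the errors in our case appeared roughly hourly.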
Before actually implementing this solution in your environment I strongly recommend these actions:
1. Test the query in a Test environment.
2. Back up your production database before executing the queries.
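For the backup step, a one-off copy-only backup is one option, sketched below. The database name ServiceManager is the default name used in this post, and the file path is a hypothetical placeholder, so adjust both to your environment. COPY_ONLY keeps this extra backup from interfering with the regular backup chain.

```sql
-- Sketch only: ad hoc backup before touching the database.
-- The path below is a hypothetical example; use a location
-- that exists on your SQL Server and has enough free space.
BACKUP DATABASE [ServiceManager]
TO DISK = N'D:\Backup\ServiceManager_before_runas_cleanup.bak'
WITH COPY_ONLY, INIT, CHECKSUM;

-- Check that the backup file is readable and its checksums are valid.
RESTORE VERIFYONLY
FROM DISK = N'D:\Backup\ServiceManager_before_runas_cleanup.bak'
WITH CHECKSUM;
```

RESTORE VERIFYONLY does not restore anything; it only checks that the backup set is complete and readable.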
This solution is provided “AS IS” with no warranties from Microsoft or me. Neither Microsoft nor I am responsible if you mess up your SCSM database or SCSM environment by executing the procedure incorrectly.
Many thanks to Microsoft Support for providing us a workaround for fixing this issue. Another case solved.