BUGFIX: Federated Identity in vCloud Director – “Cannot remove Entity ID for SAML identification for org” when Regenerating the Certificate

UPDATED 20/05/2017: This issue has now been fixed in vCloud Director for Service Providers – upgrading to this version will solve this issue.

Happy Friday! A quick write-up on a bug affecting the vCloud Director SAML Identity Provider component. The bug manifested after an Identity Provider was configured for one ADFS server and then changed to another. After the change, attempting the Regenerate Certificate function threw “Cannot remove Entity ID for SAML identification for org”, and accessing the Metadata for Federation returned “HTTP 500 ERROR javax.servlet.ServletException : Error initializing metadata”.


A bug exists that is known to occur if Federation has been configured previously and then changed to a new identity provider.

Known Affected: vCloud Director for Service Providers (all versions including 8.20)

VMware Support have advised that this is a known issue and Engineering have a fix which will be implemented in the next release. For now, the following will get you back up and running.

The following assumes your vCloud database is running on MSSQL and named vcloud; substitute queries as required to meet your environment.

Step 1. Take a backup of the vCloud Director database
Step 2. Log on to the tenancy and uncheck “Use SAML Identity Provider”

Step 3. Execute the following query to get the OrgId for the affected Organization

SELECT [org_id], [name], [description]
FROM [vcloud].[dbo].[organization]

Step 4. Identify the SAML provider definition id (the provider_definition_id column) by executing the following query against the identity_provider table

SELECT [id], [org_id], [provider_type], [provider_definition_id], [is_enabled]
FROM [vcloud].[dbo].[identity_provider]
WHERE [org_id] = <OrgId from Step 3>
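If you know the Organization's name, Steps 3 and 4 can be collapsed into a single lookup. The following is only a sketch: it assumes [org_id] is the column linking the two tables (as the queries above suggest), and <OrgName> is a placeholder for your own Organization name, not something from the vCloud schema.

```sql
-- Sketch: look up the org and its identity provider row in one pass.
-- Assumption: [org_id] joins organization to identity_provider.
-- <OrgName> is a placeholder; substitute the affected Organization's name.
SELECT o.[org_id],
       o.[name],
       ip.[id],
       ip.[provider_type],
       ip.[provider_definition_id],
       ip.[is_enabled]
FROM [vcloud].[dbo].[organization] AS o
JOIN [vcloud].[dbo].[identity_provider] AS ip
    ON ip.[org_id] = o.[org_id]
WHERE o.[name] = '<OrgName>';
```

Note down both org_id and provider_definition_id from the result; the later steps need them.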

Step 5. Set the metadata to a blank value for the provider definition id by executing the following:

UPDATE saml_id_provider_settings SET metadata = '' WHERE id = <Provider_definition_id from Step 4.>

Verify by executing the following query:

SELECT [id], [metadata]
FROM [vcloud].[dbo].[saml_id_provider_settings]
WHERE id = <Provider_definition_id from Step 4.>
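If you would rather not rely on the backup alone, the Step 5 update can be wrapped in a transaction that only commits when exactly one row was changed. This is a cautious sketch rather than anything VMware supplied; the one-row expectation is my assumption, based on id identifying a single settings row.

```sql
-- Sketch: guarded version of the Step 5 update.
-- Assumption: exactly one row should match the provider definition id.
BEGIN TRANSACTION;

UPDATE [vcloud].[dbo].[saml_id_provider_settings]
SET [metadata] = ''
WHERE [id] = <Provider_definition_id from Step 4.>;

-- @@ROWCOUNT holds the number of rows touched by the UPDATE just above.
IF @@ROWCOUNT = 1
    COMMIT TRANSACTION;
ELSE
    ROLLBACK TRANSACTION;
```

If the transaction rolls back, re-check the id from Step 4 before trying again.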

Step 6. Execute the following query and confirm that entity_id for the Organization is currently set to a blank value rather than NULL

SELECT [org_id], [expiration_date], [is_cert_expiry_notified], [entity_id], [role_attribute]
FROM [vcloud].[dbo].[federation_settings]
WHERE [org_id] = <OrgId from Step 3>
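A blank string and NULL are easy to misread in a results grid, so it can help to have the query spell out which one you have. A minimal sketch, assuming entity_id is a character column; the entity_id_state alias is just a label chosen for this example.

```sql
-- Sketch: report explicitly whether entity_id is NULL, blank or populated.
-- Assumption: [entity_id] is a character (varchar/nvarchar) column.
SELECT [org_id],
       CASE
           WHEN [entity_id] IS NULL THEN 'NULL'
           WHEN [entity_id] = ''    THEN 'blank'
           ELSE 'has a value'
       END AS entity_id_state
FROM [vcloud].[dbo].[federation_settings]
WHERE [org_id] = <OrgId from Step 3>;
```

A result of 'blank' is the state this workaround expects to find.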

Step 7. Set entity_id to NULL by performing an UPDATE

UPDATE federation_settings SET entity_id = NULL where org_id = <OrgId from Step 3>
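As a slightly more defensive variant, the update can be restricted to the blank state seen in Step 6, so that a populated entity_id is never overwritten by accident. Again a sketch under that assumption; rows_updated is just an illustrative alias.

```sql
-- Sketch: only null out entity_id when it is currently blank.
UPDATE [vcloud].[dbo].[federation_settings]
SET [entity_id] = NULL
WHERE [org_id] = <OrgId from Step 3>
  AND [entity_id] = '';

-- Expect 1 here; 0 means the row was not in the blank state.
SELECT @@ROWCOUNT AS rows_updated;
```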

Step 8. Log back into vCloud Director and click Regenerate on the affected Org

Step 9. Verify the change has been successful by clicking the Metadata link; the metadata should generate correctly and all functions should now be restored without throwing an HTTP 500

Step 10. Set up your SAML Identity Provider. QED

ADFS 4.0 Nuggets/Gotchas

Today I had my first ADFS 4.0 (Windows Server 2016) deployment for a customer and found a few little gotchas that you might run into, all with some pretty quick fixes.

Issue 1: IdpInitiatedSignonPage is disabled by default
This is usually the first test performed to check that ADFS is working as expected: fire up a browser and navigate to https://domain.tld/adfs/ls/IdpInitiatedSignon.aspx. On ADFS 4.0 this will throw “An error occurred”. On your ADFS server you will see Event 364 in the event log, with the critical piece of information in the exception: “IdPInitiatedSignonPageDisabledException”

Resolution: Log on to the Farm Primary and execute Set-AdfsProperties -EnableIdpInitiatedSignonPage $true

Issue 2: When attempting to add a new ADFS server to the Farm, during the prerequisite check you receive “The HTTP request was forbidden with client authentication scheme ‘Anonymous’” and “Unable to retrieve configuration from the primary server. The HTTP request was forbidden with client authentication scheme ‘Anonymous’”

Resolution: Something is inspecting the traffic. In my case there was an HTTP proxy configured on the server; once the proxy was removed there were no further issues. Don’t forget to check the WinHTTP proxy as well (netsh winhttp show proxy)

Issue 3: After upgrading/migrating ADFS to a new Windows Server 2016 server, ADFS throws HTTP 400 errors and Kerberos errors appear in the event log (Event ID 4)

Resolution: As the error indicates, this is an SPN issue. Find the service account and update its servicePrincipalName attribute to include the value which is causing the issue (http/XXXXXXXXX)

Otherwise, ADFS 4.0 is generally very similar to ADFS in Windows Server 2012 R2 and is a pretty straightforward deployment/upgrade. Happy implementing!