Load Balancing VMware Photon Platform 1.2 with Citrix Netscaler

Updated 22/06/2017 : Fixed HSTS Policy configuration on the Netscaler (was bound as a Request policy not a Response Policy) and updated quorum discussion after testing with three Photon Controller Nodes

Greetings; this post is the fourth in a series for June in which I have been focusing on the Photon Platform 1.2. It outlines how to publish the application through a Citrix Netscaler and my findings along the way. The Photon Platform deployment appliance deploys an HAProxy Load Balancer as part of the deployment. When deployed using the photon-setup installer, the HAProxy deployment has the following limitations:

  1. Only one Load Balancer VM is deployed (requires HAProxy skills to scale the LB platform)
  2. The installer does not allow an “external URI” to be defined for Lightwave or the Photon Controller, which means that the Cookies issued and the Auth Redirects behave incorrectly when using DNS names (I have not yet been able to figure out how to change these settings through the API/config files)
  3. The SSL certificates are all signed by the Lightwave infrastructure and are not trusted
  4. As deployed by the Deployment appliance, the response from https://PhotonController:9000/v1/system/auth returns only a single IP for Lightwave, and this is not dynamically updated if that endpoint goes offline (Issue 134 has been logged regarding this)
  5. When the primary Lightwave server goes offline users can’t authenticate, because the Open ID Connect Clients are not registered on the additional servers and the endpoint registration does not occur on the Photon Controller Platform
  6. The Photon Controllers are deployed in a cluster with (by default) a Majority Quorum; the Quorum setting can be adjusted via the API.
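To make the quorum point concrete, here is a minimal sketch of the arithmetic, assuming "Majority Quorum" means the usual strict majority (more than half of the nodes). The function names are my own and are not part of any Photon API.

```python
def majority_quorum(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be offline while a majority still remains."""
    return nodes - majority_quorum(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} node(s): quorum {majority_quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption a three node cluster keeps quorum with one controller offline, while a two node cluster cannot lose any.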

So out of the box there are a few issues with the scalability and availability of the Management Platform. You can still replace the Load Balancing and improve the availability (a little bit), and the issues discovered during the prototyping (Issue 133 and Issue 134) have been logged on GitHub. I will continue to look for ways to address these issues; in the meantime, be aware of the following points of failure in Photon Platform 1.2 that I found during my testing:

  1. Authentication (Lightwave) : All configuration is on a single instance
  2. The Management User Interface : Will not work if the Load Balancer is offline (the API still functions on Port 9000 but 443 stops working)

Hopefully this document will evolve as these issues are addressed and the Platform becomes more resilient to failures. In the meantime I would recommend deploying the Netscalers in HA mode to address point of failure #2 listed above.

Configuration

Step 1. Allocate VIP, create some DNS records and Generate some SSL Certificates

The first step is to allocate the Virtual IP addresses (VIPs) that will be used to provide public access to the backend services (Lightwave and Photon Controller) and perform any NAT/Firewalling needed to make these accessible to clients. Once this has been done, create DNS (A) records for the two services (Lightwave for Authentication and Photon Platform for the Management/API access), e.g. lightwave.pigeonnuggets.com and photonplatform.pigeonnuggets.com
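Before moving on it is worth confirming that the new records resolve to the VIPs. The following is a rough sketch using only the Python standard library; the two IP addresses are placeholders from the documentation range and are assumptions, so substitute your own VIPs.

```python
import socket
from typing import Callable, Dict

# Placeholder VIPs (assumptions); replace with the addresses you allocated.
EXPECTED = {
    "lightwave.pigeonnuggets.com": "192.0.2.20",
    "photonplatform.pigeonnuggets.com": "192.0.2.10",
}


def check_records(
    expected: Dict[str, str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, bool]:
    """Return, per DNS name, whether it resolves to the expected VIP."""
    results = {}
    for name, vip in expected.items():
        try:
            results[name] = resolve(name) == vip
        except OSError:  # covers socket.gaierror for names that do not resolve
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_records(EXPECTED).items():
        print(f"{name}: {'OK' if ok else 'MISMATCH or unresolved'}")
```

The resolver is passed in as a parameter so the check can be exercised without touching real DNS.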

Next generate some Web Server certificates against your Enterprise PKI/Public Certificate authority with the Subject Name (and Subject Alternative Names) set to the DNS records created.
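If you generate the requests with OpenSSL, the Subject Alternative Names have to be supplied through a config file. As a sketch only, the helper below (a hypothetical function of my own) writes out a minimal request config for one DNS name plus any extra names; the section names `dn`, `v3_req` and `alt_names` are arbitrary labels.

```python
from typing import List


def openssl_request_config(common_name: str, alt_names: List[str]) -> str:
    """Build a minimal OpenSSL req config with a CN and DNS SAN entries."""
    # The CN is repeated as the first SAN, since clients validate against SANs.
    names = [common_name] + [n for n in alt_names if n != common_name]
    lines = [
        "[req]",
        "prompt = no",
        "distinguished_name = dn",
        "req_extensions = v3_req",
        "",
        "[dn]",
        f"CN = {common_name}",
        "",
        "[v3_req]",
        "subjectAltName = @alt_names",
        "",
        "[alt_names]",
    ]
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(names, start=1)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(openssl_request_config("photonplatform.pigeonnuggets.com", []))
```

Saved as, say, photon.cnf, the output can be passed to `openssl req -new -newkey rsa:2048 -nodes -keyout photon.key -out photon.csr -config photon.cnf` and the resulting CSR submitted to your CA. The file names and key size here are examples, not requirements.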

Step 2. Prepare Lightwave

Lightwave is used to generate the access tokens (in the form of a cookie) that are used by Photon Controller to authenticate client requests. When a client contacts the Photon Controller without an access token, the Photon Controller constructs a 302 Redirect for the client which contains:

  1. The domain to authenticate
  2. The Claims that the token should contain
  3. Where the clients should be redirected back to after authentication
  4. A client id and;
  5. The hostname of the originating request
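To see these fields for yourself, you can pull apart the Location header of the 302. The sketch below assumes the redirect uses the standard OpenID Connect query parameters (`client_id`, `scope`, `redirect_uri`) and that the originating hostname is the host part of `redirect_uri`; the URL in the example is entirely made up and is not a real Lightwave path.

```python
from urllib.parse import parse_qs, urlsplit


def describe_auth_redirect(location: str) -> dict:
    """Break a 302 Location URL into the parts discussed above."""
    parts = urlsplit(location)
    query = parse_qs(parts.query)
    redirect_uri = query.get("redirect_uri", [""])[0]
    return {
        "auth_host": parts.hostname,
        # Assumption: the domain to authenticate appears somewhere in the path.
        "path": parts.path,
        "scopes": query.get("scope", [""])[0].split(),
        "redirect_uri": redirect_uri,
        "client_id": query.get("client_id", [""])[0],
        "origin_host": urlsplit(redirect_uri).hostname,
    }


if __name__ == "__main__":
    # Hypothetical redirect, for illustration only.
    example = (
        "https://192.0.2.20/authorize/example.local"
        "?client_id=example-client-id"
        "&scope=openid%20example_scope"
        "&redirect_uri=https%3A%2F%2F192.0.2.10%3A4343%2Fcallback"
    )
    for key, value in describe_auth_redirect(example).items():
        print(f"{key}: {value}")
```

When the request came in via an IP address, that IP is what shows up as the origin host, which is why the cookie and redirects misbehave behind a DNS name.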

Lightwave will then generate a cookie with the Domain Name that matches the provided host in the Redirect URI and pass this to the requester for use with Photon. Before servicing the request, Lightwave validates whether the “Redirect_URI” provided is allowed for the Client with the Client Id provided, and as such we need to add the new domain names to the list of allowed URIs. To do this:

  1. Navigate to the LightWave Domain Controller administration page (https://lightwavefqdn/lightwaveui/), enter the LightWave domain and, when prompted, log in with the LightWave administrator account
  2. Select Service Providers from the side menu, select Open ID Connect Client and locate the Client ID for the Photon Controller Management UI; you can find this by examining the 302 Redirect URL you get when navigating to the Photon Controller via the IP, or by looking for the entry with “https://<IP of load balancer>:4343/logout_callback”. Click Edit
  3. Amend the Logout URI to https://<fqdn of DNS record for Photon Controller LB>:4343/logout_callback, then enter the following URIs and click Add followed by Save
  4. If you intend to decommission the default load balancer or not allow connections via IP, also clean up any references to it by clicking the X next to the relevant objects
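The reason these steps matter can be shown with a toy model of the check. This is only a sketch: it assumes the URIs are compared as exact strings per client, which is the usual OpenID Connect behaviour, but I have not confirmed how Lightwave implements it. The client id and the IP are placeholders.

```python
from typing import Dict, Set


def uri_allowed(registered: Dict[str, Set[str]], client_id: str, uri: str) -> bool:
    """True only if the client exists and the URI is registered for it verbatim."""
    return uri in registered.get(client_id, set())


# Placeholder registration as left by the installer (IP based only).
clients = {"example-client-id": {"https://192.0.2.10:4343/logout_callback"}}
new_uri = "https://photonplatform.pigeonnuggets.com:4343/logout_callback"

before = uri_allowed(clients, "example-client-id", new_uri)
clients["example-client-id"].add(new_uri)  # the equivalent of Add followed by Save
after = uri_allowed(clients, "example-client-id", new_uri)

if __name__ == "__main__":
    print(f"allowed before registering: {before}, after: {after}")
```

Until the DNS based URI is registered against the client, a request arriving via the new name is rejected even though the same client works via the IP.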

Step 3. Configure Netscaler Backend Objects and Request Rewrite Policies

The diagram (click here for a PDF version) outlines the configuration on the Load Balancer to deliver the Lightwave and Photon Controller Platform management plane to users. For Photon Platform to continue to work, it is important that the VIP for the Load Balancer is the IP that was assigned to the HAProxy machine. The Photon Controller uses the External URI (the IP) to make backend web requests when there is more than one Photon Platform deployed, and if you don’t reuse the VIP it breaks. I don’t have a better solution as yet, as during testing I wasn’t able to amend these external_URI values successfully…watch this space.

As noted in the limitations above, I was not able to determine how to set/change the “External URI” post deployment for the Lightwave and the Photon Controller platforms. As a result, some of the responses between the client and the Photon Controller must be rewritten with the correct external URIs (instead of internal IP addresses).

Health Monitors and Load Balancing Objects

The following configuration defines the basic Load Balancing objects and health monitors that are required; for the health of the Photon Controllers an API call is made to https://server:9000/v1/available, which returns whether the service is available
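To illustrate what the controller health check is doing, independent of the Netscaler syntax, here is a small Python probe against the same endpoint. It is a sketch under two assumptions: that an HTTP 200 means "available" (I have not documented the exact response), and that certificate verification has to be switched off because the backend certificates are signed by Lightwave. Do not disable verification outside a lab.

```python
import ssl
import urllib.request


def available_url(server: str) -> str:
    """URL of the availability endpoint on a Photon Controller."""
    return f"https://{server}:9000/v1/available"


def is_up(status: int) -> bool:
    # Assumption: only HTTP 200 is treated as healthy.
    return status == 200


def probe(server: str, timeout: float = 5.0) -> bool:
    """Return True if the controller answers the availability call."""
    context = ssl.create_default_context()
    context.check_hostname = False       # backend certs are not trusted
    context.verify_mode = ssl.CERT_NONE  # lab use only
    try:
        with urllib.request.urlopen(
            available_url(server), timeout=timeout, context=context
        ) as response:
            return is_up(response.status)
    except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
        return False


if __name__ == "__main__":
    import sys
    for host in sys.argv[1:]:
        print(f"{host}: {'UP' if probe(host) else 'DOWN'}")
```

Run it with the controller addresses as arguments to compare what the load balancer should be seeing with what each node actually returns.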

Rewrite and Responder Policy

The following rewrite and responder policies are used to:

  • Insert the required headers
  • Replace the internal IP addresses in the headers with the Load Balancer VIP DNS names
  • Replace the API Authentication endpoint URI
  • Redirect to Port 4343 if a browser client hits the root of the API service
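The effect of these policies can be modelled outside the Netscaler. The Python below is a simplified illustration, not the Netscaler configuration: the IP to name mapping uses placeholder addresses, the HSTS max-age is an arbitrary example value, and only the host part of a Location header is rewritten.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

# Placeholder internal IPs mapped to the external DNS names (assumptions).
HOST_MAP = {
    "192.0.2.10": "photonplatform.pigeonnuggets.com",
    "192.0.2.20": "lightwave.pigeonnuggets.com",
}


def rewrite_location(location: str, host_map: Dict[str, str]) -> str:
    """Swap an internal IP in a Location header for its external DNS name."""
    parts = urlsplit(location)
    external = host_map.get(parts.hostname or "")
    if external is None:
        return location  # not one of ours, leave untouched
    netloc = external if parts.port is None else f"{external}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


def add_response_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Insert HSTS on the response (it belongs on responses, not requests)."""
    out = dict(headers)
    out.setdefault("Strict-Transport-Security", "max-age=31536000")
    return out


def respond_to_root(path: str, external_host: str) -> Optional[Tuple[int, str]]:
    """Send browsers that hit the API root to the UI on port 4343."""
    if path == "/":
        return 302, f"https://{external_host}:4343/"
    return None  # anything else is passed to the backend


if __name__ == "__main__":
    print(rewrite_location("https://192.0.2.10:4343/logout_callback", HOST_MAP))
    print(add_response_headers({"Content-Type": "text/html"}))
    print(respond_to_root("/", "photonplatform.pigeonnuggets.com"))
```

Each function corresponds loosely to one policy, which makes it easier to reason about which direction (request or response) a given policy has to be bound in.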

You can also find my complete ns.config for the solution above, with the passwords for the Certificates and the nsroot password removed; it should be pretty easy to implement.

The Result

Hopefully you now have the solution Load Balanced through the Netscaler via DNS names and not directly via IP. I hope you find this information helpful and that it saves you some pain. Hopefully some of the issues with the Lightwave authentication configuration will be addressed so that the solution can be deployed in a highly available, scalable manner. Enjoy.