Virtual IPs for public addresses

Launchpad blueprint:

https://blueprints.launchpad.net/tripleo/+specs/tripleo-juno-virtual-public-ips

The current public IP feature specifies the endpoint at which a cloud can be reached. High availability (HA) is typically highly desirable for that endpoint.

Making the public IP a virtual IP, rather than binding it locally to a single machine, should increase the availability of the clustered service once we scale the control plane beyond one machine.

Problem Description

Today, we run all OpenStack services with listening ports on one virtual IP.

This means that we’re exposing RabbitMQ, MySQL and possibly other cluster-only services to the world, when what we really want is public services exposed to the world and cluster-only services kept private. Deployers are (rightly) not exposing our all-services VIP to the world, which forces them to choose between a) no support for externally visible endpoints, b) all services attackable or c) manually tracking the ports involved and playing catch-up as we evolve things.

Proposed Change

Create a second virtual IP on a user-supplied network. Bind additional copies of the API endpoints that should be publicly accessible to that virtual IP. We need to keep presenting them internally as well (still via haproxy and the control virtual IP) so that servers without any public connectivity, such as hypervisors, can still use the APIs (though they may need to override the IP to use in their hosts files - we have facilities for that already).
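As a rough illustration of the binding scheme described above, the following Python sketch renders haproxy listen sections in which every service binds to the control virtual IP and only public APIs get an extra bind on the public virtual IP. The Service class, the render_listen helper, the service list, the ports and the addresses are all hypothetical; they are not taken from the existing TripleO templates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Service:
    """Hypothetical description of one load-balanced service."""
    name: str
    port: int
    public: bool  # True if the API should also be reachable from outside


def render_listen(service: Service, control_vip: str, public_vip: str,
                  backends: List[str]) -> str:
    """Render one haproxy listen section for the given service."""
    lines = [f"listen {service.name}"]
    # Every service stays reachable on the control-plane VIP.
    lines.append(f"  bind {control_vip}:{service.port}")
    # Only public APIs get an additional bind on the public VIP.
    if service.public:
        lines.append(f"  bind {public_vip}:{service.port}")
    for index, backend in enumerate(backends):
        lines.append(f"  server node{index} {backend}:{service.port} check")
    return "\n".join(lines)


if __name__ == "__main__":
    # Addresses come from documentation ranges and are purely illustrative.
    services = [Service("keystone", 5000, True), Service("mysql", 3306, False)]
    for svc in services:
        print(render_listen(svc, "192.0.2.10", "198.51.100.10",
                            ["192.0.2.21", "192.0.2.22"]))
```

Keeping the public flag on the service definition means a newly added cluster-only service stays off the public virtual IP unless someone opts it in.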

The second virtual IP could in principle be on a dedicated Ethernet card, or on a VLAN on a shared card. For now, let’s require the admin to specify the interface on which keepalived should provision the shared IP - be that br-ctlplane, vlan25 or eth2. Because the network topology may be independent, the keepalived quorum checks need to take place on the specified interface, even though this costs external IP addresses.
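To make the per-interface idea concrete, here is a minimal Python sketch that renders one keepalived vrrp_instance block per virtual IP, each tied to the interface the admin names. The Vip tuple, the render_vrrp_instance helper, the instance names, router ids and addresses are assumptions made for illustration; the real keepalived.conf template may be structured quite differently.

```python
from typing import NamedTuple


class Vip(NamedTuple):
    """Hypothetical description of one virtual IP managed by keepalived."""
    name: str        # vrrp_instance name
    address: str     # the shared IP address
    interface: str   # admin-specified interface, e.g. br-ctlplane or vlan25
    router_id: int   # should differ between instances on the same segment


def render_vrrp_instance(vip: Vip, priority: int = 100) -> str:
    """Render a keepalived vrrp_instance block for one virtual IP."""
    return "\n".join([
        f"vrrp_instance {vip.name} {{",
        "    state BACKUP",
        # VRRP advertisements are exchanged on this interface.
        f"    interface {vip.interface}",
        f"    virtual_router_id {vip.router_id}",
        f"    priority {priority}",
        "    virtual_ipaddress {",
        f"        {vip.address} dev {vip.interface}",
        "    }",
        "}",
    ])


if __name__ == "__main__":
    vips = [
        Vip("VI_CONTROL", "192.0.2.10", "br-ctlplane", 51),
        Vip("VI_PUBLIC", "198.51.100.10", "vlan25", 52),
    ]
    print("\n\n".join(render_vrrp_instance(vip) for vip in vips))
```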

The user must be able to specify the same undercloud network as they do today so that small installs are not made impossible - requiring two distinct networks is likely hard for small organisations. Using the same network would not imply using the same IP address - a dedicated IP address is still useful because it permits better testing confidence and allows for simple exterior firewalling of the cluster.
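The constraint described above (same network allowed, same address not) could be checked along these lines. This is only a sketch using Python's standard ipaddress module; the check_vips function and its parameters are hypothetical and do not correspond to any existing TripleO code.

```python
import ipaddress


def check_vips(control_vip: str, public_vip: str, public_network: str) -> bool:
    """Validate the two virtual IPs.

    Returns True when both VIPs live on the same network (the small-install
    case) and False when the public VIP is on a separate network.
    """
    control = ipaddress.ip_address(control_vip)
    public = ipaddress.ip_address(public_vip)
    network = ipaddress.ip_network(public_network)

    # Sharing a network is fine; sharing the address is not.
    if public == control:
        raise ValueError("public VIP must differ from the control VIP")
    if public not in network:
        raise ValueError("public VIP is outside the supplied network")
    return control in network


if __name__ == "__main__":
    # Small install: one network, two distinct addresses.
    print(check_vips("192.0.2.10", "192.0.2.11", "192.0.2.0/24"))
    # Larger install: public VIP on its own network.
    print(check_vips("192.0.2.10", "198.51.100.10", "198.51.100.0/24"))
```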

Alternatives

We could not do HA for the public endpoints - not really an option.

We could not do public endpoints and instead document how to provide border gateway firewalling and NAT through to the endpoints. This just shifts the problem onto infrastructure we are not deploying, making it harder to deploy.

Security Impact

Our security story improves by making this change, as we can potentially start firewalling the intra-cluster virtual IP to only allow known nodes to connect. Short of that, our security story has improved since we started binding to specific IPs only, as that made opening a new IP address not actually expose core services (other than SSH) on it.
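The potential firewalling of the intra-cluster virtual IP might look something like the following Python sketch, which generates iptables commands that accept traffic to the control virtual IP from known nodes and drop everything else. This is not part of the proposed work; the firewall_rules helper and the addresses are hypothetical, and a real deployment would need to consider existing chains and rule ordering.

```python
from typing import List


def firewall_rules(control_vip: str, known_nodes: List[str]) -> List[str]:
    """Build iptables commands restricting access to the control VIP."""
    # Accept rules come first so that known nodes match before the drop.
    rules = [
        f"iptables -A INPUT -d {control_vip} -s {node} -j ACCEPT"
        for node in known_nodes
    ]
    # Anything else addressed to the control VIP is dropped.
    rules.append(f"iptables -A INPUT -d {control_vip} -j DROP")
    return rules


if __name__ == "__main__":
    for rule in firewall_rules("192.0.2.10", ["192.0.2.21", "192.0.2.22"]):
        print(rule)
```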

Other End User Impact

End users will need to be able to find out about the new virtual IP. That should be straightforward via our existing mechanisms.

Performance Impact

None anticipated.

Other Deployer Impact

Deployers will require an additional IP address either on their undercloud ctlplane network (small installs) or on their public network (larger/production installs).

Developer Impact

None expected.

Implementation

Assignee(s)

Primary assignee:

lifeless (hahahaha)

Other contributors:

None.

Work Items

  • Generalise keepalived.conf to support multiple VRRP interfaces.

  • Add support for binding multiple IPs to the haproxy configuration.

  • Add logic to incubator and/or heat templates to request a second virtual IP.

  • Change heat templates to bind public services to the public virtual IP.

  • Possibly tweak setup-endpoints to cooperate, though the prior support should be sufficient.

These are out of scope for this, but necessary to use it - I intend to put them in the discussion in Dan’s network overhaul spec.

  • Add optional support to our heat templates to boot the machines with two NICs, not just one - so that we have an IP address for the public interface when it’s a physical interface. We may find there are ordering / enumeration issues in Nova/Ironic/Neutron to solve here.

  • Add optional support to our heat templates for statically allocating a port from neutron and passing it into the control plane for when we’re using VLANs.

Dependencies

None.

Testing

This will be on by default, so our default CI path will exercise it.

Additionally, we’ll be using it in the upcoming VLAN test job, which will give us confidence that it works when the networks are partitioned.

Documentation Impact

The main task is to add this feature to the manual.

References

None