Provision nodes without Nova and Glance

https://blueprints.launchpad.net/tripleo/+spec/nova-less-deploy

Currently, the TripleO undercloud uses Heat, Nova, Glance, Neutron and Ironic for provisioning bare metal machines. This blueprint proposes excluding Heat, Nova and Glance from this flow, and removing Nova and Glance completely from the undercloud.

Problem Description

Making TripleO workflows use Ironic directly to provision nodes has quite a few benefits:

  1. First and foremost, getting rid of the horrible “no valid hosts found” exception. The scheduling will be much simpler and the errors will be clearer.

    Note

    This and many other problems with using Nova in the undercloud come from the fact that Nova is cloud-oriented software, while the undercloud is more of a traditional installer. In the “pet vs cattle” metaphor, Nova handles the “cattle” case, while the undercloud is the “pet” case.

  2. Also important for the generic provisioner case, we’ll be able to get rid of Nova and Glance, reducing the memory footprint.

  3. We’ll get rid of pre-deploy validations that currently try to guess what Nova scheduler will expect.

  4. We’ll be able to combine nodes deployed by Ironic with pre-deployed servers.

  5. We’ll become in charge of building the configdrive, potentially putting more useful things there.

  6. Hopefully, scale-up will be less error-prone.

Also in the future we may be able to:

  1. Integrate things like building RAID on demand much more easily.
  2. Use introspection data in scheduling and provisioning decisions. Particularly, we can automate handling root device hints.
  3. Make Neutron optional and use static DHCP and/or os-net-config.

Proposed Change

Overview

This blueprint proposes replacing the triad Heat-Nova-Glance with Ironic driven directly by Mistral. To avoid placing Ironic-specific code into tripleo-common, a new library metalsmith has been developed and accepted into the Ironic governance.

As part of the implementation, this blueprint proposes completely separating the bare metal provisioning process from software configuration, including at the CLI level. This has two benefits:

  1. Having a clear separation between two error-prone processes simplifies debugging for operators.
  2. Reusing the existing deployed-server workflow simplifies the implementation.

In the distant future, the functionality of metalsmith may be moved into the Ironic API itself. In that case metalsmith will be phased out, while the Mistral workflows stay the same.

Operator workflow

As noted in Overview, the CLI/GUI workflow will be split into hardware provisioning and software configuration parts (the former being optional).

  1. In addition to existing Heat templates, a new file baremetal_deployment.yaml will be populated by an operator with the bare metal provisioning information.

  2. Bare metal deployment will be conducted by a new CLI command or GUI operation using the new deploy_roles workflow:

    openstack overcloud node provision \
       -o baremetal_environment.yaml baremetal_deployment.yaml

    This command will take the input from baremetal_deployment.yaml, provision requested bare metal machines and output a Heat environment file baremetal_environment.yaml to use with the deployed-server feature.

  3. Finally, the regular deployment is done, including the generated file:

    openstack overcloud deploy \
       <other cli arguments> \
       -e baremetal_environment.yaml \
       -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
       -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-bootstrap-environment-centos.yaml \
       -r /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-server-roles-data.yaml

For simplicity the two commands can be combined:

openstack overcloud deploy \
   <other cli arguments> \
   -b baremetal_deployment.yaml \
   -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-environment.yaml \
   -e /usr/share/openstack-tripleo-heat-templates/environments/deployed-server-bootstrap-environment-centos.yaml \
   -r /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-server-roles-data.yaml

The new argument --baremetal-deployment/-b will accept the baremetal_deployment.yaml and do the deployment automatically.

Breakdown of the changes

This section describes the required changes in depth.

Image upload

As Glance will no longer be used, images will have to be served from other sources. Ironic supports HTTP and file sources for its images. For the undercloud case, the file source seems to be the most straightforward, although the Edge case may require using HTTP images.

To make both cases possible, the openstack overcloud image upload command will now copy the three overcloud images (overcloud-full.qcow2, overcloud-full.kernel and overcloud-full.ramdisk) to /var/lib/ironic/httpboot/overcloud-images. This will allow referring to images both via file:///var/lib/ironic/httpboot/overcloud-images/... and http(s)://<UNDERCLOUD HOST>:<IPXE PORT>/overcloud-images/....
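A minimal sketch of how the two kinds of references relate to the same copied file. The helper name image_locations, the host 192.0.2.1 and the port 8088 are placeholders chosen for illustration, not values defined by this blueprint.

```python
from pathlib import PurePosixPath

# Directory that `openstack overcloud image upload` is proposed to populate.
IMAGE_DIR = PurePosixPath("/var/lib/ironic/httpboot/overcloud-images")


def image_locations(filename, undercloud_host, ipxe_port, scheme="http"):
    """Return the file:// and HTTP(S) URLs that refer to one copied image."""
    file_url = "file://" + str(IMAGE_DIR / filename)
    http_url = f"{scheme}://{undercloud_host}:{ipxe_port}/{IMAGE_DIR.name}/{filename}"
    return {"file": file_url, "http": http_url}


# Placeholder host and port; substitute the real undercloud values.
locations = image_locations("overcloud-full.qcow2", "192.0.2.1", 8088)
print(locations["file"])
print(locations["http"])
```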

Finally, a checksum file will be generated from the copied images using:

cd /var/lib/ironic/httpboot/overcloud-images
md5sum overcloud-full.* > MD5SUMS

This is required since the checksums will no longer come from Glance.
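If the upload command produces the checksum file itself rather than shelling out to md5sum, the logic could look roughly like the sketch below. The function name write_md5sums is hypothetical, and the demonstration runs on throwaway files rather than real images.

```python
import hashlib
import tempfile
from pathlib import Path


def write_md5sums(image_dir, pattern="overcloud-full.*"):
    """Write an MD5SUMS file in the `md5sum` text format for matching images."""
    image_dir = Path(image_dir)
    lines = []
    for path in sorted(image_dir.glob(pattern)):
        digest = hashlib.md5()
        with path.open("rb") as image:
            # Read in chunks so that multi-gigabyte images are not loaded at once.
            for chunk in iter(lambda: image.read(1024 * 1024), b""):
                digest.update(chunk)
        # md5sum prints "<hex digest>  <file name>" (two spaces) per file.
        lines.append(f"{digest.hexdigest()}  {path.name}")
    (image_dir / "MD5SUMS").write_text("\n".join(lines) + "\n")
    return lines


# Demonstration on throwaway files instead of real overcloud images.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("overcloud-full.qcow2", "overcloud-full.kernel"):
        (Path(tmp) / name).write_bytes(name.encode())
    sums = write_md5sums(tmp)
    written = (Path(tmp) / "MD5SUMS").read_text()
```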

baremetal_deployment.yaml

This file will describe the bare metal provisioning parameters. It will provide the information that is currently implicitly deduced from the Heat templates.

Note

We could continue extracting it from the templates as well. However, a separate file will avoid a dependency on any Heat-specific logic, potentially benefiting standalone installer cases. It also provides the operators with more control over the provisioning process.

The format of this file resembles that of the roles_data file. It describes the deployment parameters for each role. The file contains a list of roles, each with a name. Other accepted parameters are:

count
number of machines to deploy for this role. Defaults to 1.
profile
profile (compute, control, etc.) to use for this role. Roughly corresponds to a flavor name for a Nova-based deployment. Defaults to no profile (any node can be picked).
hostname_format
a template for generating host names. This is similar to HostnameFormatDefault of a roles_data file and should use %index% to number the nodes. The default is %stackname%-<role name in lower case>-%index%.
instances
list of instances in the format accepted by the deploy_instances workflow. This allows tuning parameters per instance.

Examples

Deploy one compute and one control with any profile:

- name: Compute
- name: Controller

HA deployment with two computes and profile matching:

- name: Compute
  count: 2
  profile: compute
  hostname_format: compute-%index%.example.com
- name: Controller
  count: 3
  profile: control
  hostname_format: controller-%index%.example.com

Advanced deployment with custom hostnames and parameters set per instance:

- name: Compute
  profile: compute
  instances:
    - hostname: compute-05.us-west.example.com
      nics:
        - network: ctlplane
          fixed_ip: 10.0.2.5
      traits:
        - HW_CPU_X86_VMX
    - hostname: compute-06.us-west.example.com
      nics:
        - network: ctlplane
          fixed_ip: 10.0.2.5
      traits:
        - HW_CPU_X86_VMX
- name: Controller
  profile: control
  instances:
    - hostname: controller-1.us-west.example.com
      swap_size_mb: 4096
    - hostname: controller-2.us-west.example.com
      swap_size_mb: 4096
    - hostname: controller-3.us-west.example.com
      swap_size_mb: 4096

deploy_roles workflow

The workflow tripleo.baremetal_deploy.v1.deploy_roles will accept the information from baremetal_deployment.yaml, convert it into the low-level format accepted by the deploy_instances workflow and call the deploy_instances workflow with it.

It will accept the following mandatory input:

roles
parsed baremetal_deployment.yaml file.

It will accept one optional input:

plan
plan/stack name, used for templating. Defaults to overcloud.

It will return the same output as the deploy_instances workflow plus:

environment
the content of the generated baremetal_environment.yaml file.

Examples

The examples from baremetal_deployment.yaml will be converted to:

- hostname: overcloud-compute-0
- hostname: overcloud-controller-0
- hostname: compute-0.example.com
  profile: compute
- hostname: compute-1.example.com
  profile: compute
- hostname: controller-0.example.com
  profile: control
- hostname: controller-1.example.com
  profile: control
- hostname: controller-2.example.com
  profile: control
- hostname: compute-05.us-west.example.com
  nics:
    - network: ctlplane
      fixed_ip: 10.0.2.5
  profile: compute
  traits:
    - HW_CPU_X86_VMX
- hostname: compute-06.us-west.example.com
  nics:
    - network: ctlplane
      fixed_ip: 10.0.2.5
  profile: compute
  traits:
    - HW_CPU_X86_VMX
- hostname: controller-1.us-west.example.com
  profile: control
  swap_size_mb: 4096
- hostname: controller-2.us-west.example.com
  profile: control
  swap_size_mb: 4096
- hostname: controller-3.us-west.example.com
  profile: control
  swap_size_mb: 4096
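The conversion shown above can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual code: the function name expand_roles is hypothetical, and it assumes that an explicit instances list takes precedence over count and that the role-level profile is copied into every instance that does not set its own.

```python
import copy


def expand_roles(roles, plan="overcloud"):
    """Convert parsed baremetal_deployment.yaml roles into instance records."""
    instances = []
    for role in roles:
        name = role["name"]
        profile = role.get("profile")
        if "instances" in role:
            # Assumption: an explicit instance list takes precedence over count.
            records = copy.deepcopy(role["instances"])
        else:
            template = role.get(
                "hostname_format", "%stackname%-" + name.lower() + "-%index%"
            )
            records = [
                {
                    "hostname": template.replace("%stackname%", plan).replace(
                        "%index%", str(index)
                    )
                }
                for index in range(role.get("count", 1))
            ]
        for record in records:
            if profile is not None:
                # Per-instance values win over the role-level profile.
                record.setdefault("profile", profile)
            instances.append(record)
    return instances


simple = expand_roles([{"name": "Compute"}, {"name": "Controller"}])
ha = expand_roles([
    {"name": "Compute", "count": 2, "profile": "compute",
     "hostname_format": "compute-%index%.example.com"},
])
print(simple)
print(ha)
```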

deploy_instances workflow

The workflow tripleo.baremetal_deploy.v1.deploy_instances is a thin wrapper around the corresponding metalsmith calls.

The following inputs are mandatory:

instances
list of requested instances in the format described in Instance format.
ssh_keys
list of SSH public keys contents to put on the machines.

The following inputs are optional:

ssh_user_name
SSH user name to create, defaults to heat-admin for compatibility.
timeout
deployment timeout, defaults to 3600 seconds.
concurrency
deployment concurrency - how many nodes to deploy at the same time. Defaults to 20, which matches introspection.
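To illustrate how the optional inputs and the concurrency limit could interact, here is a rough sketch using a thread pool. The fake_deploy function is a hypothetical stand-in for the real provisioning call, and the sketch only passes timeout along without enforcing it.

```python
from concurrent.futures import ThreadPoolExecutor

# Defaults for the optional inputs, as listed above.
DEFAULTS = {"ssh_user_name": "heat-admin", "timeout": 3600, "concurrency": 20}


def fake_deploy(instance, ssh_keys, ssh_user_name, timeout):
    """Hypothetical stand-in for the real per-node provisioning call."""
    return {"hostname": instance["hostname"], "user": ssh_user_name}


def deploy_instances(instances, ssh_keys, deploy=fake_deploy, **options):
    """Deploy all instances, running at most `concurrency` deployments at once."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    settings = {**DEFAULTS, **options}
    with ThreadPoolExecutor(max_workers=settings["concurrency"]) as pool:
        futures = [
            pool.submit(deploy, instance, ssh_keys,
                        settings["ssh_user_name"], settings["timeout"])
            for instance in instances
        ]
        # result() re-raises any exception from a failed deployment.
        return [future.result() for future in futures]


results = deploy_instances(
    [{"hostname": "overcloud-compute-0"}, {"hostname": "overcloud-controller-0"}],
    ssh_keys=["ssh-rsa AAAA... user@example.com"],
    concurrency=1,
)
```

Results are returned in the order the instances were requested, regardless of which deployment finishes first.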
Instance format

The instance record format closely follows that of the metalsmith Ansible role, with only a few TripleO-specific additions and changed defaults.

Either or both of the following fields must be present:

hostname
requested hostname. It is used to identify the deployed instance later on. Defaults to name.
name
name of the node to deploy on. If hostname is not provided, name is also used as the hostname.

The following fields will be supported:

capabilities
requested node capabilities (except for profile and boot_option).
conductor_group
requested node’s conductor group. This is primarily for the Edge case, when nodes managed by the same Ironic can be physically separated.
nics
list of requested NICs, see metalsmith documentation for details. Defaults to {"network": "ctlplane"} which requests creation of a port on the ctlplane network.
profile
profile to use (e.g. compute, control, etc).
resource_class
requested node’s resource class, defaults to baremetal.
root_size_gb
size of the root partition in GiB, defaults to 49.
swap_size_mb
size of the swap partition in MiB, if needed.
traits
list of requested node traits.
whole_disk_image
boolean, whether to treat the image (overcloud-full.qcow2 or provided through the image field) as a whole disk image. Defaults to false.

The following fields will be supported, but the defaults should work for all but the most extreme cases:

image
file or HTTP URL of the root partition or whole disk image.
image_kernel
file or HTTP URL of the kernel image (partition images only).
image_ramdisk
file or HTTP URL of the ramdisk image (partition images only).
image_checksum
checksum, or URL of the checksum file, of the root partition or whole disk image.
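The defaults listed for an instance record can be summarised in a small normalisation step. The sketch below is only an illustration: normalize_instance is a hypothetical name, capabilities is assumed to be a mapping, and the default NIC is wrapped in a list because nics is described as a list.

```python
def normalize_instance(record):
    """Apply the documented defaults to one instance record."""
    if "hostname" not in record and "name" not in record:
        raise ValueError("either hostname or name must be present")
    result = dict(record)
    # hostname falls back to the node name when it is not given explicitly.
    result.setdefault("hostname", record.get("name"))
    # Assumption: the default NIC is a one-element list on the ctlplane network.
    result.setdefault("nics", [{"network": "ctlplane"}])
    result.setdefault("resource_class", "baremetal")
    result.setdefault("root_size_gb", 49)
    result.setdefault("whole_disk_image", False)
    # profile and boot_option are not allowed as free-form capabilities.
    forbidden = {"profile", "boot_option"} & set(result.get("capabilities", {}))
    if forbidden:
        raise ValueError(f"capabilities must not contain {sorted(forbidden)}")
    return result


by_name = normalize_instance({"name": "node-3"})
by_hostname = normalize_instance({"hostname": "compute-05.us-west.example.com",
                                  "swap_size_mb": 4096})
```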
Certificate authority configuration

If TLS is used in the undercloud, we need to make the nodes trust the Certificate Authority (CA) that signed the TLS certificates. If /etc/pki/ca-trust/source/anchors/cm-local-ca.pem exists, it will be included in the generated configdrive, so that the file is copied into the same location on target systems.

Outputs

The workflow will provide the following outputs:

ctlplane_ips
mapping of host names to their respective IP addresses on the ctlplane network.
instances

mapping of host names to full instance representations with fields:

node
Ironic node representation.
ip_addresses
mapping of network names to list of IP addresses on them.
hostname
instance hostname.
state
metalsmith instance state.
uuid
Ironic node uuid.

Also two subdicts of instances are provided:

existing_instances
only instances that already existed.
new_instances
only instances that were deployed.

Note

Instances are distinguished by their hostnames.
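As a rough sketch of how these outputs relate to each other, the function below assembles them from a mapping of hostnames to instances. The name build_outputs and the sample addresses are hypothetical, and picking the first ctlplane address is an assumption.

```python
def build_outputs(instances, existing_hostnames):
    """Assemble the workflow outputs from a hostname -> instance mapping."""
    ctlplane_ips = {}
    for hostname, instance in instances.items():
        addresses = instance["ip_addresses"].get("ctlplane", [])
        if addresses:
            # Assumption: the first ctlplane address is the one reported.
            ctlplane_ips[hostname] = addresses[0]
    # Instances are told apart by hostname only.
    existing = {h: i for h, i in instances.items() if h in existing_hostnames}
    new = {h: i for h, i in instances.items() if h not in existing_hostnames}
    return {
        "ctlplane_ips": ctlplane_ips,
        "instances": instances,
        "existing_instances": existing,
        "new_instances": new,
    }


sample = {
    "overcloud-controller-0": {
        "hostname": "overcloud-controller-0",
        "ip_addresses": {"ctlplane": ["192.0.2.10"]},
    },
    "overcloud-compute-0": {
        "hostname": "overcloud-compute-0",
        "ip_addresses": {"ctlplane": ["192.0.2.11"]},
    },
}
outputs = build_outputs(sample, existing_hostnames={"overcloud-controller-0"})
```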

baremetal_environment.yaml

This file will serve as an output of the bare metal provisioning process. It will be fed into the overcloud deployment command. Its goal is to provide information for the deployed-server workflow.

The file will contain the HostnameMap generated from role names and hostnames, e.g.

parameter_defaults:
  HostnameMap:
    overcloud-controller-0: controller-1.us-west.example.com
    overcloud-controller-1: controller-2.us-west.example.com
    overcloud-controller-2: controller-3.us-west.example.com
    overcloud-novacompute-0: compute-05.us-west.example.com
    overcloud-novacompute-1: compute-06.us-west.example.com
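The mapping above could be derived along these lines. This is a sketch under assumptions: build_environment is a hypothetical name, and the Heat-side hostname format for each role is passed in explicitly, with Compute using the novacompute form seen in the example.

```python
def build_environment(roles, plan="overcloud", heat_formats=None):
    """Build the Heat environment mapping default names to deployed hostnames.

    `roles` maps a role name to the ordered list of deployed hostnames.
    """
    # Assumption: Heat-side hostname formats per role; Compute follows the example.
    heat_formats = heat_formats or {"Compute": "%stackname%-novacompute-%index%"}
    hostname_map = {}
    for role, hostnames in roles.items():
        template = heat_formats.get(role, "%stackname%-" + role.lower() + "-%index%")
        for index, hostname in enumerate(hostnames):
            heat_name = template.replace("%stackname%", plan).replace(
                "%index%", str(index)
            )
            hostname_map[heat_name] = hostname
    return {"parameter_defaults": {"HostnameMap": hostname_map}}


environment = build_environment({
    "Controller": [
        "controller-1.us-west.example.com",
        "controller-2.us-west.example.com",
        "controller-3.us-west.example.com",
    ],
    "Compute": [
        "compute-05.us-west.example.com",
        "compute-06.us-west.example.com",
    ],
})
```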

undeploy_instances workflow

The workflow tripleo.baremetal_deploy.v1.undeploy_instances will take a list of hostnames and undeploy the corresponding nodes.

Novajoin replacement

The novajoin service is currently used to enroll nodes into IPA and provide them with TLS certificates. Unfortunately, it has hard dependencies on Nova, Glance and Metadata API, even though the information could be provided via other means. Actually, the metadata API cannot always be provided with Ironic (notably, it may not be available when using isolated provisioning networks).

A potential solution is to provide the required information via a configdrive, and make the nodes register themselves instead.

Alternatives

  • Do nothing, continue to rely on Nova and work around cases when it does not match our goals well. See Problem Description for why this is not desirable.
  • Avoid metalsmith, use OpenStack Ansible modules or Bifrost. They currently lack features (such as VIF attach/detach API) and do not have any notion of scheduling. Implementing sophisticated enough scheduling in pure Ansible seems a serious undertaking.
  • Avoid Mistral, drive metalsmith via Ansible. This is a potential future direction of this work, but currently it seems much simpler to call the metalsmith Python API from Mistral actions. We would need Mistral (or Ansible Tower) to drive Ansible anyway, because we need some API level.
  • Remove Neutron in the same change. Would reduce footprint even further, but some operators may find the presence of an IPAM desirable. Also setting up static DHCP would increase the scope of the implementation substantially and complicate the upgrade even further.
  • Keep Glance but remove Nova. Does not make much sense, since Glance is only a requirement because of Nova. Ironic can deploy from HTTP or local file locations just as well.

Security Impact

  • Overcloud images will be exposed to unauthenticated users via HTTP. We need to communicate clearly that secrets must not be built into images in plain text and should be delivered via configdrive instead. If this proves to be a problem, we can limit ourselves to providing images via local files.

    Note

    This issue exists today, as images are transferred via an insecure medium in all supported deploy methods.

  • Removing two services from the undercloud will reduce potential attack surface and simplify audit.

Upgrade Impact

The initial version of this feature will be enabled for new deployments only.

The upgrade procedure will happen within a release, not between releases. It will go roughly as follows:

  1. Upgrade to a release where undercloud without Nova and Glance is supported.
  2. Make a full backup of the undercloud.
  3. Run openstack overcloud image upload to ensure that the overcloud-full images are available via HTTP(s).

The next steps will probably be automated via an Ansible playbook or a Mistral workflow:

  1. Mark deployed nodes protected in Ironic to prevent undeploying them by mistake.

  2. Run a Heat stack update replacing references to Nova servers with references to deployed servers. This will require telling Heat not to remove the instances.

  3. Mark nodes as managed by metalsmith (optional, but simplifies troubleshooting).

  4. Update node’s instance_info to refer to images over HTTP(s).

    Note

    This may require temporarily moving nodes to maintenance.

  5. Run an undercloud update removing Nova and Glance.
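Step 4 could, for example, be expressed as a JSON patch against each node. The sketch below only builds the patch document; it assumes the instance_info keys image_source, kernel, ramdisk and image_checksum, and the function name image_patch, the base URL and the checksum value are placeholders.

```python
def image_patch(http_base, checksum, whole_disk_image=False):
    """Build a JSON patch pointing a node's instance_info at HTTP(S) images."""
    base = http_base.rstrip("/")
    values = {
        "image_source": f"{base}/overcloud-full.qcow2",
        "image_checksum": checksum,
    }
    if not whole_disk_image:
        # Partition images also need a kernel and a ramdisk.
        values["kernel"] = f"{base}/overcloud-full.kernel"
        values["ramdisk"] = f"{base}/overcloud-full.ramdisk"
    # "add" on an existing object member replaces its value (RFC 6902).
    return [
        {"op": "add", "path": f"/instance_info/{key}", "value": value}
        for key, value in sorted(values.items())
    ]


# Placeholder base URL and checksum, for illustration only.
patch = image_patch("http://192.0.2.1:8088/overcloud-images/", "0123abcd")
for operation in patch:
    print(operation)
```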

Other End User Impact

  • Nova CLI will no longer be available for troubleshooting. It should not be a big problem in reality, as most of the problems it is used for are caused by using Nova itself.

    metalsmith provides a CLI tool for troubleshooting and advanced users. We will document using it for tasks like determining IP addresses of nodes.

  • It will no longer be possible to update images via Glance API, e.g. from GUI. It should not be a big issue, as most users use pre-built images. Advanced operators are likely to resort to CLI anyway.

  • No valid host found error will no longer be seen by operators. metalsmith provides more detailed errors, and is less likely to fail because of its scheduling approach working better with the undercloud case.

Performance Impact

  • A substantial speed-up is expected for deployments because of removing several layers of indirection. The new deployment process will also fail faster if the scheduling request cannot be satisfied.
  • Providing images via local files will remove the step of downloading them from Glance, providing even more speed-up for larger images.
  • An operator will be able to tune concurrency of deployment via CLI arguments or GUI parameters, rather than nova.conf.

Other Deployer Impact

None

Developer Impact

New features for bare metal provisioning will have to be developed with this work in mind. It may mean implementing something in metalsmith code instead of relying on Nova servers or flavors, or Glance images.

Implementation

Assignee(s)

Primary assignee:
Dmitry Tantsur, IRC: dtantsur, LP: divius

Work Items

Phase 1 (Stein, technical preview):

  1. Update openstack overcloud image upload to copy images into the HTTP location and generate checksums.
  2. Implement deploy_instances workflow and undeploy_instances workflow.
  3. Update validations to not fail if Nova and/or Glance are not present.
  4. Implement deploy_roles workflow.
  5. Provide CLI commands for the created workflows.
  6. Provide an experimental OVB CI job exercising the new approach.

Phase 2 (T+, fully supported):

  1. Update openstack overcloud deploy to support the new workflow.
  2. Support scaling down.
  3. Provide a Novajoin replacement.
  4. Provide an upgrade workflow.
  5. Consider deprecating provisioning with Nova and Glance.

Dependencies

  • metalsmith library will be used for easier access to Ironic+Neutron API.

Testing

Since testing this feature requires bare metal provisioning, a new OVB job will be created for it. Initially it will be experimental, and will move to the check queue before the feature is considered fully supported.

Documentation Impact

Documentation will have to be reworked to explain the new deployment approach. Troubleshooting documentation will have to be updated.

References