Network Data v2 - node ports and node network config

The goal of “Network Data v2” is to move management of network resources out of the heat stack. The schema spec [1] covered the network_data.yaml format and the management of networks, segments and subnets. This spec follows up with node ports for composable networks, and with moving the node network configuration action into the baremetal/network configuration workflow.

Problem description

Applying a network change on day 2 currently requires a full stack update, since network resources such as ports are managed by heat. Creating ports for large scale deployments has also been problematic; neutron on the single-node undercloud gets overwhelmed, and it is difficult to throttle port creation in heat. As an early indication of the performance of port creation with the proposed ansible module:

Performance stats: 100 nodes x 3 networks = 300 ports

Undercloud               Concurrency  Create (real)  Delete (real)  Re-run (real)
-----------------------  -----------  -------------  -------------  -------------
4x CPU 1.8 GHz (8 GB)             10      5m58.006s      4m12.812s      0m19.386s
8x CPU 2.6 GHz (12 GB)            20      1m48.518s      0m47.475s       0m4.389s
8x CPU 2.6 GHz (12 GB)            10      1m51.998s      0m48.956s       0m4.453s
8x CPU 2.6 GHz (12 GB)             4      1m25.022s      1m19.543s       0m4.977s

Proposed Change

Extend the baremetal provisioning workflow that runs before overcloud deployment to also create ports for composable networks. The baremetal provisioning step already creates ports for the provisioning network. Moving the management of ports for composable networks to this workflow will consolidate all port management into one workflow.

Also make the baremetal provisioning workflow execute the tripleo-ansible tripleo_network_config role, to configure node networking after node provisioning.

The deploy workflow would be:

  1. Operator defines composable networks in network data YAML file.

  2. Operator provisions composable networks by running the openstack overcloud network provision command, providing the network data YAML file as input.

  3. Operator defines roles and nodes in the baremetal deployment YAML file. This YAML also defines the networks for each role.

  4. Operator deploys baremetal nodes by running the openstack overcloud node provision command. This step creates ports in neutron, and also configures networking on the nodes, including composable networks, using an ansible role to apply the network config with os-net-config [2].

  5. Operator deploys the heat stack by running the openstack overcloud deploy command, including the environment files produced by the commands executed in the previous steps.

  6. Operator executes config-download to install and configure openstack on the overcloud nodes. (Optional - only if the overcloud deploy command was executed with ``--stack-only``.)

Implementation

Assignee(s)

Primary assignee:

Harald Jensås <hjensas@redhat.com>

Approver(s)

Primary approver:

TODO

Implementation Details

The baremetal YAML definition will be extended, adding the networks and network_config keys in the role defaults as well as per-instance, to support fixed_ip addressing, manually pre-created port resources and per-node network configuration templates.

The networks key will replace the current nic key; until the nic key is removed, either can be used, but not both at the same time. Entries in networks will support a boolean key vif which indicates whether the port should be attached in ironic or not. If no network with vif: true is specified, an implicit one for the control plane will be appended:

- network: ctlplane
  vif: true

For networks with vif: true, ports will be created by metalsmith. For networks with vif: false (or vif not specified) the workflow will create neutron ports based on the YAML definition.

The neutron ports will initially be tagged with the stack name and the instance hostname; these tags are used for idempotency. The ansible module managing ports will get all ports with the relevant tags, then add/remove ports based on the expanded roles defined in the baremetal YAML definition. (The hostname and stack_name tags are also added to ports created with heat in this tripleo-heat-templates change [4], to enable adoption of neutron ports created by heat in the upgrade scenario.)

Additionally, the ports will be tagged with the ironic node UUID when it is available. The full set of tags is shown in the example below.

{
  "port": {
    "name": "controller-1-External",
    "tags": ["tripleo_ironic_uuid=<IRONIC_NODE_UUID>",
             "tripleo_hostname=controller-1",
             "tripleo_stack_name=overcloud"]
  }
}

Note

In deployments where baremetal nodes have multiple physical NICs, multiple networks can have vif: true, so that VIF attach in ironic and proper neutron port binding happen. In a scenario where neutron on the undercloud is managing the switch, this would enable automation of the top-of-rack switch configuration.
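
For example, a role where both the ctlplane and tenant NICs should be attached in ironic could declare (a sketch following the schema described above):

networks:
  - network: ctlplane
    vif: true
  - network: tenant
    subnet: tenant_subnet
    vif: true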

Mapping of the port data for overcloud nodes will go into a NodePortMap parameter in tripleo-heat-templates. The map will contain submaps for each node, keyed by the node name. Initially the NodePortMap will be consumed by alternative fake-port OS::TripleO::{{role.name}}::Ports::{{network.name}}Port resource templates. In the final implementation the generated environment file can be extended, and the entire OS::TripleO::{{role.name}} resource can be replaced with a template that references parameters in the generated environment directly, i.e. a re-implemented puppet/role.role.j2.yaml without the server and port resources. The NodePortMap will be added to the overcloud-baremetal-deployed.yaml environment file created by the workflow that creates the overcloud node port resources.
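
A minimal sketch of what such a fake-port template could look like; the template version, the parameter wiring, and the hard-coded node/network names are illustrative only, not the final implementation:

heat_template_version: rocky

parameters:
  NodePortMap:
    type: json
    default: {}

outputs:
  ip_address:
    description: IP address from the pre-created port
    value: {get_param: [NodePortMap, controller-0, internal_api, ip_address]}
  ip_subnet:
    description: IP address with subnet prefix
    value: {get_param: [NodePortMap, controller-0, internal_api, ip_subnet]}
  ip_address_uri:
    description: IP address suitable for use in URIs (IPv6 bracketed)
    value: {get_param: [NodePortMap, controller-0, internal_api, ip_address_uri]}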

Network ports for vif: false networks will be managed by a new ansible module, tripleo_overcloud_network_ports. The input for this module will be a list of instance definitions, as generated by the tripleo_baremetal_expand_roles ansible module. The tripleo_baremetal_expand_roles module will be extended to add network/subnet information from the baremetal deployment YAML definition.
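
A hedged sketch of invoking the module from a playbook; only the module name is given by this spec, the option names (stack_name, concurrency, state, instances) are assumptions for illustration:

- name: Manage composable network ports for the overcloud nodes
  tripleo_overcloud_network_ports:
    stack_name: overcloud         # used for the tripleo_stack_name tag
    concurrency: 20               # throttle concurrent neutron requests (assumed option)
    state: present
    instances: "{{ instances }}"  # expanded by tripleo_baremetal_expand_roles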

The baremetal provisioning workflow will be extended to write an ansible inventory; we should try to extend tripleo-ansible-inventory so that the workflow can re-use existing code to create the inventory. The inventory will be used to configure networking on the provisioned nodes using the tripleo-ansible tripleo_network_config role. (See Example: Ansible inventory below.)

Already Deployed Servers

The baremetal YAML definition will be used to describe the baremetal deployment for pre-deployed servers. In this scenario there is no ironic node to update, no ironic UUID to add to a port’s tags, and no ironic node to attach VIFs to.

All ports, including the ctlplane port, will be managed by the tripleo_overcloud_network_ports ansible module. The baremetal YAML definition for a deployment with pre-deployed servers will have to include an instance entry for each pre-deployed server, with the managed key set to false.

It should be possible for an already deployed server to have a management address that is completely separate from the tripleo-managed addresses. The baremetal YAML definition can be extended to carry a management_ip field for this purpose, as sketched below. In case no management address is available, the ctlplane network entry for pre-deployed instances must have fixed_ip configured.
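
For example (the management_ip value, and its placement in the instance entry, are illustrative):

instances:
  - hostname: controller-0
    managed: false
    management_ip: 10.0.10.10   # out-of-band address used to reach the node
    networks:
      - network: ctlplane
        fixed_ip: 192.168.24.10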

The deployment workflow will short-circuit the baremetal provisioning of managed: false instances. The baremetal YAML definition can define a mix of already deployed server instances and instances that should be provisioned via metalsmith. See Example: Baremetal YAML for Already Deployed Servers.

YAML Examples

Example: Baremetal YAML definition with role defaults
- name: Controller
  count: 1
  hostname_format: controller-%index%
  defaults:
    profile: control
    network_config:
      template: templates/multiple_nics/multiple_nics.j2
      physical_bridge_name: br-ex
      public_interface_name: nic1
      network_deployment_actions: ['CREATE']
      net_config_data_lookup: {}
    networks:
      - network: ctlplane
        vif: true
      - network: external
        subnet: external_subnet
      - network: internal_api
        subnet: internal_api_subnet
      - network: storage
        subnet: storage_subnet
      - network: storage_mgmt
        subnet: storage_mgmt_subnet
      - network: tenant
        subnet: tenant_subnet
- name: Compute
  count: 1
  hostname_format: compute-%index%
  defaults:
    profile: compute
    network_config:
      template: templates/multiple_nics/multiple_nics.j2
      physical_bridge_name: br-ex
      public_interface_name: nic1
      network_deployment_actions: ['CREATE']
      net_config_data_lookup: {}
    networks:
      - network: ctlplane
        vif: true
      - network: internal_api
        subnet: internal_api_subnet
      - network: tenant
        subnet: tenant_subnet
      - network: storage
        subnet: storage_subnet
Example: Baremetal YAML definition with per-instance overrides
- name: Controller
  count: 1
  hostname_format: controller-%index%
  defaults:
    profile: control
    network_config:
      template: templates/multiple_nics/multiple_nics.j2
      physical_bridge_name: br-ex
      public_interface_name: nic1
      network_deployment_actions: ['CREATE']
      net_config_data_lookup: {}
      bond_interface_ovs_options:
    networks:
      - network: ctlplane
        vif: true
      - network: external
        subnet: external_subnet
      - network: internal_api
        subnet: internal_api_subnet
      - network: storage
        subnet: storage_subnet
      - network: storage_mgmt
        subnet: storage_mgmt_subnet
      - network: tenant
        subnet: tenant_subnet
  instances:
    - hostname: controller-0
      name: node00
      networks:
        - network: ctlplane
          vif: true
        - network: internal_api
          fixed_ip: 172.21.11.100
    - hostname: controller-1
      name: node01
      networks:
        - network: external
          port: controller-1-external
    - hostname: controller-2
      name: node02
- name: ComputeLeaf1
  count: 1
  hostname_format: compute-leaf1-%index%
  defaults:
    profile: compute-leaf1
    networks:
      - network: internal_api
        subnet: internal_api_subnet
      - network: tenant
        subnet: tenant_subnet
      - network: storage
        subnet: storage_subnet
  instances:
    - hostname: compute-leaf1-0
      name: node03
      network_config:
        template: templates/multiple_nics/multiple_nics_dpdk.j2
        physical_bridge_name: br-ex
        public_interface_name: nic1
        network_deployment_actions: ['CREATE']
        net_config_data_lookup: {}
        num_dpdk_interface_rx_queues: 1
      networks:
        - network: ctlplane
          vif: true
        - network: internal_api
          fixed_ip: 172.21.12.105
        - network: tenant
          port: compute-leaf1-0-tenant
        - network: storage
          subnet: storage_subnet
Example: Baremetal YAML for Already Deployed Servers
- name: Controller
  count: 3
  hostname_format: controller-%index%
  defaults:
    profile: control
    network_config:
      template: templates/multiple_nics/multiple_nics.j2
    networks:
      - network: ctlplane
      - network: external
        subnet: external_subnet
      - network: internal_api
        subnet: internal_api_subnet
      - network: storage
        subnet: storage_subnet
      - network: storage_mgmt
        subnet: storage_mgmt_subnet
      - network: tenant
        subnet: tenant_subnet
    managed: false
  instances:
    - hostname: controller-0
      networks:
        - network: ctlplane
          fixed_ip: 192.168.24.10
    - hostname: controller-1
      networks:
        - network: ctlplane
          fixed_ip: 192.168.24.11
    - hostname: controller-2
      networks:
        - network: ctlplane
          fixed_ip: 192.168.24.12
- name: Compute
  count: 2
  hostname_format: compute-%index%
  defaults:
    profile: compute
    network_config:
      template: templates/multiple_nics/multiple_nics.j2
    networks:
      - network: ctlplane
      - network: internal_api
        subnet: internal_api_subnet
      - network: tenant
        subnet: tenant_subnet
      - network: storage
        subnet: storage_subnet
  instances:
    - hostname: compute-0
      managed: false
      networks:
        - network: ctlplane
          fixed_ip: 192.168.24.100
    - hostname: compute-1
      managed: false
      networks:
        - network: ctlplane
          fixed_ip: 192.168.24.101
Example: NodePortMap (IPv6 example values shown in parentheses)
NodePortMap:
  controller-0:
    ctlplane:
      ip_address: 192.168.24.9 (2001:DB8:24::9)
      ip_subnet: 192.168.24.9/24 (2001:DB8:24::9/64)
      ip_address_uri: 192.168.24.9 ([2001:DB8:24::9])
    internal_api:
      ip_address: 172.18.0.9 (2001:DB8:18::9)
      ip_subnet: 172.18.0.9/24 (2001:DB8:18::9/64)
      ip_address_uri: 172.18.0.9 ([2001:DB8:18::9])
    tenant:
      ip_address: 172.19.0.9 (2001:DB8:19::9)
      ip_subnet: 172.19.0.9/24 (2001:DB8:19::9/64)
      ip_address_uri: 172.19.0.9 ([2001:DB8:19::9])
  compute-0:
    ctlplane:
      ip_address: 192.168.24.15 (2001:DB8:24::15)
      ip_subnet: 192.168.24.15/24 (2001:DB8:24::15/64)
      ip_address_uri: 192.168.24.15 ([2001:DB8:24::15])
    internal_api:
      ip_address: 172.18.0.15 (2001:DB8:18::15)
      ip_subnet: 172.18.0.15/24 (2001:DB8:18::15/64)
      ip_address_uri: 172.18.0.15 ([2001:DB8:18::15])
    tenant:
      ip_address: 172.19.0.15 (2001:DB8:19::15)
      ip_subnet: 172.19.0.15/24 (2001:DB8:19::15/64)
      ip_address_uri: 172.19.0.15 ([2001:DB8:19::15])
Example: Ansible inventory
Controller:
  vars:
    role_networks:
      - External
      - InternalApi
      - Tenant
    role_networks_lower:
      External: external
      InternalApi: internal_api
      Tenant: tenant
    networks_all:
      - External
      - InternalApi
      - Tenant
    neutron_physical_bridge_name: br-ex
    neutron_public_interface_name: nic1
    tripleo_network_config_os_net_config_mappings: {}
    network_deployment_actions: ['CREATE', 'UPDATE']
    ctlplane_subnet_cidr: 24
    ctlplane_mtu: 1500
    ctlplane_gateway_ip: 192.168.24.254
    ctlplane_dns_nameservers: []
    dns_search_domains: []
    ctlplane_host_routes: []
    internal_api_cidr: 24
    internal_api_gateway_ip: 172.18.0.254
    internal_api_host_routes: []
    internal_api_mtu: 1500
    internal_api_vlan_id: 20
    tenant_cidr: 24
    tenant_gateway_ip: 172.19.0.254
    tenant_host_routes: []
    tenant_mtu: 1500
  hosts:
    controller-0:
      ansible_host: 192.168.24.9
      ctlplane_ip: 192.168.24.9
      internal_api_ip: 172.18.0.9
      tenant_ip: 172.19.0.9
Compute:
  vars:
    role_networks:
      - InternalApi
      - Tenant
    role_networks_lower:
      InternalApi: internal_api
      Tenant: tenant
    networks_all:
      - External
      - InternalApi
      - Tenant
    neutron_physical_bridge_name: br-ex
    neutron_public_interface_name: nic1
    tripleo_network_config_os_net_config_mappings: {}
    network_deployment_actions: ['CREATE', 'UPDATE']
    ctlplane_subnet_cidr: 24
    ctlplane_mtu: 1500
    ctlplane_gateway_ip: 192.168.25.254
    ctlplane_dns_nameservers: []
    dns_search_domains: []
    ctlplane_host_routes: []
    internal_api_cidr: 24
    internal_api_gateway_ip: 172.18.1.254
    internal_api_host_routes: []
    internal_api_mtu: 1500
    internal_api_vlan_id: 20
    tenant_cidr: 24
    tenant_gateway_ip: 172.19.1.254
    tenant_host_routes: []
    tenant_mtu: 1500
  hosts:
    compute-0:
      ansible_host: 192.168.25.15
      ctlplane_ip: 192.168.25.15
      internal_api_ip: 172.18.1.15
      tenant_ip: 172.19.1.15

TODO

  • Constraint validation. For example, BondInterfaceOvsOptions uses allowed_pattern: ^((?!balance.tcp).)*$ to ensure the balance-tcp bond mode is not used, as it is known to cause packet loss. See the sketch below.
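
    A sketch of how such a constraint could be enforced in ansible; the task and the variable wiring are illustrative, not part of this spec:

    - name: Validate that balance-tcp bond mode is not used
      fail:
        msg: balance-tcp bond mode is known to cause packet loss
      when: network_config.bond_interface_ovs_options | default('') is search('balance.tcp')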

Work Items

  1. Write ansible inventory after baremetal provisioning

    Create an ansible inventory, similar to the inventory created by config-download. The ansible inventory is required to apply network configuration to the deployed nodes.

    We should try to extend tripleo-ansible-inventory so that the baremetal provisioning workflow can re-use existing code to create the inventory.

    It is likely that it makes sense for the workflow to also run the tripleo-ansible role tripleo_create_admin to create the tripleo-admin ansible user.

  2. Extend the baremetal provisioning workflow to create neutron ports, and update the ironic node extra field with the tripleo_networks map (see the sketch after this list).

  3. The baremetal provisioning workflow needs a pre-deployed-server option that causes it to not deploy baremetal nodes, only create network ports. When this option is used, the baremetal deployment YAML file will also describe the already provisioned nodes.

  4. Apply and validate network configuration using the tripleo-ansible tripleo_network_config role. This step will be integrated into the provisioning command.

  5. Disable and remove management of composable network ports in tripleo-heat-templates.

  6. Change the Undercloud and Standalone deploy to apply network configuration prior to creating the ephemeral heat stack, using the tripleo_network_config ansible role.
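
The structure of the tripleo_networks map stored in the ironic node's extra field (work item 2) is not pinned down by this spec; one plausible shape, mapping each composable network to its neutron port UUID, could be:

extra:
  tripleo_networks:
    internal_api: <NEUTRON_PORT_UUID>
    tenant: <NEUTRON_PORT_UUID>
    storage: <NEUTRON_PORT_UUID>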

Testing

Multinode and OVB CI jobs with network-isolation will be updated to test the new workflow.

Upgrade Impact

During an upgrade that switches to network ports managed outside of the heat stack, the PortDeletionPolicy parameter must be set to retain during the update/upgrade prepare step, so that the existing neutron ports (which will be adopted by the pre-heat port management workflow) are not deleted when running the update/upgrade converge step. For example, in an environment file used during prepare:
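
parameter_defaults:
  PortDeletionPolicy: retain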

Moving node network configuration out of tripleo-heat-templates will require manual (or scripted) migration of settings controlled by heat template parameters to the input file used for baremetal/network provisioning. At least the following parameters are affected (a mapping sketch follows the list):

  • NeutronPhysicalBridge

  • NeutronPublicInterface

  • NetConfigDataLookup

  • NetworkDeploymentActions
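
As a sketch, based on the examples in this spec, these parameters map to network_config keys in the baremetal YAML definition roughly as follows (values illustrative):

network_config:
  template: templates/multiple_nics/multiple_nics.j2  # {{role.name}}NetworkConfigTemplate
  physical_bridge_name: br-ex                         # NeutronPhysicalBridge
  public_interface_name: nic1                         # NeutronPublicInterface
  network_deployment_actions: ['CREATE']              # NetworkDeploymentActions
  net_config_data_lookup: {}                          # NetConfigDataLookup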

Parameters that will be deprecated:

  • NetworkConfigWithAnsible

  • {{role.name}}NetworkConfigTemplate

  • NetworkDeploymentActions

  • {{role.name}}NetworkDeploymentActions

  • BondInterfaceOvsOptions

  • NumDpdkInterfaceRxQueues

  • {{role.name}}LocalMtu

  • NetConfigDataLookup

  • DnsServers

  • DnsSearchDomains

  • ControlPlaneSubnetCidr

  • HypervisorNeutronPublicInterface

  • HypervisorNeutronPhysicalBridge

The environment files used to select one of the pre-defined nic config templates will no longer work. The template to use must be set in the YAML defining the baremetal/network deployment. This affects the following environment files:

  • environments/net-2-linux-bonds-with-vlans.j2.yaml

  • environments/net-bond-with-vlans.j2.yaml

  • environments/net-bond-with-vlans-no-external.j2.yaml

  • environments/net-dpdkbond-with-vlans.j2.yaml

  • environments/net-multiple-nics.j2.yaml

  • environments/net-multiple-nics-vlans.j2.yaml

  • environments/net-noop.j2.yaml

  • environments/net-single-nic-linux-bridge-with-vlans.j2.yaml

  • environments/net-single-nic-with-vlans.j2.yaml

  • environments/net-single-nic-with-vlans-no-external.j2.yaml

Documentation Impact

The documentation effort is heavy and will need to be incrementally updated. As a minimum, a separate page explaining the new process must be created.

The TripleO docs will need updates in many sections.

Alternatives

  1. Not changing how ports are created

    In this case we keep creating the ports with heat; this is the do-nothing alternative.

  2. Create a completely separate workflow for composable network ports

    A separate workflow that can run before/after node provisioning. It can read the same YAML format as baremetal provisioning, or it can have its own YAML format.

    The problem with this approach is that we lose the possibility to store relations between neutron port and baremetal node in a database. That is, we’d need our own database (a file) to maintain the relationships.

    Note

    We need to implement this workflow anyway for a pre-deployed server scenario, but instead of a completely separate workflow the baremetal deploy workflow can take an option to not provision nodes.

  3. Create ports in ironic and bind neutron ports

    Instead of creating ports unknown to ironic, create ports for the ironic nodes in the baremetal service.

    The issue is that ironic does not have a concept of virtual ports, so we would have to either add this support in ironic, switch TripleO to use neutron trunk ports, or create fake ironic ports that don’t actually reflect NICs on the baremetal node. (This abandoned ironic spec [3] discusses one approach for virtual port support, but it was abandoned in favor of neutron trunk ports.)

    At each PTG there is a recurring suggestion to replace neutron with a more lightweight IPAM solution. However, the effort to integrate it properly with ironic and neutron for composable networks probably isn’t time well spent.

References