Named veths
###########
:date: 2015-08-31 22:00
:tags: lxc, veth, troubleshooting

This spec aims to make troubleshooting openstack-ansible issues more
efficient by using container names to build names for veth interfaces.

Link to blueprint:

* https://blueprints.launchpad.net/openstack-ansible/+spec/named-veths

Problem description
===================

All veth interfaces on the host are named using randomly generated names, such
as `vethK070G4`. This makes troubleshooting container networking issues more
challenging, since it's difficult to trace a veth name to a particular network
interface within a container.

Proposed change
===============

Names of veth interfaces should be unique and easily correlated to their
containers. However, names of network interfaces have restrictions which must
be handled carefully:

* 15 characters maximum (the kernel's 16-byte limit includes the terminating
  null byte)
* Certain characters, like dashes (-), aren't allowed

The random characters on the end of the container hostname could be used along
with the interface name to form a veth name. As an example, a container called
`aio1_utility_container-a9ef9551` could have two named veth interfaces:

* a9ef9551_eth0
* a9ef9551_eth1

Alternatives
------------

Leave veth names as randomly generated by LXC.

Playbook/Role impact
--------------------

The veth names will only be adjusted on the host, within the LXC configuration
files themselves. Containers won't be affected. The playbooks don't use the
veth names on the host for any actions.

If veths are not cleaned up properly when a container stops (sometimes called
'dangling veths'), there's a chance that the container won't start until the
dangling veth is manually removed with `ip link del` followed by the veth name.

Upgrade impact
--------------

Upgrades should be unaffected. This change only adjusts the LXC container
configuration files and doesn't change the running configuration of any of the
containers.

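
As a rough illustration of the kind of configuration line this change would
write, the naming scheme from the proposed change can be sketched in Python.
This is not the playbook implementation, only a sketch under stated
assumptions: the function names `veth_pair_name` and `veth_pair_config_line`
and the constant `MAX_IFNAME_LEN` are hypothetical, and the random suffix is
assumed to be whatever follows the last dash in the container name.

```python
# Hypothetical sketch; names below are illustrative, not part of the playbooks.
MAX_IFNAME_LEN = 15  # Linux IFNAMSIZ is 16 bytes, including the trailing NUL


def veth_pair_name(container_name: str, interface: str) -> str:
    """Build a host-side veth name such as 'a9ef9551_eth0'."""
    if "-" not in container_name:
        raise ValueError(f"no random suffix found in {container_name!r}")
    # Assumption: the random suffix is the text after the last dash.
    suffix = container_name.rsplit("-", 1)[1]
    name = f"{suffix}_{interface}"
    if len(name) > MAX_IFNAME_LEN:
        raise ValueError(f"{name!r} is longer than {MAX_IFNAME_LEN} characters")
    if "-" in name:
        # Follows the restriction on dashes listed in this spec.
        raise ValueError(f"{name!r} contains a dash")
    return name


def veth_pair_config_line(container_name: str, interface: str) -> str:
    """Render the LXC configuration line that names the host-side veth."""
    return f"lxc.network.veth.pair = {veth_pair_name(container_name, interface)}"


if __name__ == "__main__":
    for iface in ("eth0", "eth1"):
        print(veth_pair_config_line("aio1_utility_container-a9ef9551", iface))
    # lxc.network.veth.pair = a9ef9551_eth0
    # lxc.network.veth.pair = a9ef9551_eth1
```

Raising an error for names that are too long, rather than truncating them,
keeps the generated names unambiguous. A real implementation might choose
differently.
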
If a container is running and its LXC configuration file is adjusted to use
named veths, it will only utilize those adjustments when it is restarted. If an
upgrade happens to restart only a subset of the containers on the host, then
only those containers will use named veths after they restart.

Security impact
---------------

This change shouldn't affect security.

Performance impact
------------------

This change shouldn't affect performance.

End user impact
---------------

This change shouldn't affect end users.

Deployer impact
---------------

Users who deploy OpenStack should be able to troubleshoot network issues more
efficiently. For example, if a user was having trouble reaching the nova API
container, they could quickly see which veths were associated with the
container. This would allow users to diagnose network problems with various
tools, like ethtool and tcpdump, without digging into interface indexes or
writing scripts.

If a deployer wants to begin using named veth pairs immediately, then all
containers must be restarted. This is because the LXC configuration files are
adjusted on disk but running containers aren't adjusted.

Developer impact
----------------

Much like the deployer impact above, this change could help developers diagnose
issues within different containers more efficiently.

Dependencies
------------

This spec has no known dependencies.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  https://launchpad.net/~rackerhacker ``mhayden``

Work items
----------

* Update ansible playbooks to specify `lxc.network.veth.pair` in the main LXC
  configuration files as well as the interface .ini files

Testing
=======

* Do greenfield deployment and verify named veths
* Do an upgrade between releases and verify named veths
* Verify that both tests have no impact on running containers

Documentation impact
====================

Documentation would be beneficial, especially around how this helps with
troubleshooting issues.

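
Such documentation could include a short example of tracing a host veth name
back to its container. The following is a minimal sketch in Python, assuming
veth names take the form of the container's random suffix, an underscore, and
the interface name. The function name `container_for_veth` and the second
container name in the usage example are hypothetical.

```python
from typing import Iterable, Optional


def container_for_veth(veth_name: str, container_names: Iterable[str]) -> Optional[str]:
    """Return the container whose random suffix prefixes the veth name, if any."""
    suffix, separator, _interface = veth_name.partition("_")
    if not separator:
        # Randomly generated names such as 'vethK070G4' carry no container hint.
        return None
    for container in container_names:
        if container.endswith("-" + suffix):
            return container
    return None


if __name__ == "__main__":
    containers = [
        "aio1_utility_container-a9ef9551",
        "aio1_example_container-1b2c3d4e",  # hypothetical container name
    ]
    print(container_for_veth("a9ef9551_eth0", containers))
    # aio1_utility_container-a9ef9551
    print(container_for_veth("vethK070G4", containers))
    # None
```

With a mapping like this, a deployer could take an interface name seen on the
host and point tools such as tcpdump at the right container's traffic without
looking up interface indexes.
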
References
==========

N/A