Retrieve NUMA node information¶
https://bugs.launchpad.net/ironic-python-agent/+bug/1635253
Today, the introspected data from the nodes does not provide the deployer with information about the NUMA topology. The deployer needs to know how NUMA nodes are associated with the list of cores and NICs. These details are required for configuring nodes with DPDK-aware NICs.
Problem description¶
To configure the nodes for better performance, CPUs must be selected based on the NUMA topology. On nodes with DPDK-aware NICs, the CPUs for the poll mode driver (PMD) need to be selected from the NUMA node associated with the DPDK NICs. If hyperthreading is enabled, selecting logical cores also requires knowledge of their thread siblings. The deployer shall read this information from the Swift-stored data and use it to manually configure the deployment parameters for better performance.
For example, to list available NUMA-aware NICs:
$ lspci -nn | grep Eth
82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
82:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
85:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
To obtain the NUMA node ID of a PCI device through sysfs:
$ cat /sys/bus/pci/devices/0000\:85\:00.1/numa_node
1
To get the best performance we need to ensure that the CPU core and NIC are in
the same NUMA node. In the example above, the NIC with PCI address 85:00.1
is in NUMA node 1. For best performance, the NIC should preferably be used by
the DPDK Poll Mode Drivers (PMD) running on the CPU cores in NUMA node 1.
Otherwise, the best performance is not guaranteed, as the above-mentioned
association would be random.
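The sysfs lookup shown above can also be done programmatically. The following is a minimal sketch, not the collector's actual code; the function name pci_numa_node and the sysfs_root parameter are hypothetical and exist only so that the lookup can be exercised against a fake directory tree.

```python
from pathlib import Path


def pci_numa_node(pci_address, sysfs_root="/sys"):
    """Return the NUMA node ID of a PCI device, or None if it is unknown.

    pci_address is the full address including the domain, for example
    "0000:85:00.1". The kernel reports -1 when no NUMA affinity is known,
    which is mapped to None here.
    """
    path = (Path(sysfs_root) / "bus" / "pci" / "devices"
            / pci_address / "numa_node")
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        # Device missing, file unreadable, or unexpected content.
        return None
    return node if node >= 0 else None
```

On the host from the example, pci_numa_node("0000:85:00.1") would correspond to the cat command shown earlier.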
This spec shall ensure that the NUMA parameters are available to the deployer, so that PMDs use the right logical CPUs for better performance.
Proposed change¶
The collected data will be stored as a blob in Swift. Future work may introduce an Inspector plug-in to further process the NUMA architecture data. A new optional NUMA topology collector shall fetch the following information about NUMA nodes:
List of NUMA nodes - Shall be fetched from
/sys/devices/system/node/node<numa_node_id>
List of CPU cores associated with each NUMA node - Shall be fetched from
/sys/devices/system/node/node<numa_node_id>/cpu<thread_id>/topology/core_id
List of thread_siblings for each core - Shall be fetched from
/sys/devices/system/node/node<numa_node_id>/cpu<thread_id>
NUMA node ID for network interfaces - Shall be fetched from
/sys/class/net/<interface name>/device/numa_node
RAM available for each NUMA node - Shall be fetched from
/sys/devices/system/node/node<numa_node_id>/meminfo
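The sysfs reads listed above could be combined roughly as follows. This is a sketch under stated assumptions, not the IPA implementation: all function names, the nic_names argument and the sysfs_root parameter are hypothetical, thread siblings are derived by grouping thread IDs that share a core_id, and the MemTotal line of meminfo is assumed to be the RAM figure of interest.

```python
import re
from collections import defaultdict
from pathlib import Path

NODE_DIR = re.compile(r"node(\d+)")
CPU_DIR = re.compile(r"cpu(\d+)")


def numa_node_dirs(sysfs_root="/sys"):
    """Return [(numa_node_id, path), ...] sorted by node ID."""
    base = Path(sysfs_root) / "devices" / "system" / "node"
    found = []
    for entry in base.glob("node*"):
        match = NODE_DIR.fullmatch(entry.name)
        if match:
            found.append((int(match.group(1)), entry))
    return sorted(found, key=lambda item: item[0])


def node_ram_kb(node_dir):
    """Return the MemTotal value (in kB) from node<N>/meminfo, or None."""
    for line in (node_dir / "meminfo").read_text().splitlines():
        if "MemTotal" in line:
            # The line looks like: "Node 0 MemTotal:   2097152 kB"
            return int(line.split()[-2])
    return None


def node_cpus(node_id, node_dir):
    """Group thread IDs under node<N>/cpu<thread_id> by their core_id."""
    siblings = defaultdict(list)
    for entry in node_dir.glob("cpu*"):
        match = CPU_DIR.fullmatch(entry.name)
        if not match:
            continue  # skips files such as cpulist and cpumap
        core_id = int((entry / "topology" / "core_id").read_text().strip())
        siblings[core_id].append(int(match.group(1)))
    return [
        {"cpu": core, "numa_node": node_id, "thread_siblings": sorted(threads)}
        for core, threads in sorted(siblings.items())
    ]


def nic_numa_node(name, sysfs_root="/sys"):
    """Return the NUMA node of a network interface, or None if unreadable."""
    path = Path(sysfs_root) / "class" / "net" / name / "device" / "numa_node"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def collect_numa_topology(nic_names, sysfs_root="/sys"):
    """Assemble RAM, CPU and NIC information for every NUMA node."""
    topology = {"ram": [], "cpus": [], "nics": []}
    for node_id, node_dir in numa_node_dirs(sysfs_root):
        topology["ram"].append(
            {"numa_node": node_id, "size_kb": node_ram_kb(node_dir)})
        topology["cpus"].extend(node_cpus(node_id, node_dir))
    for name in nic_names:
        topology["nics"].append(
            {"name": name, "numa_node": nic_numa_node(name, sysfs_root)})
    return {"numa_topology": topology}
```

Taking the sysfs root as a parameter keeps the sketch testable against a temporary directory tree instead of a real NUMA machine.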
Alternatives¶
Another option would be to deploy with the default parameters, identify the actual values from the compute nodes, and then re-configure the correct parameters and re-deploy. The proposed changes provide the deployer with the NUMA topology details during the introspection stage and thereby avoid the need for redeployment.
Data model impact¶
The data structure for storing the information on NUMA nodes, CPUs, thread siblings, RAM and NICs shall be:
{
  "numa_topology": {
    "ram": [{"numa_node": <numa_node_id>, "size_kb": <memory_in_kb>}, ...],
    "cpus": [
      {"cpu": <cpu_id>, "numa_node": <numa_node_id>, "thread_siblings": [<list of sibling threads>]},
      ...
    ],
    "nics": [
      {"name": "<network interface name>", "numa_node": <numa_node_id>},
      ...
    ]
  }
}
Where:
ram
    a mapping from memory available to a NUMA node
cpus
    a mapping from physical CPU ID to a NUMA node and a list of sibling threads
nics
    a mapping from NIC names to NUMA node
Example:
{
  "numa_topology": {
    "ram": [
      {"numa_node": 0, "size_kb": 2097152},
      {"numa_node": 1, "size_kb": 1048576}
    ],
    "cpus": [
      {"cpu": 0, "numa_node": 0, "thread_siblings": [0,1]},
      {"cpu": 1, "numa_node": 0, "thread_siblings": [2,3]},
      ...,
      {"cpu": 0, "numa_node": 1, "thread_siblings": [16,17]},
      {"cpu": 1, "numa_node": 1, "thread_siblings": [18,19]},
      ...
    ],
    "nics": [
      {"name": "ixgbe0", "numa_node": 0},
      {"name": "ixgbe1", "numa_node": 1}
    ]
  }
}
Note
In the cpus list, the cpu and numa_node fields together form a unique value,
because cpu_id is specific to a NUMA node. The thread IDs listed in
thread_siblings are unique across NUMA nodes.
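To illustrate how a deployer might consume this structure, the sketch below lists the logical CPUs that sit on the same NUMA node as a given NIC, which are the candidates for PMD threads. The function name pmd_cpu_candidates is hypothetical, and the sample data is the example above with the elided entries left out.

```python
def pmd_cpu_candidates(introspection_data, nic_name):
    """Return sorted thread IDs located on the NUMA node of the given NIC.

    Raises KeyError if the NIC is not present in the introspection data.
    """
    topology = introspection_data["numa_topology"]
    nic_nodes = {nic["name"]: nic["numa_node"] for nic in topology["nics"]}
    node = nic_nodes[nic_name]
    threads = []
    for cpu in topology["cpus"]:
        # cpu IDs repeat across nodes, so always filter on numa_node.
        if cpu["numa_node"] == node:
            threads.extend(cpu["thread_siblings"])
    return sorted(threads)


sample = {
    "numa_topology": {
        "ram": [
            {"numa_node": 0, "size_kb": 2097152},
            {"numa_node": 1, "size_kb": 1048576},
        ],
        "cpus": [
            {"cpu": 0, "numa_node": 0, "thread_siblings": [0, 1]},
            {"cpu": 1, "numa_node": 0, "thread_siblings": [2, 3]},
            {"cpu": 0, "numa_node": 1, "thread_siblings": [16, 17]},
            {"cpu": 1, "numa_node": 1, "thread_siblings": [18, 19]},
        ],
        "nics": [
            {"name": "ixgbe0", "numa_node": 0},
            {"name": "ixgbe1", "numa_node": 1},
        ],
    }
}

print(pmd_cpu_candidates(sample, "ixgbe1"))  # prints [16, 17, 18, 19]
```

Because thread IDs are unique across NUMA nodes, the returned list can be used directly as a CPU list, whereas the cpu field alone cannot.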
HTTP API impact¶
None
Client (CLI) impact¶
None
Ironic python agent impact¶
The changes proposed above will be implemented in IPA.
Performance and scalability impact¶
None.
Security impact¶
None
Deployer impact¶
The deployer shall enable the optional NUMA topology collector via the
ipa-inspection-collectors kernel argument. The deployer will then be able to
get information about memory per NUMA node, CPUs, thread siblings and NICs,
which can be used to configure the system for better performance.
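As a rough illustration of how the kernel argument is interpreted, the sketch below extracts the collector list from a kernel command line string. The helper name inspection_collectors is hypothetical, the collector name "numa-topology" used in the usage line is an assumption (this spec does not fix the name), and the value is assumed to be a comma-separated list.

```python
def inspection_collectors(cmdline):
    """Return the collector names from ipa-inspection-collectors=a,b,...

    cmdline is the kernel command line as a single string, for example
    the content of /proc/cmdline. Returns an empty list if the argument
    is absent.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "ipa-inspection-collectors" and sep:
            return [name for name in value.split(",") if name]
    return []


# "numa-topology" is an assumed collector name, used for illustration only.
cmdline = "console=ttyS0 ipa-inspection-collectors=default,numa-topology"
enabled = "numa-topology" in inspection_collectors(cmdline)
```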
Developer impact¶
None
Implementation¶
Assignee(s)¶
karthiks
Work Items¶
Implement the collector to fetch the NUMA topology information in IPA
Dependencies¶
None
Testing¶
Unit test cases will be added.