Remove resource information from os-hypervisors
API¶
https://blueprints.launchpad.net/nova/+spec/modernize-os-hypervisors-api
The os-hypervisors
API is around since the early days of nova. Back in
those halcyon days, it provided a nice way to get a quick summary of the
resource usage of individual compute nodes in your deployment. Today, however,
it’s a shell of itself. The more detailed information it returns is
hypervisor-specific and frequently wrong, especially with advanced features
like CPU pinning or file-based memory. With the elevation of placement to its
rightful place as lord of (almost) all things resource’y, the os-hypervisor
API needs to slim down significantly.
Problem description¶
The majority of this spec is focused on changes to the
/os-hypervisors/detail
API. Consider the output of a typical GET
/os-hypervisors/detail
call, as taken from the API ref [1]:
{
"hypervisors": [
{
"cpu_info": {
"arch": "x86_64",
"model": "Nehalem",
"vendor": "Intel",
"features": [
"pge",
"clflush"
],
"topology": {
"cores": 1,
"threads": 1,
"sockets": 4
}
},
"current_workload": 0,
"status": "enabled",
"state": "up",
"disk_available_least": 0,
"host_ip": "1.1.1.1",
"free_disk_gb": 1028,
"free_ram_mb": 7680,
"hypervisor_hostname": "host1",
"hypervisor_type": "fake",
"hypervisor_version": 1000,
"id": 2,
"local_gb": 1028,
"local_gb_used": 0,
"memory_mb": 8192,
"memory_mb_used": 512,
"running_vms": 0,
"service": {
"host": "host1",
"id": 6,
"disabled_reason": null
},
"vcpus": 2,
"vcpus_used": 0
}
],
"hypervisors_links": [
{
"href": "http://openstack.example.com/v2.1/6f70656e737461636b20342065766572/os-hypervisors/detail?limit=1&marker=2",
"rel": "next"
}
]
}
The fields here broadly fall into three categories: useful but duplicated in the summary (non-detailed) view, useful and unique to the detailed view, and not useful. First, the useful but duplicated fields. These should remain:
id
status
state
hypervisor_hostname
Next, the useful fields unique to the detailed view. These should also remain:
host_ip
hypervisor_type
hypervisor_version
service
Finally, the useless fields. There are varied reasons their uselessness, described below, but all should be removed:
current_workload
This tracks “the number of tasks the hypervisor is responsible for” and it “will be equal or greater than the number of active VMs on the system (it can be greater when VMs are being deleted and the hypervisor is still cleaning up)” [2]. This information is easily calculated by listing active and deleted instances.
cpu_info
Useful at face value but the only thing relevant for scheduling purposes are the CPU architecture and CPU features, all of which are already handled by placement trait requests. The topology field is an oddity that should likely never have been added. It’s not usable in scheduling and is possibly wrong, given it doesn’t reflect offline CPUs or those not available to nova due to configuration, and it doesn’t handle non-uniform CPU topologies where there are e.g. more cores on one socket than another. If the operator wants this information, they can simply inspect the host like they would have to do to identify e.g. the specifics of PCI devices or storage devices.
free_disk_gb
,local_gb
,local_gb_used
💩 Almost always wrong if shared storage is in use and doesn’t take overcommit into account. Use placement.
disk_available_least
Reflects the estimated available disk space on the hypervisor if all instances on the host were to use all their allocated disk. This can go negative if disk overcommit is enabled or if an instance is force migrated to a host, bypassing the scheduler. This value is hard to use and frequently misunderstood by end-users.
free_ram_mb
,memory_mb
,memory_mb_used
Doesn’t take overcommit or non-default pagesizes into account. Use placement.
vcpus
,vcpus_used
Doesn’t take overcommit or PCPU inventory into account. Use placement.
running_vms
Easily figured out by filtering running instances by host (admin-only, like this API).
In addition to the changes to the /os-hypervisors/detail
API, there are
also two other APIs that appear to have outlived their usefulness:
/os-hypervisors/statistics
, which provides summary information of the
above, and /os-hypervisors/{hypervisor_id}/uptime
, which provides an entire
API to a single figure. The former can be removed entirely in favour of
placement, while the latter can be replaced by a new uptime
field on the
/os-hypervisors/{hypervisor_id}
API.
Use Cases¶
As a user, I don’t want to see misleading information reported from my API.
Proposed change¶
Remove the resource-related fields from the output of the
/os-hypervisors/detail
API and remove the /os-hypervisors/statistics
API in its entirety.
Alternatives¶
We could document the incorrect nature of these APIs. This is less desirable since people don’t read documentation.
Data model impact¶
None.
REST API impact¶
Starting from the new API microversion, the /os-hypervisors/detail
API will
no longer include the following fields in its response: cpu_info
,
free_disk_gb
, local_gb
, local_gb_used
, disk_available_least
,
free_ram_mb
, memory_mb
, memory_mb_used
, vcpus
, vcpus_used
,
and running_vms
.
The /os-hypervisors/statistics
API contains summary information of the
above fields and will be removed entirely, returning a HTTP 404 (Not Found) on
the new API microversion.
The uptime information shown by the /os-hypervisors/{hypervisor_id}/uptime
API doesn’t warrant its own API and will also return a HTTP 404 (Not Found) on
the new API microversion. This information will be accessible via a new
uptime
field on responses from the /os-hypervisors/{hypervisor_id}
API.
Security impact¶
None.
Notifications impact¶
None.
Other end user impact¶
The clients will need to be updated. Documentation referencing these APIs will
need to be updated with recommendations to look at placement or other APIs
instead. Regarding the changes to the /os-hypervisors/detail
API:
The
free_disk_gb
,local_gb
,local_gb_used
,free_ram_mb
,memory_mb
,memory_mb_used
,vcpus
andvcpus_used
values can be identified using a combination ofopenstack resource provider inventory list
andopenstack resource provider usage show
. For example:$ openstack hypervisor show devstack-1 \ -c local_gb -c local_gb_used -c free_disk_gb \ -c memory_mb -c memory_mb_used -c free_ram_mb \ -c vcpus -c vcpus_used +----------------+-------+ | Field | Value | +----------------+-------+ | local_gb | 18 | | local_gb_used | 1 | | free_disk_gb | 19 | | memory_mb | 16035 | | memory_mb_used | 1024 | | free_ram_mb | 15011 | | vcpus | 12 | | vcpus_used | 1 | +----------------+-------+ $ openstack resource provider inventory list bde27f9d-1249-446f-ae14-45f6ff3e63d5 +----------------+------------------+----------+----------+----------+-----------+-------+ | resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | +----------------+------------------+----------+----------+----------+-----------+-------+ | VCPU | 16.0 | 1 | 12 | 0 | 1 | 12 | | MEMORY_MB | 1.5 | 1 | 16035 | 512 | 1 | 16035 | | DISK_GB | 1.0 | 1 | 19 | 0 | 1 | 19 | +----------------+------------------+----------+----------+----------+-----------+-------+ $ openstack resource provider usage show bde27f9d-1249-446f-ae14-45f6ff3e63d5 +----------------+-------+ | resource_class | usage | +----------------+-------+ | VCPU | 1 | | MEMORY_MB | 512 | | DISK_GB | 1 | +----------------+-------
The
running_vms
value can be identified using by filter instances by host usingopenstack server list --host <HOST>
. For example:$ openstack hypervisor show devstack-1 -c running_vms +-------------+-------+ | Field | Value | +-------------+-------+ | running_vms | 1 | +-------------+-------+ $ openstack server list --host devstack-1 -c ID -f yaml | wc -l 1
Note
This is not a 1:1 replacement since the
running_vms
setting will track all VMs running on the hypervisor, including those not managed by nova. However, having VMs not managed by nova on a hypervisor is considered a misconfiguration and is irrelevant for scheduling purposes.The
cpu_info.arch
andcpu_info.features
values are published as traits and can be inspected usingopenstack resource provider trait list
. For example:$ openstack hypervisor show devstack-1 -f yaml -c cpu_info cpu_info: '{"arch": "x86_64", "model": "IvyBridge-IBRS", "vendor": "Intel", "topology": {"cells": 2, "sockets": 1, "cores": 3, "threads": 2}, "features": ["xsaveopt", "erms", "ssbd", "arch-capabilities", "nx", "cx16", "ht", "mca", "tsc-deadline", "amd-ssbd", "pcid", "pse", "ss", "syscall", "md-clear", "tsc_adjust", "mmx", "rdtscp", "f16c", "fxsr", "lahf_lm", "spec-ctrl", "smep", "pse36", "vme", "de", "sse", "xsave", "clflush", "cmov", "msr", "pat", "aes", "hypervisor", "mtrr", "sep", "fsgsbase", "tsc", "sse2", "apic", "pdpe1gb", "cx8", "umip", "vmx", "pae", "skip-l1dfl-vmentry", "popcnt", "ssse3", "avx", "pclmuldq", "x2apic", "lm", "stibp", "fpu", "ibpb", "rdrand", "sse4.1", "pni", "pge", "sse4.2", "pschange-mc-no", "mce", "arat"]}' $ openstack --os-placement-api-version 1.8 \ resource provider trait list bde27f9d-1249-446f-ae14-45f6ff3e63d5 | grep CPU | HW_CPU_X86_AMD_SVM | | HW_CPU_X86_SSE2 | | HW_CPU_X86_SSE | | HW_CPU_X86_SVM | | HW_CPU_HYPERTHREADING | | HW_CPU_X86_MMX |
The
disk_available_least
,cpu_info.model
,cpu_info.vendor
andcpu_info.topology
values are not relevant for scheduling and therefore have no direct replacement in placement or another API. They can, however, be identified through inspection of the host.
Horizon will need to be updated to talk to placement or use this API with an
older microversion. Similarly, any users of the
/os-hypervisors/{hypervisor_id}/uptime
API will need to use the
/os-hypervisors/{hypervisor_id}
API instead.
Performance Impact¶
None.
Other deployer impact¶
None.
Developer impact¶
None.
Upgrade impact¶
None.
Implementation¶
Assignee(s)¶
- Primary assignee:
stephenfinucane
- Other contributors:
None
Feature Liaison¶
None.
Work Items¶
Update the APIs in a new microversion.
Update the documentation to remove references to these deprecated APIs.
Update the clients to reflect the deprecations.
Dependencies¶
None.
Testing¶
Unit and functional tests. Tempest tests will need to be updated to cap against the latest microversion to support these APIs.
Documentation Impact¶
References to the APIs will need to be removed or updated.
References¶
None.
History¶
Release Name |
Description |
---|---|
Wallaby |
Introduced |
Wallaby |
Updated to remove references to policy changes, which were deferred. |