Add Capacity Factors to scheduler get pools API¶
https://blueprints.launchpad.net/cinder/+spec/add-capacity-factors-to-pool
This spec proposes adding the new capacity factors in the response to the get-pools API request.
Problem description¶
Deployments typically run monitoring and alerting against cinder and its storage arrays to warn the cloud operator when storage is running low or has run out. Today, the only way to see the capacity of the backends that cinder manages is to call the get-pools API. It returns some basic information about each backend/pool, but it doesn’t tell the whole story.
The new capacity factors are the aspects of capacity management that cinder calculates on each provisioning request. Each factor is used at runtime to decide whether a volume can land on a particular pool. Whether a pool is considered full depends on several inputs, including:
thin or thick provisioning support
total capacity reported by the storage array
free capacity reported by the storage array
reserved percentage
max over subscription ratio
From those configuration settings and pool capabilities a wider set of factors is calculated to determine the capacity availability on a backend/pool. All of those factors determine if a backend/pool has space, is running out of space, or is out of space for cinder to use.
Use Cases¶
As an operator of cinder, I want cinder to report all of the factors it uses and calculates to determine whether its backends/pools have space available. Only cinder has all of the information, and calculate_capacity_factors() produces the values the scheduler uses. Cinder should also report those values in the get-pools API response, so that monitoring and alerting tooling stays in sync with the cinder scheduler.
Proposed change¶
Add the dictionary created by calculate_capacity_factors() to each pool’s information in the get-pools response. The capacity factors are not the same thing as the capabilities being reported now: they are the breakdown of the capacity calculations, based on the capabilities the driver reports for the backend storage. Depending on each pool’s capabilities, such as thin_provisioning_support and thick_provisioning_support, one set of factors is calculated for each provisioning type (thin and thick) that the pool supports. For example, for thick provisioning:
{
    "total_capacity": 5120.0,
    "free_capacity": 4616,
    "reserved_capacity": 1024,
    "total_reserved_available_capacity": 4096,
    "max_over_subscription_ratio": None,
    "total_available_capacity": 4096,
    "provisioned_capacity": 500,
    "calculated_free_capacity": 3596,
    "virtual_free_capacity": 3596,
    "free_percent": 87.79296875,
    "provisioned_ratio": 0.1220703125,
    "provisioned_type": "thick"
}
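The arithmetic behind the example above can be sketched in Python. This is an illustration inferred only from the sample values in this spec, not the calculate_capacity_factors() implementation from the dependent review. The function name `sketch_capacity_factors`, its signature, the rounding of the reserved capacity (floor), and the reserved percentage of 20 used below (1024 of 5120) are all assumptions.

```python
import math


def sketch_capacity_factors(total_capacity, free_capacity, provisioned_capacity,
                            reserved_percentage,
                            max_over_subscription_ratio=None, thin=False):
    """Hypothetical helper that reproduces the example's factor arithmetic.

    Not Cinder's calculate_capacity_factors(); the formulas are inferred
    from the sample values shown in this spec.
    """
    total = float(total_capacity)
    # Assumption: reserved space is rounded down to a whole unit.
    reserved = math.floor(total * reserved_percentage / 100)
    total_reserved_available = total - reserved

    if thin:
        if max_over_subscription_ratio is None:
            raise ValueError("thin provisioning needs an oversubscription ratio")
        # Thin pools may over-commit the usable space by the configured ratio.
        total_available = total_reserved_available * max_over_subscription_ratio
        provisioned_type = "thin"
    else:
        total_available = total_reserved_available
        max_over_subscription_ratio = None
        provisioned_type = "thick"

    calculated_free = total_available - provisioned_capacity
    # Guard against empty pools to avoid dividing by zero.
    has_space = total_available > 0
    return {
        "total_capacity": total,
        "free_capacity": free_capacity,
        "reserved_capacity": reserved,
        "total_reserved_available_capacity": total_reserved_available,
        "max_over_subscription_ratio": max_over_subscription_ratio,
        "total_available_capacity": total_available,
        "provisioned_capacity": provisioned_capacity,
        "calculated_free_capacity": calculated_free,
        "virtual_free_capacity": calculated_free,
        "free_percent": calculated_free / total_available * 100 if has_space else 0.0,
        "provisioned_ratio": provisioned_capacity / total_available if has_space else 0.0,
        "provisioned_type": provisioned_type,
    }


if __name__ == "__main__":
    factors = sketch_capacity_factors(5120.0, 4616, 500, reserved_percentage=20)
    print(factors["free_percent"], factors["provisioned_ratio"])
```

Under these assumptions the sketch yields the same reserved, available, and free values as the thick example above. Whether the real function derives calculated_free_capacity from the provisioned capacity or from the array-reported free capacity should be checked against the dependent review.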
Alternatives¶
The calculate_capacity_factors() function in utils.py could be copied into external alerting tooling, but the copy is liable to drift out of sync with cinder’s version. Cinder itself should therefore report the capacity factors it calculates for each backend/pool, making cinder the definitive source for the capacity it sees and can use.
Data model impact¶
None
REST API impact¶
A new microversion is required. When a request uses a compatible microversion, cinder returns the capacity_factors entries for each pool in the pools response.
GET /v3/{project_id}/scheduler-stats/get_pools?detail=True
{
    "pools": [
        {
            "name": "pool1",
            "capabilities": {
                "updated": "2014-10-28T00:00:00-00:00",
                "total_capacity_gb": 1024,
                "free_capacity_gb": 100,
                "volume_backend_name": "pool1",
                "reserved_percentage": 5,
                "driver_version": "1.0.0",
                "storage_protocol": "iSCSI",
                "QoS_support": false,
                "thin_provisioning_support": true,
                "thick_provisioning_support": true
            },
            "capacity_factors": [
                {
                    "total_capacity": 1024,
                    "free_capacity": 100,
                    "reserved_capacity": 51,
                    "total_reserved_available_capacity": 973,
                    "max_over_subscription_ratio": null,
                    "total_available_capacity": 973,
                    "provisioned_capacity": 100,
                    "calculated_free_capacity": 873,
                    "virtual_free_capacity": 873,
                    "free_percent": 89.72,
                    "provisioned_ratio": 0.1028,
                    "provisioned_type": "thick"
                },
                {
                    "total_capacity": 1024,
                    "free_capacity": 100,
                    "reserved_capacity": 51,
                    "total_reserved_available_capacity": 973,
                    "max_over_subscription_ratio": 2,
                    "total_available_capacity": 1946,
                    "provisioned_capacity": 100,
                    "calculated_free_capacity": 1846,
                    "virtual_free_capacity": 1846,
                    "free_percent": 94.86,
                    "provisioned_ratio": 0.05,
                    "provisioned_type": "thin"
                }
            ]
        }
    ]
}
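As a rough illustration of how monitoring tooling might consume the proposed response, the Python sketch below flags pools whose free_percent falls under a threshold. The helper name `pools_below_threshold`, the 90% threshold, and the trimmed sample payload are hypothetical; the payload is shaped like the example response above and reuses its values.

```python
import json

# Trimmed sample shaped like the proposed response; values are taken from
# the example in this spec. Only the fields the check needs are kept.
SAMPLE_RESPONSE = """
{
  "pools": [
    {
      "name": "pool1",
      "capacity_factors": [
        {"provisioned_type": "thick", "free_percent": 89.72},
        {"provisioned_type": "thin", "free_percent": 94.86}
      ]
    }
  ]
}
"""


def pools_below_threshold(response, min_free_percent):
    """Return (pool name, provisioned type, free percent) for low pools."""
    low = []
    for pool in response.get("pools", []):
        # Responses from older microversions would lack capacity_factors,
        # so a missing key is treated as "nothing to report".
        for factors in pool.get("capacity_factors", []):
            if factors["free_percent"] < min_free_percent:
                low.append((pool["name"],
                            factors["provisioned_type"],
                            factors["free_percent"]))
    return low


if __name__ == "__main__":
    response = json.loads(SAMPLE_RESPONSE)
    for name, ptype, free in pools_below_threshold(response, 90.0):
        print(f"{name} ({ptype}): {free}% free")
```

Because the check reads the values cinder itself calculated, an alert raised this way would reflect the same view of capacity that the scheduler uses, rather than a separately maintained copy of the calculation.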
Security impact¶
None
Active/Active HA impact¶
None
Notifications impact¶
None
Other end user impact¶
None
Performance Impact¶
None
Other deployer impact¶
None
Developer impact¶
None
Implementation¶
Assignee(s)¶
- Primary assignee:
hemna (Walter A. Boring IV)
Work Items¶
Add new microversion
Add capacity_factors to the get-pools API response
Dependencies¶
This requires the new calculate_capacity_factors() function that is in the following review: https://review.opendev.org/c/openstack/cinder/+/831247
Testing¶
Add new unit tests that verify the factors are returned in the API call.
Documentation Impact¶
Add documentation to describe the capacity factors and the API response change.
References¶
This was discussed at length at the Zed PTG: https://etherpad.opendev.org/p/zed-ptg-cinder
YouTube video of the discussion: https://www.youtube.com/watch?v=6yuOlGckkGE
Capacity factor definitions: https://specs.openstack.org/openstack/cinder-specs/specs/queens/provisioning-improvements.html