On-demand Generate PCI Device Pools

https://blueprints.launchpad.net/nova/+spec/pci-stats-generate

This proposal is to generate PCI device pool information on-the-fly instead of storing the summary pool information in the database.

Problem description

The compute_nodes table currently stores a JSON representation of PCI device “pools” in its pci_stats field. The nova-compute resource tracker updates this information through the nova.pci.stats.PciDevStats class, and the Nova scheduler reads it on each iteration of the select_destinations() call, when the nova.objects.ComputeNodeList.get_all() method pulls information about all compute nodes in the system. The scheduler reads this summary information to avoid sending messages containing thousands of PCI device records across the wire.

The problem with storing this summary information in the compute_nodes table is two-fold:

1) There is the possibility that the summary information can get out of sync with the non-summary information stored in the pci_devices table, and

2) It interferes with our efforts to represent all resources in the system in a consistent and generic fashion (the resource-objects blueprint work).

Use Cases

As a developer of Nova, I want to be able to represent all quantitative resources in the system in a consistent and generic fashion. As an operator, I do not want summary and detail information in my database to get out of sync.

Proposed change

We propose the following changes to the Nova code base:

1) Temporarily duplicate the logic of the nova.pci.stats.PciDevStats.support_requests() method into the nova.objects.PciDevicePoolList object.

2) Move the logic for determining if a compute node can provide a requested PCI device to an instance from the nova.pci.stats.PciDevStats.consume_requests() method to the nova.pci.manager.PciDevTracker._claim_for_instance() method.

3) Modify the nova.objects.ComputeNode object to load the pci_device_pools field on demand using a subquery, instead of reading it from the compute_nodes.pci_stats field in the database. The PciDevicePoolList object can be generated using a single SQL query on the pci_devices table, like so:

SELECT product_id, vendor_id, numa_node, COUNT(*) as count
FROM pci_devices
WHERE compute_node_id = ?
GROUP BY product_id, vendor_id, numa_node;

This will only be used for legacy compute nodes that rely on the nova.objects.ComputeNode.pci_device_pools field attribute.

4) Change the scheduler’s host manager to load PCI device pool information using a new nova.objects.PciDevicePoolList.get_all method that returns all PCI device pool information for all compute nodes, but only when the PciPassthroughFilter is enabled. This will match how the HostAggregate information is loaded by the scheduler and collated to HostState objects.

The SQL statement for grabbing all of the PCI device pool information for compute nodes looks like this:

SELECT compute_node_id, product_id, vendor_id, numa_node, COUNT(*) as count
FROM pci_devices
GROUP BY compute_node_id, product_id, vendor_id, numa_node;

5) Change the nova.scheduler.pci_passthrough_filter.host_passes method to use the nova.objects.PciDevicePoolList.supports_requests() method instead of the nova.pci.stats.PciDevStats.support_requests() method.

6) Remove the nova.pci.stats module entirely.

7) Deprecate the compute_nodes.pci_stats field in the database and mark it for removal in the N release.
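The two aggregate queries above can be tried outside Nova. What follows is a minimal sketch, not Nova code. It assumes a cut-down pci_devices table holding only the columns the queries reference, uses Python's sqlite3 module in place of Nova's database layer, and introduces a hypothetical helper, load_pools_by_node, that collates the all-nodes result by compute node in the way item 4 describes.

```python
import sqlite3
from collections import defaultdict


def load_pools_by_node(conn):
    """Return {compute_node_id: [pool, ...]} computed on demand.

    Hypothetical helper: it runs the all-nodes GROUP BY query from the
    spec and groups the resulting pool rows by compute node.
    """
    rows = conn.execute(
        "SELECT compute_node_id, product_id, vendor_id, numa_node, "
        "COUNT(*) AS count "
        "FROM pci_devices "
        "GROUP BY compute_node_id, product_id, vendor_id, numa_node"
    )
    pools = defaultdict(list)
    for node_id, product_id, vendor_id, numa_node, count in rows:
        pools[node_id].append({
            "product_id": product_id,
            "vendor_id": vendor_id,
            "numa_node": numa_node,
            "count": count,
        })
    return dict(pools)


# Assumed, simplified schema: only the columns used by the queries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pci_devices ("
    "id INTEGER PRIMARY KEY, compute_node_id INTEGER, "
    "product_id TEXT, vendor_id TEXT, numa_node INTEGER)"
)
conn.executemany(
    "INSERT INTO pci_devices "
    "(compute_node_id, product_id, vendor_id, numa_node) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "1520", "8086", 0),  # illustrative device rows
        (1, "1520", "8086", 0),
        (1, "1520", "8086", 1),
        (2, "0443", "1137", 0),
    ],
)

pools_by_node = load_pools_by_node(conn)
for node_id in sorted(pools_by_node):
    print(node_id, pools_by_node[node_id])
```

Because the pools are derived from the pci_devices rows at read time, there is no stored summary that can drift away from the detail records.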

Alternatives

None.

Data model impact

None. This changes only the implementation of existing model definitions.

REST API impact

None.

Security impact

None.

Notifications impact

None.

Other end user impact

None.

Performance Impact

None. The information sent over the wire remains the same. The summary PCI device information is generated by a single aggregate query against the pci_devices table, which should be very quick.

Other deployer impact

None.

Developer impact

This will allow the resource-objects blueprint to proceed, since PCI device resources can then be handled in the same way as NUMA or other quantitative resources.

Implementation

Assignee(s)

Primary assignee:

dstepanenko

Other contributors:

jaypipes

Work Items

1) Duplicate the nova.pci.stats.PciDevStats.support_requests() method into the nova.objects.PciDevicePoolList object.

2) Move nova.pci.stats.PciDevStats.consume_requests() to the nova.pci.manager.PciDevTracker._claim_for_instance() method.

3) Modify the nova.objects.ComputeNode object to load the pci_device_pools field on demand.

4) Change the scheduler host manager to load the PciDevicePoolList object for all compute nodes in the same way that host aggregate information is loaded, and only when the PciPassthroughFilter is enabled.

5) Change the nova.scheduler.pci_passthrough_filter.host_passes method to use the nova.objects.PciDevicePoolList.supports_requests() method.

6) Remove the nova.pci.stats module entirely.

7) Annotate the nova.db.sqlalchemy.models.ComputeNode.pci_stats field in the database as deprecated.
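Work items 1 and 5 rely on a read-only check of whether a host's pools can satisfy a set of PCI requests. The sketch below illustrates one way such a check might look; it is not the Nova implementation. The Pool and Request classes, their fields, and the standalone supports_requests function are all hypothetical, and the matching rule (every key in a request's spec must equal the pool's attribute) is an assumed simplification.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical stand-in for one PCI device pool."""
    vendor_id: str
    product_id: str
    numa_node: int
    count: int


@dataclass
class Request:
    """Hypothetical PCI request: `count` devices matching `spec`."""
    count: int
    spec: dict  # e.g. {"vendor_id": "8086", "product_id": "1520"}


def supports_requests(pools, requests):
    """Return True if the pools can satisfy every request.

    The check keeps a private tally of remaining devices, so the
    caller's pools are never decremented.
    """
    remaining = [pool.count for pool in pools]
    for request in requests:
        needed = request.count
        for i, pool in enumerate(pools):
            if needed == 0:
                break
            matches = all(
                getattr(pool, key, None) == value
                for key, value in request.spec.items()
            )
            if matches:
                take = min(needed, remaining[i])
                remaining[i] -= take
                needed -= take
        if needed > 0:
            return False
    return True


pools = [
    Pool("8086", "1520", 0, 2),
    Pool("8086", "1520", 1, 1),
]
spec = {"vendor_id": "8086", "product_id": "1520"}

print(supports_requests(pools, [Request(3, spec)]))
print(supports_requests(pools, [Request(4, spec)]))
print(supports_requests(pools, [Request(1, {"vendor_id": "1137"})]))
```

The allocation here is greedy and processes requests in order, so it can reject a combination of overlapping requests that a smarter assignment would satisfy. That is acceptable for an illustration but would need care in a real filter.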

Dependencies

None.

Testing

There should be a net reduction in unit tests, since the logic for decrementing the PCI device pool counts will be removed entirely.

Documentation Impact

None. No user-facing changes.

References

None.

History

None.