Cyborg Xilinx FPGA Driver Proposal

https://blueprints.launchpad.net/openstack-cyborg/+spec/add-xilinx-fpga-driver

This spec proposes to provide the initial design for Cyborg’s Xilinx FPGA driver.

Problem description

We want to use Cyborg manage the Xilinx FPGA devices, but there is not driver for it, and we want to add the Xilinx FPGA driver to support managing Xilinx FPGA devices by Cyborg.

Use Cases

  • As an operator, I would like to use Cyborg agent starts or does resource checking periodically, the Cyborg Xilinx FPGA driver should provide discover() function to enumerate the list of the Xilinx FPGA devices, and report the details of all available Xilinx FPGA accelerators on the host, such as PID(Product id), VID(Vendor id), Device.

  • As a user, I would like to boot up a VM with Xilinx FPGA device. Cyborg should be able to manage this kind of acceleration resources, assign it to the VM(binding) and programming it.

Proposed changes

Xilinx FPGA is PCIe based platform that is comprised of physical partitions called Shell and User. The Shell has two physical functions: privileged PF0 also called mgmt pf and non-privileged PF1 also called user pf. Shell provides basic infrastructure for the Alveo platform. User partition (otherwise known as PR-Region) contains user compiled binary.[1]_

User PF is only displayed after installing Xilinx driver xrt. And the difference between the PCI address of user PF and mgmt PF is the func field. For example, pci address of mgmt PF is ‘0000:3b:00.0’, and the address of user PF is ‘0000:3b:00.1’. The func field increase 1. So the pci address of user PF can be calculated by the address of mgmt PF. This calculation rule also applies to ‘product’.

From the above description, the Xilinx FPGA driver should implement the following methods:

  • discover(): This function works by executing lspci command and reports devices’ raw info sample as following:

    [
      {
      "vendor": "10ee",
      "product": "5000",
      "device": "0000:3b:00.0"
      }
    ]
    

    In above example, “5000” is the mgmt PF. So the driver should add three traits for one device, including “CUSTOM_FPGA_<VENDOR_ID>”, “CUSTOM_FPGA_PRODUCT_ID_<ID>”. Also the driver should create records in deployable table only for the device. For now, only when both mgmt PF and user PF are binded to one vm, and end user can program the FPGA device. So there are two attach handles when binding it to vm.

  • program(): This function works by executing xbmgmt program command which is included with the XRT installation package.[2]_ Also, this function should be called for pre-program before binding to vm.

Below is the objects to describe a FPGA device which complies with the Cyborg database mode and Placement data model.

Hardware     Driver objects       Placement data model
   |               |                      |
1 FPGA         1 devices                  |
   |               |                      |
   |         1 deployable       ---> resource_provider
   |               |            ---> parent resource_provider: compute node
   |               |                      |
1 FPGA       2 attach_handle    ---> inventories(total:1)

Alternatives

None

Data model impact

Xilinx FPGA driver will not touch Data model. The Cyborg Agent can call Xilinx FPGA driver to update the database during the discover, bind and unbind operations.

REST API impact

None

Security impact

None

Notifications impact

None

Other end user impact

None

Performance Impact

None

Other deployer impact

  • Add new driver xilinx_fpga_driver to enabled_drivers for agent.

Developer impact

None

Implementation

Assignee(s)

Primary assignee:

Eric Xie

Work Items

  • Implement Xilinx FPGA driver in Cyborg

  • Add related test cases.

Dependencies

None

Testing

  • Unit tests will be added to test this driver.

Documentation Impact

Document Xilinx FPGA driver in Cyborg project. Test report in Cyborg wiki

References

History

Revisions

Release

Description

Yoga

Introduced