Self-Service via Runbooks¶
https://bugs.launchpad.net/ironic/+bug/2027690
With the addition of service steps, combined with owner/lessee, we now have an opportunity to allow project members to self-serve many maintenance items by permitting them access to curated runbooks of steps.
This feature will primarily involve creating a new runbook concept, allowing lists of steps to be created and associated with a node via traits. These runbooks can then be used in lieu of a list of steps when performing manual cleaning or node servicing.
Problem description¶
Currently, project-scoped members using the Ironic API have limited ability to self-serve maintenance items. Ironic operators face a difficult choice: either give users broad access to nodes, allowing them to run arbitrary manual cleaning or service steps, or permit no self-service access to these maintenance items at all.
Use cases include:
As a project member, I can execute runbooks via Node Servicing without granting the ability to execute arbitrary steps on a node.
As a system manager, I want to store a list of steps to perform an action in an identical manner across many similar nodes.
Proposed change¶
The proposed change is to create a new API concept, runbooks, which can be used with any API flow which currently takes explicit lists of steps.
Those runbooks can then be used instead of a list of clean_steps or service_steps [0] when setting node provision state. These are expected to behave identically to API calls with clean_steps or service_steps provided, including honoring the disable_ramdisk field and providing explicit ordering rather than the priority-based ordering that is used in automated cleaning and deploys.
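The difference between the two orderings can be sketched as follows. This is a minimal illustration rather than Ironic's implementation: the function names and the step dictionaries are hypothetical, and the sketch assumes that priority-based flows run the highest priority first and skip steps whose priority is 0.

```python
def order_by_priority(steps):
    """Priority-based ordering, as used by automated flows.

    Assumption: highest priority runs first; priority 0 means disabled.
    """
    enabled = [s for s in steps if s.get("priority", 0) > 0]
    return sorted(enabled, key=lambda s: s["priority"], reverse=True)


def order_explicit(steps):
    """Explicit ordering, as used by runbooks: keep the order as given."""
    return list(steps)


# Illustrative steps only; the priorities are made up for this example.
steps = [
    {"interface": "deploy", "step": "erase_devices_metadata", "priority": 10},
    {"interface": "bios", "step": "apply_configuration", "priority": 0},
    {"interface": "raid", "step": "create_configuration", "priority": 20},
]

by_priority = [s["step"] for s in order_by_priority(steps)]
explicit = [s["step"] for s in order_explicit(steps)]
```

With explicit ordering, every listed step runs in the position the runbook author chose, regardless of any priority attached to it.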
Additionally, we will ensure that the full CRUD lifecycle of runbooks is made role-aware in the code, so that a project can limit who can create, delete, edit, or mark runbooks as public all as separate policy toggles. We will also ensure deployers can separately toggle the ability to run step-based flows via runbooks versus step-based flows with arbitrary step lists.
A runbook will only run on a node that has a trait equal to the runbook name. This ensures the runbook has been approved for use on a given piece of hardware, as an extra precaution against hardware breakage.
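A minimal sketch of that trait check, assuming node traits are available as a list of strings. The function name and the example trait names are hypothetical and only illustrate the exact-match rule described above.

```python
def runbook_approved_for_node(runbook_name, node_traits):
    """Return True only if the node carries a trait equal to the runbook name."""
    return runbook_name in set(node_traits)


# Hypothetical node traits; the runbook name doubles as the approving trait.
node_traits = ["CUSTOM_FIRMWARE_UPDATE", "HW_CPU_X86_VMX"]

approved = runbook_approved_for_node("CUSTOM_FIRMWARE_UPDATE", node_traits)
rejected = runbook_approved_for_node("CUSTOM_RAID_REBUILD", node_traits)
```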
Alternatives¶
We considered, originally, repurposing the existing deploy templates into a generic concept of templates. This was abandoned due to deploy templates containing implicit steps, making it difficult to reason about them. This is why we instead chose to call them runbooks, which are entirely specified as opposed to templates, which are partially specified and have implicit steps integrated.
Data model impact¶
Create new tables described below:
``runbooks`` (same as ``deploy_templates`` except addition of ``owner`` and ``public``)
- id (int, pkey)
- uuid
- name (string 255)
- public (bool) - When true, runbook is available for use by any project.
- owner (nullable string, usually a keystone project ID)
- disable_ramdisk - When true, similar behavior to disable_ramdisk in manual cleaning -- do not boot IPA
- extra json/string
- steps list of ids pointing to ``runbook_steps``
``runbook_steps``
- id (int, pkey)
- runbook_id (Foreign Key to runbooks.id)
- interface
- step
- args
- order (or some other field/method to indicate how the steps were ordered coming into the API)
Note: Ensure all queries to runbooks only pull in runbook_steps if needed.
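The tables and the note about loading steps only when needed can be sketched with an in-memory SQLite database. This is an illustration under assumptions, not the proposed migration: the column types, the ``step_order`` column name (standing in for the ``order`` field), the JSON-encoded ``args``, and the ``get_runbook`` helper are all hypothetical, and the relationship is modelled through the foreign key only.

```python
import json
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE runbooks (
    id INTEGER PRIMARY KEY,
    uuid TEXT NOT NULL,
    name TEXT NOT NULL,
    public INTEGER NOT NULL DEFAULT 0,
    owner TEXT,
    disable_ramdisk INTEGER NOT NULL DEFAULT 0,
    extra TEXT
);
CREATE TABLE runbook_steps (
    id INTEGER PRIMARY KEY,
    runbook_id INTEGER NOT NULL REFERENCES runbooks(id),
    interface TEXT NOT NULL,
    step TEXT NOT NULL,
    args TEXT,
    step_order INTEGER NOT NULL  -- assumed name for the "order" field
);
"""

RUNBOOK_COLUMNS = ("id", "uuid", "name", "public", "owner", "disable_ramdisk")


def get_runbook(conn, name, with_steps=False):
    """Fetch a runbook; query runbook_steps only when the caller asks for it."""
    row = conn.execute(
        "SELECT id, uuid, name, public, owner, disable_ramdisk "
        "FROM runbooks WHERE name = ?",
        (name,),
    ).fetchone()
    if row is None:
        return None
    runbook = dict(zip(RUNBOOK_COLUMNS, row))
    if with_steps:
        rows = conn.execute(
            "SELECT interface, step, args FROM runbook_steps "
            "WHERE runbook_id = ? ORDER BY step_order",
            (runbook["id"],),
        ).fetchall()
        runbook["steps"] = [
            {"interface": i, "step": s, "args": json.loads(a)}
            for i, s, a in rows
        ]
    return runbook


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
cur = conn.execute(
    "INSERT INTO runbooks (uuid, name) VALUES (?, ?)",
    (str(uuid.uuid4()), "CUSTOM_FIRMWARE_UPDATE"),
)
runbook_id = cur.lastrowid
# Steps are inserted out of order to show that step_order drives retrieval.
conn.executemany(
    "INSERT INTO runbook_steps "
    "(runbook_id, interface, step, args, step_order) VALUES (?, ?, ?, ?, ?)",
    [
        (runbook_id, "management", "second_step", json.dumps({}), 1),
        (runbook_id, "bios", "first_step", json.dumps({}), 0),
    ],
)
```

Listing runbooks would call ``get_runbook`` without ``with_steps``, so the second query is only issued when a runbook is about to be executed or shown in detail.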
State Machine Impact¶
While no states or state transitions are being proposed, the APIs to invoke some of those state transitions will need to change to become runbook-aware.
REST API impact¶
A new top level REST API endpoint, /v1/runbooks/, will be added, with basic CRUD support.
The existing /v1/nodes/<node>/states/provision API will be changed to accept a runbook (name or uuid) in lieu of clean_steps or service_steps when being used for manual cleaning or servicing.
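A request body for the provision state API might then be assembled as in the sketch below. The ``runbook`` field name follows this spec and ``clean`` and ``service`` are existing provision targets; the helper function itself is hypothetical, and the final body layout is an assumption until the API reference is updated.

```python
def build_provision_body(target, runbook=None, steps=None):
    """Build a provision-state request body with a runbook or explicit steps.

    Exactly one of `runbook` or `steps` must be given; combining them is
    rejected, mirroring the rule that the two cannot be mixed.
    """
    step_field = {"clean": "clean_steps", "service": "service_steps"}
    if target not in step_field:
        raise ValueError("runbooks apply only to manual cleaning or servicing")
    if (runbook is None) == (steps is None):
        raise ValueError("provide exactly one of runbook or steps")
    body = {"target": target}
    if runbook is not None:
        body["runbook"] = runbook  # name or uuid of the runbook
    else:
        body[step_field[target]] = steps
    return body


service_body = build_provision_body("service", runbook="CUSTOM_FIRMWARE_UPDATE")
clean_body = build_provision_body(
    "clean", steps=[{"interface": "deploy", "step": "erase_devices_metadata"}]
)
```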
Client (CLI) impact¶
The CLI will be updated to add support for the new API endpoints.
Some examples of CLI commands that will be added, and how they interact with RBAC:
- baremetal runbook create X [opts] # as system-scoped manager
    - owner: null
    - public: false
- baremetal runbook create X [opts] # as project-scoped manager
    - owner: projectX
    - public: false
- baremetal runbook set X --public # as system-scoped manager
    - owner: null
    - public: true
    - Note: Owner field is nulled even if it was previously set.
- baremetal runbook set X --public # as project-scoped manager
    - Forbidden! Requires system-scoped access.
- baremetal runbook unset X --public # as system-scoped manager
    - owner: null
    - public: false
- baremetal runbook set X --owner projectX # as system-scoped manager
    - owner: projectX
    - public: false
    - Note: Will return an error if ``runbook.public`` is true.
- baremetal node service N --runbook X
- baremetal node clean N --runbook X
- baremetal node service N --runbook X --service-steps {} # NOT PERMITTED
- baremetal node clean N --runbook X --clean-steps {} # NOT PERMITTED
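The owner and public transitions in the examples above can be modelled as a small sketch. The ``Runbook`` dataclass, the ``Forbidden`` exception, and both helper functions are hypothetical names used for illustration; they encode the default behavior described in this spec, not the final policy implementation.

```python
from dataclasses import dataclass
from typing import Optional


class Forbidden(Exception):
    """Raised when the caller's scope does not permit the change."""


@dataclass
class Runbook:
    name: str
    owner: Optional[str] = None
    public: bool = False


def set_public(runbook, public, system_scoped):
    """Mark a runbook public or non-public; system-scoped access required."""
    if not system_scoped:
        raise Forbidden("changing public requires system-scoped access")
    runbook.public = public
    if public:
        # Making a runbook public nulls the owner, even if previously set.
        runbook.owner = None


def set_owner(runbook, owner, system_scoped):
    """Assign an owner; rejected for public runbooks."""
    if not system_scoped:
        raise Forbidden("changing owner requires system-scoped access")
    if runbook.public:
        raise ValueError("owner cannot be set while the runbook is public")
    runbook.owner = owner


rb = Runbook(name="X", owner="projectX")
set_public(rb, True, system_scoped=True)      # owner becomes None
state_after_public = (rb.owner, rb.public)
set_public(rb, False, system_scoped=True)     # owner stays None
set_owner(rb, "projectX", system_scoped=True)
state_after_owner = (rb.owner, rb.public)
```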
RPC API impact¶
The RPC API will be modified to support runbooks in lieu of steps where necessary. These changes will be properly versioned to ensure a smooth upgrade.
Driver API impact¶
None
Nova driver impact¶
None
Ramdisk impact¶
None
Security impact¶
Operators are warned that even with use of this feature, users may be able to leverage steps or access which are innocuous on their own, but malicious when combined.
Deployers should ensure they have reviewed all possible threat models when granting additional access to less-trusted individuals – including restricting unsafe node actions, such as replacing deploy_ramdisk, to ensure runbooks (and other step-based workflows) operate as expected.
Things for the implementer to avoid to ensure secure implementation:
- Do not permit a project-scoped API user to change runbooks.public by default.
- Do not permit a project-scoped API user to change runbooks.owner by default.
- Anything that would implicitly mark a runbook as non-public.
Ensure we check whether nodes are able to run a given runbook using node traits, similar to how we do so with deploy templates.
RBAC Impact¶
There are two primary ways this feature interacts with RBAC, beyond the obvious CRUD for runbooks.
First, the runbooks.owner and runbooks.public fields are relevant for determining if a runbook is scoped to a project or to a system. If owner is non-null and public is false, the runbook is scoped to the project set in that field and is only usable on nodes owned or leased by that project. If owner is null and public is false, the runbook can only be used or accessed by system-scoped users. If owner is null and public is true, a system-scoped member can modify the runbook and a project-scoped member can use it on a compatible node. Additionally, the owner field will only be settable when public is false or being set to false, and setting public to true will null the owner field.
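The scoping rules above can be expressed as a small decision function. This is a sketch under assumptions: the function names and arguments are hypothetical, it covers only project-scoped members, and the separate trait-based compatibility check is left out.

```python
def runbook_scope(owner, public):
    """Classify a runbook as 'public', 'project', or 'system' scoped."""
    if public:
        return "public"
    return "project" if owner is not None else "system"


def project_member_can_use(runbook_owner, runbook_public, project_id,
                           node_owner, node_lessee):
    """Decide whether a project-scoped member may run a runbook on a node.

    The member must reach the node as its owner or lessee, and the runbook
    must be public or owned by the member's project.
    """
    if project_id is None or project_id not in (node_owner, node_lessee):
        return False
    scope = runbook_scope(runbook_owner, runbook_public)
    if scope == "public":
        return True
    if scope == "project":
        return runbook_owner == project_id
    return False  # system-scoped runbooks are not usable by project members


own_runbook = project_member_can_use("projA", False, "projA", "projA", None)
public_runbook = project_member_can_use(None, True, "projA", None, "projA")
system_runbook = project_member_can_use(None, False, "projA", "projA", None)
other_project = project_member_can_use("projB", False, "projA", "projA", None)
```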
Second, the node change provision state [0] API will have a runbook field added, and policy will be different for cases where runbook is specified instead of clean_steps. Default policy will be to permit manual cleaning and servicing for a node owner or lessee-scoped member when using a runbook, but to disallow it when specifying clean_steps. Combining clean_steps and runbook will not be permitted.
Expected access after this implementation is complete:
System
- Admin
- Manager
- Member
--> Can CRUD system-scoped runbooks (runbook.owner=null)
--> Can CRUD project-scoped runbooks (runbook.owner=PROJECT)
--> Can unset runbook.owner, changing a runbook to system-scope
--> Can mark system-scoped runbooks as public (runbook.public=True)
- Reader
--> Can list all runbooks
Project
- Admin
- Manager
--> Can CRUD project-scoped runbooks (runbook.owner=PROJECT)
--> Cannot set a runbook to public (runbook.public=True).
- Member
--> Can execute public runbooks or runbooks owned by their project.
- Reader
--> Can list public runbooks and runbooks owned by their project.
Other end user impact¶
None
Scalability impact¶
None
Performance Impact¶
None
Other deployer impact¶
None
Developer impact¶
None
Implementation¶
Assignee(s)¶
- Primary assignee:
JayF <jay@jvf.cc>
- Other contributors:
TheJulia <juliaashleykreger@gmail.com>
Work Items¶
Create Runbooks
- Object layer
- DB layer
- API layer
Add policy checking tests for /v1/runbooks
Ensure tempest API tests exist for new API endpoints
Update API-Ref
Update Manual Cleaning and Node Servicing documentation
Dependencies¶
All dependencies have been resolved.
Testing¶
Unit tests will be added to test the new functionality. Integration tests will be added to test the new API endpoints and CLI commands.
Upgrades and Backwards Compatibility¶
The changes are backwards compatible. Existing API endpoints will continue to function as before, and we will gate all API changes behind microversion checks.
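A microversion gate of this kind could look like the sketch below. The minimum version shown is a placeholder rather than the microversion assigned to this feature, the helper names are hypothetical, and special values such as ``latest`` are deliberately not handled.

```python
# Placeholder only: NOT the real microversion for runbook support.
RUNBOOK_MIN_VERSION = (1, 99)


def parse_microversion(header_value):
    """Parse a 'major.minor' value from X-OpenStack-Ironic-API-Version."""
    major, minor = header_value.split(".")
    return int(major), int(minor)


def allow_runbooks(requested):
    """Compare as integer tuples so that 1.9 sorts below 1.99."""
    return parse_microversion(requested) >= RUNBOOK_MIN_VERSION


def validate_provision_body(body, requested):
    """Reject the runbook field for clients on an older microversion."""
    if "runbook" in body and not allow_runbooks(requested):
        raise ValueError("runbook is not supported at this API version")
    return body


accepted = validate_provision_body({"target": "service", "runbook": "X"}, "1.99")
legacy = validate_provision_body({"target": "clean", "clean_steps": []}, "1.9")
```

Requests that do not mention ``runbook`` pass through unchanged at any version, which is what keeps existing clients working.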
Documentation Impact¶
The new functionality will need to be documented. This includes documentation for the new API endpoints and CLI commands, as well as documenting security caveats detailed above.