Add notifications about resources CRUD and node states¶
https://bugs.launchpad.net/ironic/+bug/1606520
This spec proposes addition of new notifications to ironic: CRUD (create, update, or delete) of resources and node state changes for provision state, maintenance and console state.
Problem description¶
Resource indexation services like Searchlight [1] require notifications about creation, update or deletion of a resource. Currently CRUD notifications are not implemented in ironic. Creating an efficient plugin for Searchlight is impossible without these notifications. Ironic node notifications for provision state, maintenance and console state also could be used by Searchlight plugin in order to keep Searchlight’s index of ironic resources up-to-date.
Apart from searchlight, there is a use case of monitoring service, that caches all notification payloads along with event type, like start/end/error/<etc> and an operator can query this service to see if ironic is behaving properly. For example, if there are much more start notifications for node create, than there are end notifications, it may mean that the database is not behaving properly, or messaging is having a hard time delivering messages between API and conductor. That is a separate case from searchlight: searchlight for example does not need to know the payload of the node create start notification, as there is no actual node yet, but for monitoring purposes, it may be useful.
Proposed change¶
As a general note for all CRUD notifications, *.start
and *.error
event
payloads will be ignored by Searchlight, as in both cases it would mean that
resource representation has not changed, or in case of *create*
notifications, that the resource was not created.
Node CRUD notifications¶
The following event types will be added:
“baremetal.node.create.start”;
“baremetal.node.create.end”;
“baremetal.node.create.error”;
“baremetal.node.update.start”;
“baremetal.node.update.end”;
“baremetal.node.update.error”;
“baremetal.node.delete.start”;
“baremetal.node.delete.end”;
“baremetal.node.delete.error”.
Priority level - INFO or ERROR (for “error” status). Payload contains all
fields from base NodePayload
with additional fields: chassis_uuid
,
instance_info
, driver_info
. Secrets in the node fields will be masked.
raid_config
and target_raid_config
fields are excluded because they can
contain low-level disk and vendor information. If/when there is a use case for
them, they can be added in the future. All these notifications will be
implemented at the API level.
Port CRUD notifications¶
The following event types will be added:
“baremetal.port.create.start”;
“baremetal.port.create.end”;
“baremetal.port.create.error”;
“baremetal.port.update.start”;
“baremetal.port.update.end”;
“baremetal.port.update.error”;
“baremetal.port.delete.start”;
“baremetal.port.delete.end”;
“baremetal.port.delete.error”.
Priority level - INFO or ERROR (for “error” status).
Payload contains these fields: uuid
, node_uuid
, address
, extra
,
local_link_connection
, pxe_enabled
, created_at
, updated_at
.
These notifications will be implemented at the API level. In addition,
“baremetal.port.create.*” will be emitted by the ironic-conductor service
when driver creates a port (examples are [2] and [3]).
Chassis CRUD notifications¶
The following event types will be added:
“baremetal.chassis.create.start”;
“baremetal.chassis.create.end”;
“baremetal.chassis.create.error”;
“baremetal.chassis.update.start”;
“baremetal.chassis.update.end”;
“baremetal.chassis.update.error”;
“baremetal.chassis.delete.start”.
“baremetal.chassis.delete.end”.
“baremetal.chassis.delete.error”;
Priority level - INFO or ERROR (for “error” status).
Payload contains these fields: uuid
, extra
, description
,
created_at
, updated_at
. All these notifications will be implemented at
the API level.
Node provision state notifications¶
Will be implemented via TaskManager methods (and emitted by the ironic-conductor service).
Types of events for node provision state:
“baremetal.node.provision_set.start”;
“baremetal.node.provision_set.end”;
“baremetal.node.provision_set.error”;
“baremetal.node.provision_set.success”.
Types of state changing in ironic and corresponding events:
Start transition, spawning a working thread: “start” notification with INFO level.
End transition, cleaning
target_provision_state
: “end” notification with INFO level.Error events processing: “error” notification with ERROR level.
Change
provision_state
without starting a worker that is not “end” or “error”: “success” notification with INFO level. Examples are DEPLOYING <-> DEPLOYWAIT, AVAILABLE -> MANAGEABLE.
Payload contains all fields from base NodePayload
with additional fields:
instance_info
, previous_provision_state
,
previous_target_provision_state
, event
(FSM event that triggered the
state change).
To efficiently use the provision state notifications all related node changes
(like setting of last_error
, maintenance
) should be done before event
processing.
Node maintenance notifications¶
The following event types will be added:
“baremetal.node.maintenance_set.start”;
“baremetal.node.maintenance_set.end”;
“baremetal.node.maintenance_set.error”.
Priority level - INFO or ERROR (for “error” status). Payload contains all
fields from base NodePayload
. All these notifications will be implemented
at the API level and reflect maintenance changes to a node due to a user
request. There won’t be any explicit node maintenance notifications for
maintenance changes done internally by ironic. Since these internal changes
occur as a result of trying to change the node’s state (e.g. provision, power),
one of the other notifications that is emitted will “cover” these internal
maintenance changes.
Node console notifications¶
The following event types will be added:
“baremetal.node.console_set.start”;
“baremetal.node.console_set.end”;
“baremetal.node.console_set.error”;
“baremetal.node.console_restore.start”;
“baremetal.node.console_restore.end”;
“baremetal.node.console_restore.error”.
console_set
action is used when start or stop console is initiated via API
request, console_restore
action is used when console_enabled
flag is
already enabled in the DB for node and console restart via driver is required
(due to dead or restarted ironic-conductor process). Priority level - INFO or
ERROR (for “error” status). Payload contains all fields from base
NodePayload
. All these notifications will be implemented in the
ironic-conductor, because setting of a node’s console is an asynchronous
request, so ironic-conductor can easily emit notifications for the start/end of
the change.
Alternatives¶
Periodically polling ironic resources via API.
Data model impact¶
None
State Machine Impact¶
None
REST API impact¶
None
Client (CLI) impact¶
None
RPC API impact¶
None
Driver API impact¶
None
Nova driver impact¶
None
Ramdisk impact¶
None
Security impact¶
None
Other end user impact¶
None
Scalability impact¶
If notifications are enabled, they can create high load on the message bus during node deployments on large environments.
Performance Impact¶
None
Other deployer impact¶
Deployers should set already existing notification_level
config options
properly.
Developer impact¶
If developer creates resources in the driver, proper notification should be emitted.
For provision state change all related node updates should be done before event processing.
Implementation¶
Assignee(s)¶
- Primary assignee:
yuriyz
- Other contributors:
vdrok
mariojv
Work Items¶
Implement node provision state change notifications.
Implement CRUD notifications and node maintenance notifications.
Implement console notifications.
Add notifications to the current ironic code that creates resources in the drivers.
Fix ironic code with node updates after event processing.
Dependencies¶
Patch with base NodePayload
[4].
Testing¶
Unit tests will be added.
Upgrades and Backwards Compatibility¶
None
Documentation Impact¶
New notifications feature will be documented.