Coordination of Notification Agents¶
Currently, the notification agent can function in active/active HA where each notification agent grabs notifications from the message queue in a round robin type rotation (it checks for message(s), grabs message(s)). The issue that arises is that if we want to implement a pipeline to process events, we cannot guarantee what event each agent worker will get and because of that, we cannot enable transformers which aggregate/collate some relationship across similar events.
This also fixes an existing bug where if multiple notification agents are enabled, the pipeline transformers may not work as expected because it will only act on the samples it sees on current worker.
Similar to how we implemented coordination between the central agents, this bp is to proposed a way to coordinate which event gets passed to which notification agent.
The proposed solution is to implement additional processing steps as follows:
1. Existing listener in notification agent continues as is, each agent will listen to the same queue and grab messages as they arrive without any coordination. 2. After the agent has converted messages into samples and events, the result will be republished onto a ceilometer internal queue. The sample and event will be published to (multiple) topics corresponding to known pipeline sinks. For example, the pipeline has two sinks, one that processes all meters, the other only processes compute meters. In this case, all agents will publish to a queue for 'all-meters' sink, and agents with compute meters will publish to 'compute-only' sink. 3. An additional listener will be added to each agent which will listen to new internal queues. This is where the coordination will happen. Each agent will listen to a set of targets created in step 2.
The existing notification agent will exists as is as there isn’t a need for additional queueing to handle single worker scenario.
The notification agent can be coordinated across existing targets ie. each existing projects exchange. This does not allow for cross exchange sinks. ie. we cannot have an aggregation that combines events from different projects. This might be a non-issue for samples (who needs to aggregate and transform a new sample from samples from nova and neutron?). It will probably be an issue for events which we may want to coordinate across exchanges. This would mean we should have samples listener to coordinate across exchanges, and events listener which would do the above.
The notification agent can be coordinated across endpoints. This will cause issues as we need to have duplicate queues that each agent listens to (to ensure agent with endpoint gets the message it needs). It also runs into potential loss or duplicate messages because of duplicate queues.
Data model impact¶
REST API impact¶
Same as existing security concerns polling agents face – we need to ensure that phantom agents can’t register and create phantom tasks.
It should actually work properly when multiple workers deployed. Additional event pipeline will be added in a subsequent bp and patch.
Other end user impact¶
This should improve write performance (at least from SQL pov) as we can do batch inserts.
Other deployer impact¶
Users will need to enable coordination. By default, the notification agent is expected to continue as currently ie. pull messages if they exists.
- Primary assignee:
- Ongoing maintainer:
add coordination of notification agent
reuse or add tests for coordination
First step to adding in event specific pipeline.
one of the backends for tooz (ZooKeeper, memcached, redis, etc…)
coordination tests exist. We may just need to add it for notification agent if the existing agent coordination tests are central agent specific.
add notes on enabling coordination of notification agents.