Apache Kafka is a messaging broker that makes it possible to establish a log-centric communication bus in OpenStack.
Some OpenStack services, such as Ceilometer and Monasca, already integrate with Apache Kafka, a distributed messaging system, to publish and subscribe to log and metrics data. Because Kafka is a well-designed, log-centric messaging broker, more Kafka-related implementations are expected to appear in existing or future OpenStack projects. This blueprint aims to implement a Kafka driver in oslo_messaging so that projects do not waste time re-implementing the same functionality. Because Kafka would have multiple clients from separate projects, using the same client to communicate with Kafka should also reduce implementation-dependent bugs and errors.
To reduce the cost of independently implementing Kafka communication functionality, this blueprint proposes adding a Kafka driver to oslo_messaging. The Kafka driver lets us publish and subscribe to messages in the same way as the other oslo_messaging drivers, but it will not support the RPC methods, since Kafka is not expected to be used as an RPC broker. If a need for RPC over Kafka arises in the future, it will be handled as a new proposal.
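The split between supported notifications and unsupported RPC can be sketched as follows. This is only an illustrative, in-memory stand-in: the class name `NotificationOnlyDriver` and its method names are hypothetical and are not the oslo_messaging driver interface, and no real Kafka client is involved.

```python
class NotificationOnlyDriver:
    """Hypothetical driver sketch: notifications work, RPC is rejected."""

    def __init__(self):
        # topic name -> list of published messages (stands in for a broker)
        self._topics = {}

    def send_notification(self, topic, message):
        """Publish a notification message to a topic."""
        self._topics.setdefault(topic, []).append(message)

    def listen_for_notifications(self, topic):
        """Return a copy of all messages published to a topic so far."""
        return list(self._topics.get(topic, []))

    def send(self, target, message, wait_for_reply=False):
        """RPC entry point: deliberately unsupported in this sketch."""
        raise NotImplementedError("RPC is not supported by this driver")


driver = NotificationOnlyDriver()
driver.send_notification("metrics", {"cpu": 0.5})
received = driver.listen_for_notifications("metrics")
```

Rejecting RPC calls explicitly, rather than silently ignoring them, makes the limitation visible to a service that is configured with the wrong driver.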
There are several messaging brokers that oslo_messaging does not currently support, for example NATS and NSQ. However, Kafka is already used by more than one OpenStack project, and because Kafka was developed to handle high-volume logs, it has several features that suit logging data well. For example, Kafka retains messages for a configured period of time, so a consumer that loses messages after subscribing can replay them. Additional Kafka brokers and consumers can also be added easily, because Kafka's horizontal scaling is coordinated through ZooKeeper. Moreover, Kafka can send a batch of messages together, which contributes to the performance and flexibility of messaging.
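The retention, replay, and batching behavior described above can be illustrated with a small in-memory model. This is a simplified sketch, not Kafka's implementation: the `RetainedLog` class and its methods are hypothetical, and time-based expiry of old records is left out for brevity.

```python
class RetainedLog:
    """Hypothetical append-only log: reading does not delete records."""

    def __init__(self):
        self._records = []

    def append(self, message):
        """Append one message and return its offset."""
        self._records.append(message)
        return len(self._records) - 1

    def append_batch(self, messages):
        """Append several messages at once; return the first new offset."""
        first_offset = len(self._records)
        self._records.extend(messages)
        return first_offset

    def read(self, offset, max_records=100):
        """Return up to max_records messages starting at the given offset."""
        return self._records[offset:offset + max_records]


log = RetainedLog()
log.append("a")
batch_start = log.append_batch(["b", "c"])

first_pass = log.read(0)          # consumer reads everything once
replayed = log.read(batch_start)  # consumer rewinds and re-reads the batch
```

Because the consumer, not the log, tracks the offset, replaying a lost message only requires reading again from an earlier offset.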
Security support in Kafka is still under implementation. Client authentication and encrypted connections are expected in future releases. More information can be found in the references.
There is no impact unless Kafka is selected for use. The overall performance will depend on the server on which the broker is set up. The Kafka paper listed in the references reports that producers can publish about 50,000 messages/sec when the message size is 200 bytes and messages are sent one by one. However, this number is affected by parameters such as message size, the number of replicas, batch size (Kafka can send multiple messages together), and the environment. By rough comparison, RabbitMQ can publish about 10,000 messages/sec, so Kafka is unlikely to be the performance bottleneck.
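As a rough back-of-the-envelope check, the figures quoted above can be turned into payload throughput. The function name below is hypothetical, megabytes are taken as decimal (1 MB = 1,000,000 bytes), and protocol overhead and replication are ignored.

```python
def payload_throughput(messages_per_sec, message_size_bytes):
    """Payload bytes per second, ignoring protocol overhead."""
    return messages_per_sec * message_size_bytes


# Figures quoted in the text: 50,000 messages/sec at 200 bytes each.
kafka_bytes_per_sec = payload_throughput(50_000, 200)
kafka_mb_per_sec = kafka_bytes_per_sec / 1_000_000

# Message-rate ratio against the rough RabbitMQ figure of 10,000 messages/sec.
kafka_to_rabbitmq_ratio = 50_000 / 10_000
```

Under these assumptions the quoted Kafka rate corresponds to about 10 MB/sec of payload and roughly five times the quoted RabbitMQ message rate.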
Ceilometer and Monasca currently use Kafka. The Ceilometer project has a Kafka publisher for metrics, and the Monasca project uses Kafka as part of its architecture. For more detail, see the References section.
The OpenStack developer documentation for the oslo_messaging library will describe the driver's behavior and how to select the Kafka driver plugin.
Apache Kafka Project
Kafka: a Distributed Messaging System for Log Processing
KAFKA 0.8 PRODUCER PERFORMANCE
Performance of Kafka
RabbitMQ Performance Benchmarks
Comparison of Messaging Queues
This work is licensed under a Creative Commons Attribution 3.0 Unported License. http://creativecommons.org/licenses/by/3.0/legalcode