Add HBase on Vanilla cluster

Apache HBase provides large-scale tabular storage for Hadoop using the Hadoop Distributed File System (HDFS). This document describes how to add support for the HBase and ZooKeeper services to a Vanilla cluster.

Problem description

The Sahara vanilla plugin allows users to quickly provision a cluster with many core services, but it doesn’t support HBase and ZooKeeper.

Proposed change

To match the distributed architecture of the Vanilla cluster, we only support fully-distributed HBase deployment. In a distributed configuration, the cluster contains multiple nodes, each of which runs one or more HBase daemons. These include an HBase Master instance, multiple ZooKeeper nodes and multiple RegionServer nodes.

A distributed HBase installation depends on a running ZooKeeper cluster. By default, HBase manages a ZooKeeper “cluster” for you, but you can also manage the ZooKeeper ensemble independently of HBase. The variable “HBASE_MANAGES_ZK” in “conf/”, which defaults to true, tells HBase whether to start/stop the ZooKeeper ensemble servers as part of HBase.

We should expose this variable in “cluster_configs” to let the user determine which component creates the ZooKeeper service.
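As a rough illustration of exposing the variable, the sketch below reads the flag from a “cluster_configs”-style dictionary and renders the shell line that sets it. The “HBase” section name, the dictionary layout and the helper name hbase_env_line are assumptions made for this example, not the plugin's actual schema.

```python
# Hypothetical cluster_configs payload; the "HBase" section name and the
# nesting are assumptions for illustration only.
cluster_configs = {
    "HBase": {
        "HBASE_MANAGES_ZK": False,
    },
}


def hbase_env_line(configs):
    """Render the export line for HBASE_MANAGES_ZK.

    Falls back to True, the default stated in this spec, when the
    user did not set the option.
    """
    manages_zk = configs.get("HBase", {}).get("HBASE_MANAGES_ZK", True)
    return "export HBASE_MANAGES_ZK=%s" % ("true" if manages_zk else "false")


print(hbase_env_line(cluster_configs))
```

With the option unset, the helper falls back to the default of true, so HBase keeps managing ZooKeeper unless the user opts out.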

In production, it is recommended to run a ZooKeeper ensemble of 3, 5 or 7 machines; the more members an ensemble has, the more tolerant the ensemble is of host failures. Also, run an odd number of machines. An even number of peers is supported, but it is normally not used because an even-sized ensemble requires, proportionally, more peers to form a quorum than an odd-sized ensemble requires.
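The quorum argument can be checked with a little arithmetic. The sketch below assumes a simple majority quorum (more than half of the peers); the function names are illustrative only.

```python
def quorum_size(ensemble_size):
    """Smallest number of peers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Number of peers that can fail while a quorum is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for size in (3, 4, 5, 6, 7):
    print("ensemble=%d quorum=%d tolerated_failures=%d"
          % (size, quorum_size(size), tolerated_failures(size)))
```

An ensemble of 4 tolerates one failure, the same as an ensemble of 3, and an ensemble of 6 tolerates two, the same as an ensemble of 5. The extra peer in an even-sized ensemble raises the quorum without improving fault tolerance.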

  • If we set “HBASE_MANAGES_ZK” to false, Sahara will validate the number of ZooKeeper services in the node groups to keep the number of ZK instances odd.
  • If we set “HBASE_MANAGES_ZK” to true, Sahara will automatically determine the instances on which to start ZooKeeper. The cluster will then contain more than 1 and fewer than 5 ZK nodes. If we want more ZK nodes, setting “HBASE_MANAGES_ZK” to false would be a good choice.
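The validation rule for the user-managed case might look like the following minimal sketch. The exception class and function name are hypothetical and do not reflect existing Sahara code; only the odd-count check comes from this spec.

```python
class InvalidZooKeeperCount(ValueError):
    """Hypothetical error raised when the ZK instance count is rejected."""


def validate_zk_count(zk_count, hbase_manages_zk):
    """Check the number of ZooKeeper instances requested in node groups.

    When HBase manages ZooKeeper, placement is chosen automatically, so
    the user-supplied count is not checked here.
    """
    if hbase_manages_zk:
        return
    if zk_count < 1:
        raise InvalidZooKeeperCount("at least one ZooKeeper instance is required")
    if zk_count % 2 == 0:
        raise InvalidZooKeeperCount(
            "number of ZooKeeper instances must be odd, got %d" % zk_count)


validate_zk_count(3, hbase_manages_zk=False)  # passes silently
```

The same check could be reused after a scaling operation, since the number of remaining ZooKeeper nodes is also expected to stay odd.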

If we scale the cluster up or down, the ZooKeeper and HBase services will be restarted. After scaling up or down, the number of remaining ZooKeeper nodes should also be kept odd. If there is only one ZooKeeper node, the status of the ZooKeeper service will be “standalone”.

One thing that should be specified is the default values used in the configuration:

ZooKeeper Configuration in “/opt/zookeeper/conf/zoo.cfg”:


HBase Configuration in “/opt/hbase/conf/hbase-site.xml”:


The security group will open ports (2181, 2888, 3888, 16000, 16010, 16020) after this change if the configuration is not changed.
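For reference, the listed ports correspond to the usual ZooKeeper and HBase defaults. The sketch below maps each port to its role and shows how overrides could change the set to open; the dictionary keys and the helper ports_to_open are illustrative names, not plugin code.

```python
# Default ports and their usual roles. The key names are illustrative.
DEFAULT_PORTS = {
    "zookeeper_client": 2181,           # client connections
    "zookeeper_peer": 2888,             # follower-to-leader traffic
    "zookeeper_leader_election": 3888,  # leader election
    "hbase_master": 16000,              # HBase Master RPC
    "hbase_master_info": 16010,         # HBase Master web UI
    "hbase_regionserver": 16020,        # RegionServer RPC
}


def ports_to_open(overrides=None):
    """Return the sorted ports to open, applying any user overrides."""
    ports = dict(DEFAULT_PORTS)
    ports.update(overrides or {})
    return sorted(set(ports.values()))


print(ports_to_open())
```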


Data model impact


REST API impact


Other end user impact


Deployer impact


Developer impact


Sahara-image-elements impact

  • Build a new Vanilla image that includes the ZK and HBase packages

Sahara-dashboard / Horizon impact

  • An option should be added to the Node Group create and update forms.



Primary assignee:
Shu Yingya

Work Items

  • Build a new image with sahara-image-elements
  • Add ZooKeeper to Vanilla in sahara
  • Add HBase to Vanilla in sahara
  • Update sahara-dashboard to let the user choose the ZK creator




  • Unit test coverage in sahara

Documentation Impact

  • Vanilla plugin description should be updated



Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License.