Plugin for CDH with Cloudera Manager¶
https://blueprints.launchpad.net/sahara/+spec/cdh-plugin
This specification proposes adding a CDH plugin to Sahara that provisions the Cloudera Distribution of Hadoop together with Cloudera Manager.
Problem description¶
Cloudera provides CDH (Cloudera Distribution Including Apache Hadoop), an open-source Apache Hadoop distribution. CDH contains the core elements of Hadoop that provide reliable, scalable distributed processing of large data sets (chiefly MapReduce and HDFS), as well as other enterprise-oriented components that provide security, high availability, and integration with hardware and other software. Cloudera describes Cloudera Manager as the industry's first end-to-end management application for Apache Hadoop. It provides features for monitoring the health and performance of cluster components (hosts, service daemons) as well as the performance and resource demands of the user jobs running on the cluster. [1]
Proposed change¶
The CDH plugin implementation will support Cloudera Manager version 5 and CDH version 5.
The plugin will support the following key Sahara features:
- Cinder integration
- Cluster scaling
- EDP
- Cluster topology validation
- Integration with Swift
- Data locality
The plugin will be able to install the following services:
- Cloudera Manager
- HDFS
- YARN
- Oozie
The CDH plugin will support the following operating systems: Ubuntu 12.04 and CentOS 6.5. The plugin will also support provisioning from mirrors that host the CDH and Cloudera Manager (CM) packages.
CDH does not support the Hadoop Swift library by default, so integration with Swift should be added to the CDH plugin. The Hadoop Swift library is available from the CDH Maven repository. [2]
The CDH plugin will support the following processes:
- MANAGER - Cloudera Manager, master process
- NAMENODE - HDFS NameNode, master process
- SECONDARYNAMENODE - HDFS SecondaryNameNode, master process
- RESOURCEMANAGER - YARN ResourceManager, master process
- JOBHISTORY - YARN JobHistoryServer, master process
- OOZIE - Oozie server, master process
- DATANODE - HDFS DataNode, worker process
- NODEMANAGER - YARN NodeManager, worker process
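The process list above, combined with the cluster topology validation feature, could be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the node group layout (dicts with `count` and `processes` keys), the function names, and the specific constraints (exactly one MANAGER and one NAMENODE, at least one DATANODE) are assumptions made for the example; the spec itself does not define the validation rules.

```python
from collections import Counter

# Process names and roles exactly as listed in this spec.
PROCESS_ROLES = {
    "MANAGER": "master",
    "NAMENODE": "master",
    "SECONDARYNAMENODE": "master",
    "RESOURCEMANAGER": "master",
    "JOBHISTORY": "master",
    "OOZIE": "master",
    "DATANODE": "worker",
    "NODEMANAGER": "worker",
}

# ASSUMPTION: illustrative constraints only; the spec does not state them.
REQUIRED_EXACTLY_ONE = ("MANAGER", "NAMENODE")


def count_processes(node_groups):
    """Count process instances across all node groups.

    Each node group is a dict with the hypothetical keys 'count'
    (number of instances) and 'processes' (list of process names).
    """
    totals = Counter()
    for group in node_groups:
        for process in group["processes"]:
            totals[process] += group["count"]
    return totals


def validate_topology(node_groups):
    """Return a list of problems; an empty list means the layout passed."""
    problems = []
    totals = count_processes(node_groups)

    # Reject process names that are not in the supported list.
    for name in sorted(totals):
        if name not in PROCESS_ROLES:
            problems.append("unknown process: %s" % name)

    # Assumed rule: these master processes must appear exactly once.
    for name in REQUIRED_EXACTLY_ONE:
        if totals[name] != 1:
            problems.append(
                "expected exactly 1 %s, found %d" % (name, totals[name]))

    # Assumed rule: HDFS needs at least one worker to store data.
    if totals["DATANODE"] < 1:
        problems.append("at least one DATANODE is required")

    return problems
```

A layout with one master node group (MANAGER, NAMENODE, RESOURCEMANAGER) and a three-instance worker group (DATANODE, NODEMANAGER) would pass these assumed checks, while a layout with two NAMENODE instances would be reported as a problem.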
Alternatives¶
None
Data model impact¶
None
REST API impact¶
None
Other end user impact¶
None
Deployer impact¶
None
Developer impact¶
None
Sahara-image-elements impact¶
The CDH plugin must support both vanilla images and images with Cloudera packages pre-installed. To build images with pre-installed Cloudera packages, use the CDH-specific elements.
Sahara-dashboard / Horizon impact¶
None
Implementation¶
Assignee(s)¶
- Primary assignee:
sreshetniak
- Other contributors:
iberezovskiy
Work Items¶
- Add the plugin implementation
- Add jobs to Sahara-ci
- Add integration tests
- Add elements to Sahara-image-elements for building images with pre-installed Cloudera packages
Dependencies¶
The plugin needs the cm_api Python library, version 6.0.2, which is not present in the OpenStack requirements. [3] cm_api therefore needs to be added to the OpenStack requirements. [4]
Testing¶
- Add unit tests to Sahara to cover the basic functionality of the plugin
- Add integration tests to Sahara
Documentation Impact¶
CDH plugin documentation should be added to the plugin section of Sahara docs.