There is no way to enable kerberos integration for clusters deployed by Sahara. By the way, default managament services like Cloudera Manager and Ambari Management console already have support of configuring Kerberos for deployed clusters. We should make initial integration of this feature in sahara for Cloudera and Ambari plugins.
Users wants to use Kerberos to make clusters created by Sahara more secure.
The proposed changes will be the following:
- the initial MIT of KDC integration on cluster nodes. Also admin principal will be created for sahara usages (and sahara will store that using improved secret storage);
- ability to configure Kerberos on clusters created by using Cloudera Manager API (see the reference below , , ) and Ambari (see the reference below ) will be added;
- remote operations which are requires auth should be wrapped by ticket granting method so that sahara will be able to perform operations with hdfs and something like that;
- Oozie client should be re-implemented to reflect these changes. By default, in case when cluster is not deployed with Kerberos (like vanilla) we will continue using requests python library to bind Oozie API (without needed auth). The new Oozie client should be implemented in case of kerberos implemented. This client will use standard remote implementation and curl in order to process auth to Oozie with ticket granting. As another way, we can use request-kerberos for that goal, but it’s not good solution since this lib are not python3 compatible.
New config options will be added in general section of cluster template (or cluster):
- Enable Kerberos: will enable the kerberos security for cluster (Ambari or CDH);
As additional enhancement support of using existing KDC server can be added. In such case, additional options are required for that:
- Use existing KDC: will enable using existing KDC server, additional data should be provided then;
- Admin principal: admin principal to have ability to create new principals;
- Admin password: will be hidden from API outputs and will be stored in improved secret storage;
- KDC server hostname: hostname of KDC server
If something additional will be needed to indentify KDC server it also will be added to general section of configs.
Other possible improvements can be done after implementation of steps above. By the way, initial implementation will only include steps above.
None. If needed, extra field will be used for that goal.
All packages required for KDC infrastucture should be included on our images. If something is not presented, it will be additionally installed.
Nothing additional is required on Sahara dashboard side, since all needed options are in general section and will be prepared as usual
All working items are will described in Proposed change. Additionally, we can include testing initial Kerberos integration on Sahara CI, but only when all steps are completed. For sure, unit tests will be added.
Depends on OpenStack requirements. As initial idea, nothing additional should not be added to current sahara requirements, all needed packages will be included only on sahara images.
Sahara CI will cover that change, unit tests for sure will be added.
New sections can be added with description of Kerberos integration in Ambari and Cloudera.
 https://github.com/cloudera/cm_api/blob/f4431606a690d95208457a64d1cc2610d9cfa2bf/python/src/cm_api/endpoints/cms.py#L134  https://github.com/cloudera/cm_api/blob/f4431606a690d95208457a64d1cc2610d9cfa2bf/python/src/cm_api/endpoints/clusters.py#L585  http://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_sg_using_cm_sec_config.html  https://cwiki.apache.org/confluence/display/AMBARI/Automated+Kerberizaton#AutomatedKerberizaton-TheRESTAPI