Our deployment process is very complicated. Our manifests use a large number of Puppet modules, and the dependencies between these modules are also complex and abundant. This leads to the following consequences:

- It becomes very difficult to add new features. Even changes that look minor at first glance can easily break seemingly unrelated functionality. Predicting how any change will affect the dependencies and ordering of the deployment process is hard and error-prone.
- Debugging is also affected. Localizing bugs can be very troublesome because the manifests are complex and hard to understand, and debugging tools are almost non-existent.
- Reproducing bugs and testing take a lot of time because we have no easy and reliable way to repeat only part of the deployment. The only thing we can do is start the process again and wait several hours for any results. Snapshots are not very helpful because the deployment cannot be reliably stopped with its state saved; doing so will most likely break the deployment or at least change its outcome.
- New team members or outside developers who want to add new functionality to our project have to spend many days just to gain a basic understanding of how our deployment works, and will most likely make a lot of hard-to-debug mistakes.
- Using our product is also not as easy as we would like it to be for customers and other people in the community. People usually cannot easily understand how the deployment works and have to follow every step in the documentation blindly, which leaves them unable to act reasonably if something goes wrong.
If we want to address any of these issues, we should find a way to make our architecture less complex and more manageable. It is well known that the best way to understand any large and monolithic structure is to take it apart, learn how each of the pieces works, and then learn how they interact with each other.
So we should try to separate the whole deployment process into many small parts, each doing only one or a few closely related tasks. Each of these parts would be easy for a single developer to understand. Testing and debugging could also be done separately, so localizing and fixing bugs would be much easier than it is now.
Thinking about the deployment process as a list of atomic tasks will make our reference architectures and server roles much more dynamic. By changing which tasks you perform and in what order, you can create as many custom sets of roles as you need without any modification of the tasks themselves.
Each task can have some internal dependencies, but most likely there will not be too many of them, which makes manual analysis of the dependency graph feasible within a single task. A task can also have requirements: the system must be in a specific state before the task can be started.
The introduction of Granular Deployment will be a rather extensive change to almost all components of the Fuel project and a serious architectural modification.
Several task types will be introduced in addition to the basic deployment types (puppet, shell, rsync, upload_file). These new types, group and stage, serve to build a flexible graph of tasks.
Types of tasks:

- type: group - a grouping of tasks based on nailgun role entities
- type: stage - a skeleton of the deployment graph in Fuel; right now there are 3 stages: pre_deployment, deployment, post_deployment

Deployment tasks:

- type: puppet - executes puppet manifests with puppet apply
- type: shell - executes any shell command, like python script.py or /script.sh
- type: upload_file - used for uploading configuration to target nodes and for repo creation
- type: rsync - syncs files (for example, puppet modules) to target nodes
Ideally, all dependencies between tasks should be described with the requires and required_for attributes. This will allow us to build the graph of tasks in nailgun and then serialize it into a format acceptable to the orchestrator (workbooks for mistral, or astute-specific roles with priorities). For example:
id: controller
type: group
role: [<roles>]
requires: [<groups>]
required_for: [<stages>, <roles>]
tasks: [<tasks>]
parameters:
  strategy:
    type: parallel
    amount: 8
id: deploy
type: stage
requires: [<stages>]
Right now we are using a hardcoded set of stages, but it is entirely possible to make this flexible and define stages via the API.
id: deploy_legacy
type: puppet
role: [primary-controller, controller, cinder, compute, ceph-osd]
requires: [<tasks>]
required_for: [<stage>]
parameters:
  puppet_manifest: /etc/puppet/manifests/site.pp
  puppet_modules: /etc/puppet/modules
  timeout: 360
id: network
type: shell
groups: [primary-controller, controller]
requires: [deploy_legacy]
required_for: [deploy]
parameters:
  cmd: python /opt/setup_network.py
  timeout: 600
A major part of the tasks will require conditional expressions. There are several ways to solve this:
1. Implement a python framework for plugging in tasks. Each task will have a clear interface for defining its condition, and if the condition passes, the task will be serialized. This is the most scalable and solid solution, but developing such a framework will require a lot of effort, and we won't be able to land it in 6.1 (a sketch of such an interface follows this list).
2. Define conditions in the custom expression parser that is also used in the UI. There are a couple of downsides to this approach:
   - Not all conditions can be expressed; for example, "if the zabbix role is present in the cluster, deploy zabbix-agent for each role".
   - It is a new expression language, which we would need to support ourselves.
   - It depends on context data, which is quite easy to change.
3. Define certain groups for tasks, and each mutually exclusive task will be able to specify its group.
   - This won't work with conditions that are not mutually exclusive.
4. Use a strict API for conditions that can be used in expression parsing. Pros:
   - it is not a new language
   - it has a very strict API, so at least we can try to guarantee its stability
   - complex abstract logic can be hidden in simple python methods
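For reference, a sketch of what the plugin interface from option 1 might look like (hypothetical; this option is not planned for 6.1):

import abc

class DeploymentTask(object):
    """Hypothetical plugin interface for option 1."""
    __metaclass__ = abc.ABCMeta

    @abc.abstractmethod
    def should_run(self, cluster, nodes):
        """Return True if the task should be serialized for this deployment."""

    @abc.abstractmethod
    def serialize(self, cluster, nodes):
        """Return the orchestrator-ready representation of the task."""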
With option 4, statements will be expressed in the form of:
api.cluster_status == 'operational'
api.role_in_deployment('ceph-osd')
api.role_in_cluster('zabbix-server')
api.cluster_status == 'new' and api.nodes_count() > 10
import jinja2


class ExpressionApi(object):
    """Strict API exposed to condition expressions."""

    def __init__(self, cluster, nodes):
        self.cluster = cluster
        self.nodes = nodes

    def role_in_deployment(self, role):
        # True if any node in the current deployment has the given role
        for node in self.nodes:
            if role in node.roles:
                return True
        return False

    def role_in_cluster(self, role):
        # True if any node in the whole cluster has the given role
        for node in self.cluster.nodes:
            if role in node.roles:
                return True
        return False

    def nodes_count(self):
        return len(self.nodes)

    @property
    def cluster_status(self):
        return self.cluster.status


env = jinja2.Environment()
expr = env.compile_expression(
    "api.cluster_status == 'operational' and api.nodes_count() < 9")
print(expr(api=api))  # where api is an ExpressionApi instance
In 6.1 we will stick to the existing expression language that is used for cluster settings validation.
Operators are available in [2].
Based on the provided tasks and the dependencies between them, we will build a graph object with the help of the networkx library [1]. The format of the serialized information will depend on the orchestrator we use in any particular release.
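A minimal sketch of this graph building (the task definitions below are illustrative; the real ones come from the tasks.yaml files described above):

import networkx as nx

# Illustrative task definitions; real ones are loaded from tasks.yaml
tasks = [
    {'id': 'setup_network', 'requires': [], 'required_for': ['deploy']},
    {'id': 'setup_services', 'requires': ['setup_network'],
     'required_for': ['deploy']},
    {'id': 'deploy', 'requires': [], 'required_for': []},
]

graph = nx.DiGraph()
for task in tasks:
    graph.add_node(task['id'])
    # requires are incoming edges, required_for are outgoing edges
    for dep in task.get('requires', []):
        graph.add_edge(dep, task['id'])
    for dep in task.get('required_for', []):
        graph.add_edge(task['id'], dep)

# A topological order of the graph is one valid execution order
print(list(nx.topological_sort(graph)))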
Let me provide an example:
Consider that we have several types of roles:
- id: deploy
  type: stage

- id: primary-controller
  type: group
  role: [primary-controller]
  required_for: [deploy]
  parameters:
    strategy:
      type: one_by_one

- id: controller
  type: group
  role: [controller]
  requires: [primary-controller]
  required_for: [deploy]
  parameters:
    strategy:
      type: parallel
      amount: 2

- id: cinder
  type: group
  role: [cinder]
  requires: [controller]
  required_for: [deploy]
  parameters:
    strategy:
      type: parallel

- id: compute
  type: group
  role: [compute]
  requires: [controller]
  required_for: [deploy]
  parameters:
    strategy:
      type: parallel

- id: network
  type: group
  role: [network]
  requires: [controller]
  required_for: [compute, deploy]
  parameters:
    strategy:
      type: parallel
And tasks are defined for each role:
- id: setup_services
  type: puppet
  requires: [setup_network]
  groups: [controller, primary-controller, compute, network, cinder]
  required_for: [deploy]
  parameters:
    puppet_manifest: /etc/puppet/manifests/controller.pp
    puppet_modules: /etc/puppet/modules
    timeout: 360

- id: setup_network
  type: shell
  groups: [controller, primary-controller, compute, network, cinder]
  required_for: [deploy]
  parameters:
    cmd: run_setup_network.sh
    timeout: 120
For each role we can define a different subset of tasks, but for simplicity let's make these tasks applicable to every role.
Based on this configuration, nailgun will send the orchestrator a config in the format that the orchestrator expects.
For example, suppose we have several nodes for deployment. These nodes will be deployed in the following order:

1. Deploy primary-controller: node-1.
2. Deploy controller: node-4, node-2 (you can see that the parallel amount is 2).
3. Deploy controller: node-3, node-5.
4. Deploy the network role (node-7) and cinder (node-6); they depend on controller.
5. Deploy compute (node-8); compute depends on both network and controller.
During deployment, 2 tasks will be executed sequentially for each node:

1. Run the setup_network shell script.
2. Run the setup_services puppet task.
Pre- and post-deployment hooks will use the same task format, but bind to a stage instead of a group:

- id: update_hosts
  type: puppet
  role: '*'
  stage: post_deployment
  requires: [upload_nodes_info]
  parameters:
    puppet_manifest: /etc/puppet/modules/update_hosts_file.pp
    puppet_modules: /etc/puppet/modules
    timeout: 3600
    cwd: /
- id: rsync_puppet
  type: rsync
  role: '*'
  stage: pre_deployment
  parameters:
    src: /etc/puppet/{VERSION}
    dst: /etc/puppet/modules
    timeout: 3600
An alternative is to execute deployment based not on roles but on tasks. To consider this as an alternative, we need to modularize at least each deployment role as a separate manifest. In the current deployment model, this gives the following set of manifests:
- controller.pp
- mongo.pp
- ceph_osd.pp
- cinder.pp
- zabbix.pp
- compute.pp
After this is done, it is quite easy to transform this into a simple set of tasks:
- id: primary-controller
  type: puppet
  required_for: [deploy]
  role: [primary-controller]
  strategy:
    type: one_by_one
  parameters:
    puppet_manifest: /etc/puppet/controller.pp

- id: controller
  type: puppet
  requires: [primary-controller]
  required_for: [deploy]
  strategy:
    type: parallel
    amount: 2
  parameters:
    puppet_manifest: /etc/puppet/controller.pp

- id: compute
  type: puppet
  requires: [controller]
  strategy:
    type: parallel
  parameters:
    puppet_manifest: /etc/puppet/compute.pp

- id: cinder
  type: puppet
  requires: [controller]
  strategy:
    type: parallel
  parameters:
    puppet_manifest: /etc/puppet/cinder.pp

- id: ceph-osd
  type: puppet
  requires: [controller]
  strategy:
    type: parallel
  parameters:
    puppet_manifest: /etc/puppet/ceph.pp
As you can see, in this approach there is no separation between tasks and roles. For example, take the following assignment of roles to nodes:
primary-controller: [node-1]
controller: [node-4, node-2, node-3, node-5]
cinder: [node-6]
ceph-osd: [node-7]
compute: [node-8]
The execution order will be:

1. Deploy /etc/puppet/controller.pp on uids [1].
2. Deploy /etc/puppet/controller.pp on uids [2, 3] in parallel.
3. Deploy /etc/puppet/controller.pp on uids [4, 5] in parallel.
4. Deploy /etc/puppet/compute.pp on uids [8], /etc/puppet/cinder.pp on uids [6], and /etc/puppet/ceph.pp on uids [7] in parallel.
The current model will also allow us to create multiple cross-referencing tasks, like:
- id: put_compute_into_maintenance_mode
  type: puppet
  role: [primary-controller]

- id: migrate_vms_from_compute
  type: puppet
  role: [primary-controller]
  requires: [put_compute_into_maintenance_mode]

- id: reinstall_ovs
  type: puppet
  role: [compute]
  requires: [put_compute_into_maintenance_mode, migrate_vms_from_compute]

- id: make_compute_available
  role: [primary-controller]
  requires: [reinstall_ovs]
This is not the full format, but in general the mechanism will do the following things:
In the nailgun RPC receiver we will need to track the status of each node's deployment ourselves, by validating which tasks have been performed. The task executor (astute) will report which task has completed after each puppet execution.
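For illustration, such a report might look like this (the exact message format here is an assumption, not defined by this spec):

nodes:
  - uid: '1'
    status: ready
    task: setup_network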
If a role was not present at the time the deployment graph was written, it will specify all the tasks it wants to execute in the metadata for that role.
Astute facts: nailgun will generate an additional section in the astute facts. This section will contain the list of tasks with their priorities for the specific role. The astute facts will be extended with tasks in exactly the same format as they are stored in the database, so if we are generating facts for the compute role, astute will have a section like:
tasks:
  - priority: 100
    type: puppet
    uids: [1]  # this is done for compatibility reasons
    parameters:
      puppet_manifest: /etc/network.pp
      puppet_modules: /etc/puppet
      timeout: 360
      cwd: /
  - priority: 100
    type: puppet
    uids: [2]
    parameters:
      puppet_manifest: /etc/controller.pp
      puppet_modules: /etc/puppet
      timeout: 360
      cwd: /
Each astute.yaml will contain the part of the deployment graph that is executed for that particular role.
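A sketch of how such priorities could be derived from the graph (the increment of 100 is an assumption matching the example above; tasks that can run in parallel could also share a priority slot):

import networkx as nx

def assign_priorities(graph, step=100):
    # Walk the tasks in topological order and give each one
    # the next free priority slot.
    priorities = {}
    for i, task_id in enumerate(nx.topological_sort(graph)):
        priorities[task_id] = (i + 1) * step
    return priorities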
Several API requests will be added:
GET/PUT clusters/<cluster_id>/deployment_tasks reads or updates the deployment configuration for a concrete cluster. This will be useful if someone wants to execute deployment in a unique order.
GET/PUT releases/<release_id>/deployment_tasks reads or updates the deployment configuration for a release.
GET will support filtering by the start_task and end_task parameters:

GET releases/<release_id>/deployment_tasks/?end_task=netconfig&start_task=hiera

This will return all tasks that should start at start_task and finish at end_task.
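One possible way to implement this filtering over the task graph (a sketch, not the actual implementation):

import networkx as nx

def filter_tasks(graph, start_task=None, end_task=None):
    # Keep end_task and everything it depends on, then drop
    # everything that has to run before start_task.
    nodes = set(graph.nodes())
    if end_task is not None:
        nodes &= nx.ancestors(graph, end_task) | {end_task}
    if start_task is not None:
        nodes -= nx.ancestors(graph, start_task)
    return graph.subgraph(nodes)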
Several commands will be added to operate on tasks and to manipulate deployment via the API.
Download/upload of deployment tasks from the nailgun API will be available both for clusters and releases; by default the dir parameter is the current directory.
fuel rel --rel 2 --deployment-tasks --download --dir ./
fuel rel --rel 2 --deployment-tasks --upload --dir ./

fuel env --env 2 --deployment-tasks --download --dir ./
fuel env --env 2 --deployment-tasks --upload --dir ./
Sync deployment tasks for releases:
fuel rel --sync-deployment-tasks --dir /etc/puppet
All tasks.yaml files found recursively in the directory "dir" will be merged and sent to the correct release version. There are 2 approaches that can be taken to match releases to tasks:

1. Match them by path (see the illustrative layout below).
2. Match them by a config file at the root level of the tasks directory structure.
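For example, with the path-based approach the directory layout might look like this (the version directory name is illustrative):

/etc/puppet/2014.2-6.1/tasks.yaml
/etc/puppet/2014.2-6.1/modules/<module>/tasks.yaml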
The next set of commands concerns the deployment API; in general we will have the ability to construct a custom graph for concrete nodes.
With these commands we will be able to:

- Execute only the specified tasks on the specified nodes.
- Skip the specified tasks (for example netconfig); they will be dropped from the deployment.
- Execute the tasks required for pre_deployment to be ready; here the API will traverse the graph up to pre_deployment and execute those tasks.
- Start at netconfig and end execution at the task that is used for galera installation.
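Illustrative CLI invocations for these four cases (the exact flag names are an assumption of this sketch, and install_galera is a hypothetical task id):

fuel node --node 1,2,3 --tasks netconfig
fuel node --node 1,2,3 --skip netconfig
fuel node --node 1,2,3 --end pre_deployment
fuel node --node 1,2,3 --start netconfig --end install_galera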
After the 6.1 release, the task API delivered as part of this feature will be considered the stable task API, and we are going to support tasks described in this format.
Versioning will be done based on the MOS version, so all tasks in any given version should conform to a certain API version.
Deployment configuration will be stored in:

- Cluster.deployment_tasks
- Release.deployment_tasks
Initially the graph configuration will be filled in at the bootstrap_master_node stage, by an API call to /release/<id>/deployment_tasks.
If there are any incompatibilities between the new deployment code and previously stored data, it will be possible to resolve them by migration or by modification from the upgrade script (via API calls).
This won't significantly affect deployment time; in some cases the puppet run may even be shorter.
We will need to put the tasks from fuel-library for each release into nailgun at the bootstrap admin node stage.
Feature lead:

- Dmitry Shulyak dshulyak@mirantis.com

Devs:

- Vladimir Sharshov vsharhov@mirantis.com
- Sebastian Kalinowski skalinowski@mirantis.com
- Kamil Sambor ksambor@mirantis.com

Library:

- Dmitry Ilyin dilyin@mirantis.com
- Alex Didenko adidenko@mirantis.com

QA:

- Tatyana Leontovich tleontovich@mirantis.com
- Denis Dmitriev ddmitriev@mirantis.com
- Anastasia Palkin apalkina@mirantis.com
[1] python networkx library
Every new piece of code will be covered by unit tests. This is internal functionality, so it will be covered by the existing system tests without any modifications. Additional tests will verify that we don't have a regression in deployment time. There will also be tests that create a new task, add it to the deployment graph, and then verify that the node is in the expected state. Acceptance criteria for each task granule will be added in another spec, either for library modularization or for modular tests.
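For instance, a graph-level unit test might look like this (a sketch; configure_firewall is a hypothetical new task):

import unittest

import networkx as nx

class TestDeploymentGraph(unittest.TestCase):
    def test_new_task_is_ordered_after_its_dependency(self):
        graph = nx.DiGraph()
        graph.add_edge('setup_network', 'setup_services')
        # insert a new task between the two existing ones
        graph.add_edge('setup_network', 'configure_firewall')
        graph.add_edge('configure_firewall', 'setup_services')
        order = list(nx.topological_sort(graph))
        self.assertLess(order.index('setup_network'),
                        order.index('configure_firewall'))
        self.assertLess(order.index('configure_firewall'),
                        order.index('setup_services'))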
Requires update to developer and plugin documentation.