This work is licensed under a Creative Commons Attribution 3.0 Unported License. http://creativecommons.org/licenses/by/3.0/legalcode
Server Pools Manager¶
https://blueprints.launchpad.net/designate/+spec/server-pools-service
This specification outlines the Pool Manager, Central, backend driver, and storage changes needed to support the new Pool Manager service.
Problem description¶
Coordinating DNS operations across many different backends is difficult, especially when there are many DNS servers. A Pool Manager service is needed to propagate changes from the Designate database to those DNS servers and to track the status of those changes. When this specification is implemented, a Pool Manager will manage a pool with multiple DNS servers, even if those DNS servers are of different types.
Proposed change¶
API Changes¶
None
Pool Manager Changes¶
A new Designate service, called designate-pool-manager, will be created. This is the Pool Manager. The Pool Manager will get its configuration from the configuration file when it is instantiated.
The configuration section is called [service:pool_manager]. The options for this section are:
| Parameter | Default | Required | Notes |
|---|---|---|---|
| pool_name | ‘default’ | Yes | The name of the pool managed by this instance of the Pool Manager |
| threshold_percentage | 100 | Yes | The percentage of servers requiring a successful update for a domain change to be considered active |
| poll_timeout | 30 | Yes | The time to wait for a NOTIFY response from a name server |
| poll_retry_interval | 2 | Yes | The time between retrying to send a NOTIFY request and waiting for a NOTIFY response |
| poll_max_retries | 3 | Yes | The maximum number of times minidns will retry sending a NOTIFY request and wait for a NOTIFY response |
| periodic_sync_interval | 120 | Yes | The time between synchronizing the servers with Storage |
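As an illustration, a [service:pool_manager] section using the defaults from the table could look like the sample below. The sketch parses it with Python's standard configparser module only to stay self-contained; Designate reads its configuration through its own option handling, and the helper name load_pool_manager_options is hypothetical. The units of the timing options are not stated in this specification, so none are assumed here.

```python
import configparser

# Sample section using the default values listed in the table above.
SAMPLE_CONFIG = """
[service:pool_manager]
pool_name = default
threshold_percentage = 100
poll_timeout = 30
poll_retry_interval = 2
poll_max_retries = 3
periodic_sync_interval = 120
"""


def load_pool_manager_options(text):
    """Parse the [service:pool_manager] section into a typed dict.

    Hypothetical helper for illustration; not part of Designate.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["service:pool_manager"]
    return {
        "pool_name": section.get("pool_name"),
        "threshold_percentage": section.getint("threshold_percentage"),
        "poll_timeout": section.getint("poll_timeout"),
        "poll_retry_interval": section.getint("poll_retry_interval"),
        "poll_max_retries": section.getint("poll_max_retries"),
        "periodic_sync_interval": section.getint("periodic_sync_interval"),
    }
```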
The Pool Manager will contain a map of the servers to instantiated backend drivers. This map will be created when the Pool Manager is instantiated. The backend driver will not be responsible for reading its own configuration: the Pool Manager will read the global backend driver and server specific backend driver sections from the configuration file and pass the resulting configuration to the backend driver when instantiating it. Please refer to the Backend Driver Changes section in the Server Pools - Storage specification for more information concerning the global backend driver and server specific backend driver sections.
The methods in the base class for the Pool Manager service include:
create_domain(context, domain)¶
| Parameter | Description | Required |
|---|---|---|
| context | Security context information. | Yes |
| domain | The designate domain object. | Yes |
Return Value¶
No return value.
Design Considerations¶
Loop through each server in the pool and call the backend driver to create the domain. For each call to the backend driver, the result is stored in the pool_manager_status table as a row with an action of ‘CREATE’, and a second row is created with an action of ‘UPDATE’. A successful creation has a status of ‘SUCCESS’ and a failed creation has a status of ‘ERROR’. The ‘UPDATE’ action row has no initial status. Then check whether a consensus exists using the pool_manager_status table. Consensus exists if the percentage of servers with a successful creation of the domain meets or exceeds the threshold_percentage. If consensus exists, the Central update_status method is called with the serial number used when creating the domain and a status of ‘SUCCESS’. If consensus does not exist, the Central update_status method is called with the same serial number and a status of ‘ERROR’.
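The consensus check can be sketched as follows. This is a minimal illustration, not the Pool Manager implementation: the function names are hypothetical, and the threshold is read as "at least threshold_percentage percent of the servers", an interpretation chosen because a strict "more than" could never be satisfied with the default of 100.

```python
def has_consensus(statuses, threshold_percentage):
    """Return True when enough servers report 'SUCCESS'.

    statuses holds one status string ('SUCCESS' or 'ERROR') per server in
    the pool. Assumption: the threshold is inclusive (>=).
    """
    if not statuses:
        return False
    successes = sum(1 for status in statuses if status == "SUCCESS")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return successes * 100 >= threshold_percentage * len(statuses)


def status_for_central(statuses, threshold_percentage):
    """Map the consensus result to the status passed to Central."""
    if has_consensus(statuses, threshold_percentage):
        return "SUCCESS"
    return "ERROR"
```

With three servers and a threshold_percentage of 100, a single failed creation is enough for status_for_central to return ‘ERROR’.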
Cast vs. Call¶
This is an RPC cast. Communication about the status of the domain creation will be handled using the Central update_status method.
delete_domain(context, domain)¶
| Parameter | Description | Required |
|---|---|---|
| context | Security context information. | Yes |
| domain | The designate domain object. | Yes |
Return Value¶
No return value.
Design Considerations¶
Loop through each server in the pool and call the backend driver to delete the domain. For each call to the backend driver, the result is stored in the pool_manager_status table with an action of ‘DELETE’. A successful deletion has a status of ‘SUCCESS’ and a failed deletion has a status of ‘ERROR’. Then check whether a consensus exists using the pool_manager_status table. Consensus exists if the percentage of servers with a successful deletion of the domain meets or exceeds the threshold_percentage. If consensus exists, the Central update_status method is called with the serial number used when deleting the domain and a status of ‘SUCCESS’. If consensus does not exist, the Central update_status method is called with the same serial number and a status of ‘ERROR’.
Cast vs. Call¶
This is an RPC cast. Communication about the status of the domain deletion will be handled using the Central update_status method.
update_domain(context, domain)¶
| Parameter | Description | Required |
|---|---|---|
| context | Security context information. | Yes |
| domain | The designate domain object. | Yes |
Return Value¶
No return value.
Design Considerations¶
Loop through each server in the pool and call the minidns notify_zone_changed method. Loop through each server again and call the minidns poll_for_serial_number method.
Cast vs. Call¶
This is an RPC cast. Communication about the status of the domain update will be handled using the Central update_status method which is called by the Pool Manager update_status method. The minidns poll_for_serial_number method invokes the Pool Manager update_status method when it completes.
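The two loops described under Design Considerations can be sketched as below. The minidns method names come from this specification, but their argument lists are not given here, so the signatures used are assumptions, and RecordingMiniDNS is a hypothetical stand-in that only records the order of calls.

```python
class RecordingMiniDNS:
    """Hypothetical stand-in for the minidns API; it only records calls."""

    def __init__(self):
        self.calls = []

    def notify_zone_changed(self, context, domain, server):
        self.calls.append(("notify_zone_changed", server))

    def poll_for_serial_number(self, context, domain, server):
        # In the design above, minidns reports the polled serial number back
        # through the Pool Manager update_status method when polling ends.
        self.calls.append(("poll_for_serial_number", server))


def update_domain(context, domain, servers, minidns):
    """Notify every server first, then poll every server."""
    for server in servers:
        minidns.notify_zone_changed(context, domain, server)
    for server in servers:
        minidns.poll_for_serial_number(context, domain, server)
```

Sending every NOTIFY before starting to poll means all servers in the pool are asked to refresh before any of them is checked.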
update_status(context, domain, name_server, status, serial_number)¶
| Parameter | Description | Required |
|---|---|---|
| context | Security context information. | Yes |
| domain | The designate domain object. | Yes |
| name_server | The name server for which this serial number is applicable. | Yes |
| status | The status, ‘SUCCESS’ or ‘ERROR’. | Yes |
| serial_number | The serial number received from the name server for the domain. | Yes |
Return Value¶
No return value.
Design Considerations¶
Read the existing serial number from the pool_manager_status table for the server and domain. If the new serial number is greater than the existing serial number, update the row and check whether a consensus exists using the pool_manager_status table. Consensus exists if the percentage of servers for the domain with a serial number greater than the existing serial number meets or exceeds the threshold_percentage. To find the consensus serial number, servers are discounted from the consensus, starting with the servers with the lowest serial numbers, until only the minimum number of servers needed to reach the threshold_percentage remains. If the existing serial number is less than the serial numbers of all the remaining servers, the Central update_status method is called with the lowest serial number among those remaining servers (the consensus serial number) and a status of ‘SUCCESS’.
If more than (100 - threshold_percentage) percent of the servers for the domain have a status of ‘ERROR’, the Central update_status method is called with the lowest serial number greater than the consensus serial number (calculated above) and a status of ‘ERROR’.
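The selection of the consensus serial number can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the function names are hypothetical, and the minimum number of servers is taken to be the server count multiplied by threshold_percentage / 100, rounded up.

```python
def consensus_serial(serials, threshold_percentage):
    """Return the consensus serial number, or None for an empty pool.

    serials holds the latest serial number reported by each server.
    Assumption: the minimum number of servers is rounded up.
    """
    if not serials:
        return None
    # Ceiling of len(serials) * threshold_percentage / 100, at least 1.
    needed = max(1, (len(serials) * threshold_percentage + 99) // 100)
    # Discount the servers with the lowest serial numbers; keep the rest.
    remaining = sorted(serials, reverse=True)[:needed]
    # The lowest serial number among the remaining servers.
    return remaining[-1]


def should_report_success(existing_serial, serials, threshold_percentage):
    """True when the consensus serial number has moved past existing_serial."""
    serial = consensus_serial(serials, threshold_percentage)
    return serial is not None and existing_serial < serial
```

For example, with servers reporting serial numbers 5, 7, and 6, a threshold_percentage of 100 keeps all three servers and yields 5, while a threshold_percentage of 66 discounts the lowest server and yields 6.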
Cast vs. Call¶
This is an RPC cast.
periodic_sync()¶
Return Value¶
No return value.
Design Considerations¶
This method runs in a thread that is created when the Pool Manager is instantiated. The thread reads the pool_manager_status table and retries failed create, delete, and update operations. Additionally, the thread will call the minidns poll_for_serial_number method for each domain and server to ensure the server is synchronized with Storage.
Every periodic_sync_interval, this thread will perform the following operations:
Read the pool_manager_status table looking for ‘CREATE’ actions that have a status of ‘ERROR’, grouping by domain and ordering by the row create time. Check to see if a consensus already exists for the domain creation. Loop through each server with a failed creation, using the backend driver to attempt creation. If consensus did not already exist, check for consensus and call the Central update_status method if consensus is achieved.
Read the pool_manager_status table looking for ‘DELETE’ actions that have a status of ‘ERROR’, grouping by domain and ordering by the row create time. Check to see if a consensus already exists for the domain deletion. Loop through each server with a failed deletion, using the backend driver to attempt deletion. If consensus did not already exist, check for consensus and call the Central update_status method if consensus is achieved.
For each domain in the pool, read the domain’s serial number from Storage. Loop through each server in the pool and read the pool_manager_status table looking for ‘UPDATE’ actions for the domain that have a serial number < the domain’s serial number and call the minidns notify_zone_changed method.
Finally, for each domain in the pool, read the domain’s serial number from Storage. Loop through each server in the pool and call the minidns poll_for_serial_number method.
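The row selection performed in each sync pass can be sketched as below. The StatusRow tuple is a hypothetical in-memory view of a pool_manager_status row; its created_at field stands in for the "row create time" mentioned above and is an assumption, as are the helper names.

```python
from collections import namedtuple

# Hypothetical in-memory view of a pool_manager_status row.
StatusRow = namedtuple(
    "StatusRow",
    ["server_id", "domain_id", "action", "status", "serial_number", "created_at"],
)


def failed_rows_by_domain(rows, action):
    """Group rows with a status of 'ERROR' for the action by domain.

    Rows within each domain are ordered by created_at, oldest first.
    """
    grouped = {}
    for row in sorted(rows, key=lambda r: r.created_at):
        if row.action == action and row.status == "ERROR":
            grouped.setdefault(row.domain_id, []).append(row)
    return grouped


def servers_behind(rows, domain_id, domain_serial):
    """Return the servers whose 'UPDATE' row lags the serial in Storage."""
    return [
        row.server_id
        for row in rows
        if row.domain_id == domain_id
        and row.action == "UPDATE"
        and row.serial_number < domain_serial
    ]
```

The servers returned by servers_behind are the ones for which the thread would call the minidns notify_zone_changed method.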
Central Changes¶
The Central service will be updated to use the Pool Manager instead of the backend driver. Additionally, the default_pool_name option will be removed from the [service:central] section of the Designate configuration.
All domains will initially have a status of ‘PENDING’; calls to the Central update_status method by the Pool Manager will change the status.
When creating, updating, or deleting records, records will have the serial number field set to the new serial number of the domain. The task will be ‘ADD’, ‘DELETE’, or ‘UPDATE’ corresponding to the operation. The status will be ‘PENDING’.
Valid record states are:
| task | status |
|---|---|
| ‘ADD’ | ‘PENDING’ |
| ‘ADD’ | ‘ERROR’ |
| ‘DELETE’ | ‘PENDING’ |
| ‘DELETE’ | ‘ERROR’ |
| ‘UPDATE’ | ‘PENDING’ |
| ‘UPDATE’ | ‘ERROR’ |
| ‘NONE’ | ‘ACTIVE’ |
| ‘NONE’ | ‘DELETED’ |
Affected code in the Central service will be updated appropriately to align with these states.
The new method needed to update the status of domains and records is:
update_status(context, domain, status, serial_number)¶
| Parameter | Description | Required |
|---|---|---|
| context | Security context information. | Yes |
| domain | The designate domain object. | Yes |
| status | The status, ‘SUCCESS’ or ‘ERROR’. | Yes |
| serial_number | The consensus serial number for the domain. | Yes |
Return Value¶
No return value.
Design Considerations¶
If the status is ‘SUCCESS’:
Check the status of the domain and if it has a status of ‘PENDING’ or ‘ERROR’, set the status to ‘ACTIVE’.
Check the status of records for the domain. If they have a task of ‘ADD’ or ‘UPDATE’ and a status of ‘PENDING’ or ‘ERROR’, set the task to ‘NONE’ and the status to ‘ACTIVE’ if the consensus serial number >= serial number field.
Check the status of records for the domain. If they have a task of ‘DELETE’ and a status of ‘PENDING’ or ‘ERROR’, set the task to ‘NONE’ and the status to ‘DELETED’ if the consensus serial number >= serial number field.
If the status is ‘ERROR’:
Check the status of the domain and if it has a status of ‘PENDING’, set the status to ‘ERROR’.
Check the status of records for the domain. If they have a status of ‘PENDING’, set the status to ‘ERROR’ if the consensus serial number >= serial number field.
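The transitions above can be sketched as a single function. This is a minimal illustration that represents the domain and its records as plain dicts; the name apply_status and the dict layout are assumptions, and the real Central service operates on Designate objects and Storage.

```python
def apply_status(domain, records, status, serial_number):
    """Apply a consensus result to a domain and its records in place.

    domain is a dict with a 'status' key; each record is a dict with
    'task', 'status', and 'serial_number' keys (hypothetical layout).
    """
    if status == "SUCCESS":
        if domain["status"] in ("PENDING", "ERROR"):
            domain["status"] = "ACTIVE"
        for record in records:
            if record["status"] not in ("PENDING", "ERROR"):
                continue
            # Only records at or below the consensus serial are settled.
            if serial_number < record["serial_number"]:
                continue
            if record["task"] in ("ADD", "UPDATE"):
                record["task"], record["status"] = "NONE", "ACTIVE"
            elif record["task"] == "DELETE":
                record["task"], record["status"] = "NONE", "DELETED"
    elif status == "ERROR":
        if domain["status"] == "PENDING":
            domain["status"] = "ERROR"
        for record in records:
            if (record["status"] == "PENDING"
                    and serial_number >= record["serial_number"]):
                record["status"] = "ERROR"
```

A record whose serial number is greater than the consensus serial number is left untouched, so it stays ‘PENDING’ until a later consensus covers it.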
Cast vs. Call¶
This is an RPC call.
Backend Driver Changes¶
The backend driver will now be instantiated with information provided by the Pool Manager as explained in the Pool Manager Changes section. This is necessary because of server specific backend driver configurations.
Each backend driver will continue to support the same configuration options it currently does; only the section names will change, gaining a qualifier for the server. For example, the global backend driver section for PowerDNS will now be [backend:powerdns:*], where the wildcard denotes the global configuration for the backend driver. This allows for server specific backend driver configurations.
The new server specific backend driver section in the configuration will be [backend:powerdns:<uuid>] where uuid is a universally unique identifier.
The options for this section are:
| Parameter | Default | Required | Notes |
|---|---|---|---|
| host | None | Yes | The host name or IP address of the DNS server |
| port | 53 | Yes | The port of the DNS server |
| tsig_key | None | Yes | The TSIG key for the DNS server |
In addition to the above options, the server specific backend driver section will support the same options as the backend driver global section. If those options are not included in the server specific backend driver section, the server configuration will default to using the global configuration option. These server specific backend driver sections will support different backends in the same pool.
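The fallback from a server specific section to the global section can be sketched as below, using Python's standard configparser only to keep the example self-contained. The option names example_option and other_option, the UUID, the TSIG key value, and the helper name server_options are all hypothetical; Designate's own configuration handling may differ.

```python
import configparser

# Sample configuration with a global section and one server specific section.
SAMPLE_CONFIG = """
[backend:powerdns:*]
example_option = global-value
other_option = global-other

[backend:powerdns:f278782a-07dc-4502-9177-b5d85c5f7c7e]
host = 192.0.2.10
tsig_key = example-key
example_option = server-value
"""


def server_options(parser, driver, server_id):
    """Merge the global section with the server specific section.

    Values in the server specific section win; anything missing there
    falls back to the global section, and port falls back to 53.
    """
    merged = {}
    for section in ("backend:%s:*" % driver,
                    "backend:%s:%s" % (driver, server_id)):
        if parser.has_section(section):
            merged.update(parser.items(section))
    merged.setdefault("port", "53")
    return merged


def load(text):
    """Parse configuration text without value interpolation."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser
```

Because each server resolves its own merged options, two servers in the same pool can name different drivers and still share per-driver global defaults.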
The server object will be implemented. The server object encapsulates the server specific backend driver section in the configuration.
The following methods will not be used in the backend driver:
create_tsigkey(tsigkey)
update_tsigkey(tsigkey)
delete_tsigkey(tsigkey)
These methods will not be used because the only provisioner supported initially is the ‘unmanaged’ provisioner. They will be used by future provisioners.
Storage Changes¶
A new table for the Pool Manager status will be needed. Additionally, the domains and records tables will be modified to support pools. Domains and records will initially have a status of ‘PENDING’. A new status, ‘ERROR’, will be possible for domains and records. Finally, a record can also be ‘DELETE_PENDING’ or ‘DELETE_ERROR’.
New Table - pool_manager_status¶
| Column | Type | Nullable? | Unique? | Notes |
|---|---|---|---|---|
| id | CHAR(32) | No | Yes | PK |
| updated_at | DATETIME | No | No | UTC time of last update |
| server_id | VARCHAR(32) | No | No | Server ID |
| domain_id | CHAR(32) | No | No | FK to ID on domains table |
| status | ‘SUCCESS’, ‘ERROR’ | Yes | No | Status |
| serial_number | INT(11) | No | No | Serial number at time of status |
| action | ‘CREATE’, ‘DELETE’, ‘UPDATE’ | No | No | Action resulting in status |
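To make the table concrete, the sketch below creates an equivalent table in an in-memory SQLite database and writes the two rows that create_domain records for one server. It is an illustration only: SQLite stands in for the real Storage backend, the enumerated columns are approximated with CHECK constraints, the foreign key to the domains table is omitted, and the function name record_create_result is hypothetical.

```python
import datetime
import sqlite3
import uuid

# Approximation of the pool_manager_status table for SQLite.
SCHEMA = """
CREATE TABLE pool_manager_status (
    id            CHAR(32)    NOT NULL PRIMARY KEY,
    updated_at    DATETIME    NOT NULL,
    server_id     VARCHAR(32) NOT NULL,
    domain_id     CHAR(32)    NOT NULL,
    status        TEXT        CHECK (status IN ('SUCCESS', 'ERROR')),
    serial_number INTEGER     NOT NULL,
    action        TEXT        NOT NULL
                  CHECK (action IN ('CREATE', 'DELETE', 'UPDATE'))
)
"""


def record_create_result(conn, server_id, domain_id, serial_number, succeeded):
    """Insert the 'CREATE' row and the status-less 'UPDATE' row."""
    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
    create_status = "SUCCESS" if succeeded else "ERROR"
    rows = [
        (uuid.uuid4().hex, now, server_id, domain_id,
         create_status, serial_number, "CREATE"),
        # The 'UPDATE' row has no initial status, hence the nullable column.
        (uuid.uuid4().hex, now, server_id, domain_id,
         None, serial_number, "UPDATE"),
    ]
    conn.executemany(
        "INSERT INTO pool_manager_status VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
```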
Modify Table - domains¶
| Column | Type | Nullable? | Unique? | Default | Notes | Action |
|---|---|---|---|---|---|---|
| status | ‘ACTIVE’, ‘PENDING’, ‘DELETED’, ‘ERROR’ | No | No | ‘PENDING’ | Domain status | update |
Modify Table - records¶
| Column | Type | Nullable? | Unique? | Default | Notes | Action |
|---|---|---|---|---|---|---|
| serial_number | INT(11) | No | No | | Used for the record status | add |
| task | ‘ADD’, ‘DELETE’, ‘UPDATE’, ‘NONE’ | No | No | ‘ADD’ | Record operation task | add |
| status | ‘ACTIVE’, ‘PENDING’, ‘DELETED’, ‘ERROR’ | No | No | ‘PENDING’ | Record status | update |
Other Changes¶
None
Alternatives¶
None
Implementation¶
Assignee(s)¶
- Primary assignee:
- Additional assignee:
Milestones¶
- Target Milestone for completion: Kilo-1
Work Items¶
Pool Manager changes
Central changes
Backend driver changes
Storage changes
Dependencies¶
This specification relies on the Server Pools - Storage specification. This specification relies on the Server Pools - MiniDNS Support specification.