Docker Community Forums


Feature Request: Primary/Backup tasks in a service


(Ktwalrus) #1

It's great to have load balancing built in, but I have some services that I want to deploy in a primary/backup configuration. For example, an instance of ProxySQL is placed in front of the database instances. Currently, I run HAProxy instances (one for each web server instance that connects to a database), each configured to proxy all database connections to two ProxySQL instances. The HAProxy config specifies one ProxySQL instance as “primary” and one as “backup” (using the backup keyword on the server line).

I want all database connections to go through the primary ProxySQL instance and only use the backup instance if the primary instance fails (usually because the node it runs on fails for some reason - the nodes are Cloud VMs and they sometimes fail randomly and have to be replaced).

So, my feature request for Docker 1.13 is to add a “backup” designation that can be specified on one or more of the tasks in a service, and to have the load balancing send traffic to the primary tasks and fail over to the backup tasks only if no primary tasks are currently available to handle the requests.
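To make the requested behavior concrete, here is a minimal Python sketch of the selection rule. It is not Docker's implementation; the `Task` type, its fields, and the `eligible_tasks` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    """Hypothetical view of one task in a service."""
    name: str
    backup: bool = False   # the proposed "backup" designation
    healthy: bool = True   # result of whatever health checking is in place


def eligible_tasks(tasks: List[Task]) -> List[Task]:
    """Return the tasks that should receive traffic.

    Healthy primary tasks are always preferred. Backup tasks are used
    only when no primary task is healthy.
    """
    primaries = [t for t in tasks if t.healthy and not t.backup]
    if primaries:
        return primaries
    return [t for t in tasks if t.healthy and t.backup]
```

With one healthy primary and one healthy backup, only the primary is returned; the backup is returned only once the primary is unhealthy.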

HAProxy also has a nice feature that you can set the number of healthchecks that fail before a server is marked down and the number of healthchecks that succeed before a downed server is marked as up.
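For reference, the HAProxy setup described above might look roughly like the fragment below. This is a sketch, not the poster's actual config: the section name, addresses, ports, and threshold values are placeholders chosen for illustration.

```
# Sketch only: names, addresses, ports, and thresholds are placeholders.
listen proxysql
    mode tcp
    bind 127.0.0.1:3306
    # check   enables health checks; inter sets the interval between checks
    # fall 3  marks the server down after 3 consecutive failed checks
    # rise 2  marks it up again after 2 consecutive successful checks
    # backup  sends traffic here only when no non-backup server is up
    server proxysql-primary 10.0.0.11:6033 check inter 2s fall 3 rise 2
    server proxysql-backup  10.0.0.12:6033 check inter 2s fall 3 rise 2 backup
```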

Without the notion of Primary/Backup for tasks, I think I will need to run two ProxySQL services, each with one task, and continue to insert HAProxy instances between the apps and the ProxySQL services. If Docker 1.13 had support for Primary/Backup tasks within a single service, I could get rid of the extra HAProxy instances and use the built-in load balancing that swarm mode gives me.


(Ktwalrus) #2

BTW, I assume load balancing is currently implemented using Docker’s internal DNS to associate the tasks’ IPs with the service’s name. If so, marking a set of tasks as “Backup” would be a matter of associating metadata with the tasks’ IPs so the DNS could use this metadata when constructing the list of IPs returned for the service name. In that case, the metadata could also carry Interval/Rise/Fall/State data for each task. This “State” data could be used to mark the task as “going down for maintenance”, “down”, “up”, etc., and Docker could expose an API that lets a script or operator take a task offline temporarily for maintenance (as I can with HAProxy).
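The Rise/Fall/State metadata described above could be tracked per task along these lines. This is a hypothetical Python sketch, not an existing Docker API; the `HealthState` class and its field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    """Hypothetical per-task metadata: rise/fall thresholds plus state."""
    rise: int = 2              # consecutive passes needed to mark a down task up
    fall: int = 3              # consecutive failures needed to mark an up task down
    up: bool = True
    maintenance: bool = False  # set by an operator or script, not by health checks
    _streak: int = 0           # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> None:
        """Feed in one health check result and update the up/down state."""
        if check_passed == self.up:
            # The result agrees with the current state, so reset the streak.
            self._streak = 0
            return
        self._streak += 1
        threshold = self.rise if check_passed else self.fall
        if self._streak >= threshold:
            self.up = check_passed
            self._streak = 0

    @property
    def routable(self) -> bool:
        """A task receives traffic only if it is up and not in maintenance."""
        return self.up and not self.maintenance
```

Setting `maintenance = True` takes a task out of rotation without waiting for health checks to fail, which corresponds to the "going down for maintenance" state.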

I would like to be able to control stopping an individual task (specifying when the container is actually stopped by giving a delay interval between the last time a request was sent to the container and when the container is stopped).

For my website, I’m planning on taking each database offline, updating the database (potentially from a backup of a newer version of the database), and then restarting the database so the task is available in the DNS again. Essentially, I need to do a rolling update, but not to a new version of the image but a new version of the persistent data.

Another feature I need is to deploy my databases in a master/slave configuration (I’m using MySQL 5.7 for the databases). I want to be able to do “one-way” failover from a failed master to the most up-to-date slave. It would be nice if I could deploy both the master and the slaves in a single service and do the “one-way” failover within Docker using the internal DNS. Once a master fails (that is, fails a number of healthchecks), I essentially want to fail over to a slave as the new master and never automatically return to the failed master, even if it starts to pass health checks again. The DBA will manually restore the failed master as a slave of the new master (the promoted slave), so it could then be selected if the new master were to fail in turn.
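The “one-way” failover rule can be sketched in Python as follows. This only illustrates the desired behavior under stated assumptions: the `OneWayFailover` class, its methods, and the `lag` mapping (replication lag per slave, used here as a stand-in for "most up to date") are all hypothetical.

```python
from typing import Dict, List


class OneWayFailover:
    """Hypothetical tracker for one-way master failover."""

    def __init__(self, master: str, replicas: List[str], fall: int = 3) -> None:
        self.master = master
        self.replicas = list(replicas)
        self.fall = fall                # failed checks before the master is replaced
        self.retired: List[str] = []    # failed masters awaiting manual repair
        self._failures = 0

    def record_master_check(self, passed: bool, lag: Dict[str, int]) -> str:
        """Record one health check of the current master; return the master to use."""
        if passed:
            self._failures = 0
            return self.master
        self._failures += 1
        if self._failures >= self.fall and self.replicas:
            # Promote the replica with the least replication lag.
            promoted = min(self.replicas, key=lambda name: lag[name])
            self.replicas.remove(promoted)
            # The old master is retired and never re-selected automatically.
            self.retired.append(self.master)
            self.master = promoted
            self._failures = 0
        return self.master

    def rejoin_as_replica(self, name: str) -> None:
        """Manual step: the DBA has restored a retired master as a replica."""
        self.retired.remove(name)
        self.replicas.append(name)
```

After a promotion, only the new master's health checks are consulted, so a retired master that starts passing checks again has no effect until `rejoin_as_replica` is called for it.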

Not sure how all this should be implemented in Docker, but it would be really something if Docker had built-in support for these features (primary/backup, downing tasks temporarily for maintenance, “one-way” failover for tasks, etc.).