Replicante uses a distributed coordinator for a variety of reasons.
This page aims to keep track of all uses of coordination. Distributed coordination (especially locks) is a delicate thing and is very easy to get wrong!
Some components are special and must be executed exclusively across the cluster. Yet we want more than one copy of them running so that, if the primary process fails, a copy can take over.
Distributed coordination is used to achieve this.
The implementation details may vary over time and between backends (Consul vs Zookeeper).
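The primary/standby behaviour described above can be illustrated with a minimal sketch. It is not Replicante's implementation: `InMemoryCoordinator`, its methods, the `"discovery"` election name, and the node ids are all hypothetical, and a dictionary in one process stands in for a real backend such as Consul or Zookeeper.

```python
import threading


class InMemoryCoordinator:
    """Hypothetical stand-in for a real coordinator backend."""

    def __init__(self):
        self._guard = threading.Lock()
        self._holders = {}  # election name -> id of the current primary

    def try_become_primary(self, election, candidate):
        """Return True if `candidate` is (or just became) the primary."""
        with self._guard:
            # The first candidate to register wins; later ones stay on standby.
            holder = self._holders.setdefault(election, candidate)
            return holder == candidate

    def session_lost(self, election, candidate):
        """Simulate the primary failing: its claim on the election is dropped."""
        with self._guard:
            if self._holders.get(election) == candidate:
                del self._holders[election]


coordinator = InMemoryCoordinator()

# Two copies of the same component start; only one becomes primary.
first = coordinator.try_become_primary("discovery", "node-a")
second = coordinator.try_become_primary("discovery", "node-b")

# The primary fails, so the standby copy can take over on its next attempt.
coordinator.session_lost("discovery", "node-a")
takeover = coordinator.try_become_primary("discovery", "node-b")

print(first, second, takeover)  # True False True
```

In a real deployment the claim would typically be tied to a session or ephemeral node, so that the backend itself drops it when the primary stops responding.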
Some tasks may be scheduled too frequently or otherwise enqueued too often. While in general this is not a problem, some tasks with side effects may cause issues when run in parallel on the same inputs.
For these cases, tasks that should not be run in parallel acquire a lock at the start. If the lock is acquired, the task proceeds as normal. If the lock is already taken by another executor, the task is dropped.
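The acquire-or-drop rule can be sketched as follows. This is an illustration under assumptions, not Replicante's code: `TaskLocks`, `run_exclusive`, and the `"refresh/cluster-1"` lock key are hypothetical names, and an in-process set stands in for locks held in the coordinator.

```python
import threading


class TaskLocks:
    """Hypothetical in-process stand-in for coordinator-backed locks."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def try_acquire(self, key):
        """Non-blocking acquire: return False if the lock is already taken."""
        with self._guard:
            if key in self._held:
                return False
            self._held.add(key)
            return True

    def release(self, key):
        with self._guard:
            self._held.discard(key)


def run_exclusive(locks, key, task):
    """Run `task` only if the lock for `key` is free; otherwise drop it."""
    if not locks.try_acquire(key):
        return "dropped"
    try:
        task()
        return "ran"
    finally:
        # Always release, even if the task raises.
        locks.release(key)


locks = TaskLocks()
results = []


def refresh_with_duplicate():
    # While this task holds the lock, a duplicate on the same input is dropped.
    results.append(run_exclusive(locks, "refresh/cluster-1", lambda: None))


results.append(run_exclusive(locks, "refresh/cluster-1", refresh_with_duplicate))
# Once the first task has finished and released the lock, a new one proceeds.
results.append(run_exclusive(locks, "refresh/cluster-1", lambda: None))

print(results)  # ['dropped', 'ran', 'ran']
```

Dropping rather than waiting suits tasks that are re-enqueued regularly anyway: the skipped work is picked up by a later run.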