Lock validation before writes
One of the biggest problems with coordination in distributed systems is dealing with isolation
from the distributed coordination system.
Isolation can have several causes, and the ideal response differs for each.
Unfortunately, some causes look the same to the application even though they are very different:
- The application is partitioned away from the coordinator.
The coordinator is still running but thinks we are not, so it will release resources
we think we are currently entitled to.
This may lead to mutual exclusion violations for locks, multiple primaries, and so on.
- The coordination service is down.
In this case the application will still be running but will fail to check in with the coordinator
and choose to relinquish control of acquired resources.
This would lead to a spontaneous denial of service: the servers could provide whichever
service they are supposed to, but because they can’t decide which one of them should, nobody does.
- An application server is partitioned together with a coordinator server.
The result here depends mostly on what the coordination system does.
If the coordination system detects it is unable to provide its services to the application
we likely fall back to the case where the application thinks the coordinator is down.
If the coordination system ignores the fact that it is in the minority … you should stop using it!
Replicante chooses consistency over availability: if the coordinator is not responsive, application
processes will assume they have no right to the resources they have and stop working.
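
A minimal sketch of this fail-closed behaviour is shown below. It is not Replicante's actual code: `check_lock`, `CoordinatorUnavailable` and `LockLost` are hypothetical names introduced only for illustration.

```python
class CoordinatorUnavailable(Exception):
    """Raised by the (hypothetical) client when the coordinator is unreachable."""


class LockLost(Exception):
    """Raised when the process can no longer prove it holds the lock."""


def ensure_lock_held(check_lock):
    """Fail closed: any doubt about the lock is treated as having lost it.

    `check_lock` is a hypothetical callable that returns True if the
    coordinator confirms we hold the lock, False if it does not, and raises
    CoordinatorUnavailable if the coordinator cannot be reached.
    """
    try:
        held = check_lock()
    except CoordinatorUnavailable as error:
        # An unresponsive coordinator looks the same as a partition, so
        # assume the lock was released on the other side and stop working.
        raise LockLost("coordinator unreachable") from error
    if not held:
        raise LockLost("coordinator reports the lock is not held")
```

The point of the sketch is that "coordinator down" and "lock not held" end up on the same code path: in both cases the caller stops using the resource.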
The problem is that the application process can’t ensure the lock is held and the coordinator is working before
a write operation is performed.
This is because between the check and the write the lock may be lost for a number of reasons.
At the same time, code complexity grows fast and the performance of both application and coordinator
decreases when the process attempts to ensure it holds the lock very often.
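
The gap between the check and the write can be illustrated with a small simulation. Everything here is an assumption made for the example: `FakeCoordinator` is an in-memory stand-in for a real coordinator, and `pause` models whatever delays the process (a GC pause, a slow disk, a network hiccup).

```python
class FakeCoordinator:
    """In-memory stand-in for a coordinator that tracks one lock owner."""

    def __init__(self):
        self.owner = None

    def acquire(self, process):
        if self.owner is None:
            self.owner = process
            return True
        return False

    def holds(self, process):
        return self.owner == process

    def expire(self):
        """Simulate the coordinator timing out the current owner's session."""
        self.owner = None


def checked_write(coordinator, process, store, value, pause=lambda: None):
    """Check the lock, then write. `pause` models a delay between the two."""
    if not coordinator.holds(process):
        return False
    pause()
    store.append((process, value))
    return True


coordinator = FakeCoordinator()
store = []
coordinator.acquire("a")


def lose_lock_to_b():
    # While "a" is paused its session expires and "b" takes the lock.
    coordinator.expire()
    coordinator.acquire("b")


wrote = checked_write(coordinator, "a", store, 1, pause=lose_lock_to_b)
# At this point "a" has written even though "b" holds the lock.
```

The check passed and the write happened, yet by the time of the write the coordinator considered "b" the owner. No amount of checking before the write closes this gap completely.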
Replicante checks it holds locks before operations are performed and, for long held locks,
periodically in reasonable places (at the beginning and/or end of tasks).
This means the window of opportunity for the lock to be thought of as held by the application
but not by the coordinator is limited but still large.
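
As a rough sketch of checking at task boundaries (not Replicante's real implementation), the hypothetical `run_task` below validates the lock once before and once after a task. `check_lock` is assumed to return True only when the coordinator confirms the lock is held.

```python
class LockLost(Exception):
    """Raised when a check fails to confirm the lock is held."""


def run_task(check_lock, steps):
    """Run `steps` in order, validating the lock at the task boundaries only."""
    if not check_lock():
        raise LockLost("lock not held at the beginning of the task")
    results = [step() for step in steps]
    if not check_lock():
        # Some of the steps above may have run without the lock: report it
        # so the caller can discard or repair the results.
        raise LockLost("lock not held at the end of the task")
    return results
```

With this shape the cost is two checks per task no matter how many steps it has, and the window in which the lock can be lost unnoticed is the full duration of the task.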
Why delay improving?
- This has not caused issues yet.
- There are no definitive solutions, just ways to make it slightly better.
Potential improvement
Replicante could check its locks before every write operation to reduce that window
of opportunity to a write operation alone.
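
A sketch of what this might look like, under the same assumptions as before: `check_lock` and `write` are hypothetical callables, not part of any real Replicante API.

```python
class LockLost(Exception):
    """Raised when a check fails to confirm the lock is held."""


def write_all(check_lock, write, values):
    """Validate the lock immediately before each individual write."""
    for value in values:
        if not check_lock():
            raise LockLost("lock not held before write")
        # The lock can still be lost right here, before `write` completes.
        write(value)
```

This costs one round trip to the coordinator per write instead of a fixed number per task, and the comment inside the loop marks the window that remains open.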
Downsides of the new solution/idea
- Overly complex.
- Repeated checks are expensive.
- Still does not guarantee correctness.