Alpha state disclaimer
The model defined below is in an early development cycle and is subject to (potentially breaking) changes.
Any (collection of) software that fulfils the requirements and expectations of the model defined in this document is considered a datastore.
The difficulty in defining a model is to find the balance between generality and specificity:
The aims of the model are:
The datastore MUST provide the following administration information:
The datastore MAY provide the following administration information:
The datastore MUST support clustering by running a process on one or more (virtual or physical) machines. Each process in the cluster is a node.
Note that there is no requirement for the process to be the same everywhere in the cluster (the same applies to how nodes are configured). This allows the cluster to have heterogeneous components as long as they all follow the model.
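As an illustration of a heterogeneous cluster, the following minimal Python sketch shows two different node implementations satisfying one shared interface. Every name here (`Node`, `node_info`, `NativeNode`, `SidecarNode`, `cluster_info`, and the `kind` field) is hypothetical and not defined by the model; `node_info` only stands in for whatever information the model requires of a node.

```python
from typing import Dict, List, Protocol


class Node(Protocol):
    """Hypothetical interface standing in for the model's node requirements."""

    def node_info(self) -> Dict[str, str]: ...


class NativeNode:
    """Hypothetical node whose own process reports the required information."""

    def __init__(self, name: str) -> None:
        self.name = name

    def node_info(self) -> Dict[str, str]:
        return {"name": self.name, "kind": "native"}


class SidecarNode:
    """Hypothetical node where a helper process reports on its behalf."""

    def __init__(self, name: str) -> None:
        self.name = name

    def node_info(self) -> Dict[str, str]:
        return {"name": self.name, "kind": "sidecar"}


def cluster_info(nodes: List[Node]) -> List[Dict[str, str]]:
    """Collect information from every node, whatever its implementation."""
    return [node.node_info() for node in nodes]
```

The caller of `cluster_info` never needs to know which kind of process backs a node, which is the property the note above describes.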
The datastore MUST organize the data in one or more shards. Shards are independent units of data, each with its own primary and secondary nodes.
All datastores have at least one shard.
For datastores that do not support sharding, the entire dataset can be seen as a single shard.
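The single-shard view can be sketched in Python as follows. This is only an illustration: the `Shard` type, its fields, the `as_single_shard` helper, and the shard ID `"default"` are assumptions made for the example, not part of the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Shard:
    """Hypothetical representation of one independent unit of data."""

    shard_id: str
    primary: Optional[str] = None  # name of the primary node, if there is one
    secondaries: List[str] = field(default_factory=list)


def as_single_shard(primary: Optional[str], secondaries: List[str]) -> List[Shard]:
    """Present a datastore without sharding support as exactly one shard.

    The shard ID "default" is an arbitrary placeholder chosen for this sketch.
    """
    return [Shard(shard_id="default", primary=primary, secondaries=list(secondaries))]
```

With this shape, code that iterates over shards works unchanged for sharded and non-sharded datastores.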
For each shard in the cluster the datastore MUST provide the following information:
Each node in the cluster SHOULD provide the following information for each:
The datastore MUST support a primary/secondaries replication system. This means that each shard at any given time has at most one primary node with zero or more secondary nodes that replicate the data.
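The "at most one primary per shard" rule can be expressed as a simple check. The sketch below assumes a hypothetical input shape (a mapping from shard ID to a mapping of node name to role) and hypothetical role strings `"primary"` and `"secondary"`; the model itself does not prescribe either.

```python
from typing import Dict, List

# Hypothetical role labels used only in this sketch.
PRIMARY = "primary"
SECONDARY = "secondary"


def shards_with_multiple_primaries(roles: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the IDs of shards that currently have more than one primary.

    A shard with zero primaries is allowed, since the rule is "at most one".
    """
    violations = []
    for shard_id, node_roles in roles.items():
        primaries = [node for node, role in node_roles.items() if role == PRIMARY]
        if len(primaries) > 1:
            violations.append(shard_id)
    return violations
```

A shard with no primary at all passes this check, which matches the wording "at most one primary node".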
Each node in the cluster MUST provide the following information:
Some details about replication require the cluster to be healthy enough to report such data. Such details may also be expensive to compute or, worse, require connections to non-local nodes.
This information should be provided whenever possible as long as:
Each node in the cluster SHOULD provide the following information: