distributed-systems

DottedDB: Anti-Entropy without Merkle Trees, Deletes without Tombstones

Initial

  • Dynamo like key-value store
  • Uses node-wide logical clock framework
  • Overcomes the following fundamental limitations
    • Minimize the metadata per key necessary to track causality
    • Correctly and durably delete keys, without tombstones
    • offer a lightweight anti-entropy mechanism to converge replicated data without merkle trees.
  • Nodes that share replicas of some object are called peer nodes.

Anti-Entropy Issues with merkle trees

  • merkle trees are commonly used but there is a huge performance impact in the following ways:
    • If the merge tree is too large (or too deep), network payloads can be huge
    • Smaller merkle trees incur the cost more false positives (transferring data which does not have to repaired along with data that needs repair)
  • Some systems have anti-entropy is turned off by default, with convergence limited to read-repair: Relevant replicas are updated when nodes detect inconsistencies while serving a client read.

Node-wide Dot-based Clocks framework (NDC)

Unknowns

  • version vectors in dynamo db for conflicts
  • Cassandra wall clock timestamps
  • Riak dotted version vectors → Large number of concurrent writes
  • Per-object causal consistency vs whole system cross-object causal consistency
  • Casual consistency
  • Version vectors, dotted version vectors