my solution

functional requirements

  • A user can retrieve the value for a key if they have read credentials for it.
  • A user can save or update the value of a key if they have write credentials for it.

non-functional requirements

  • Value retrieval with a key should be low latency.
  • Consistency guarantees.
    • This depends on the use case: for sensitive/high-stakes use cases, we’d probably want to guarantee some form of strong consistency, e.g. using 2PC (two-phase commit).
    • For cases where many users are writing keys to the key-value store and most users only care about a few keys, simple causal consistency guarantees like read-your-writes would probably be sufficient (see the sketch after this list).
  • We can likely assume that we’ll have many more reads than writes, as the typical use-case for key-value stores is to store values and reference them later with low latency/without recomputing.
    • We can confirm with the interviewer that this is an acceptable assumption, because there are definitely workloads where it isn’t the case.
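
As a rough illustration of the read-your-writes idea above, here is a minimal sketch assuming single-leader replication, where each write returns a monotonically increasing offset from the leader and replicas report how far they have applied the replication log. All class and method names, and the offset mechanism itself, are assumptions for illustration only.

```python
# Minimal read-your-writes sketch: a session only reads from a replica that
# has already applied that session's latest write; otherwise it falls back to
# the leader. Names and the offset mechanism are hypothetical.

class Leader:
    def __init__(self):
        self.store, self.offset = {}, 0

    def write(self, key, value):
        self.offset += 1                       # assign a monotonically increasing offset
        self.store[key] = (value, self.offset)
        return self.offset

    def read(self, key):
        return self.store[key][0]

class Replica:
    def __init__(self):
        self.store, self.applied = {}, 0

    def apply(self, key, value, offset):       # called by (async) replication
        self.store[key], self.applied = value, offset

    def read(self, key):
        return self.store[key]

class Session:
    """Routes one client's reads so they always see that client's own writes."""
    def __init__(self, leader, replicas):
        self.leader, self.replicas = leader, replicas
        self.last_write_offset = 0

    def put(self, key, value):
        self.last_write_offset = self.leader.write(key, value)

    def get(self, key):
        # Prefer a replica that has applied our latest write; otherwise fall
        # back to the leader so we never miss our own writes.
        for replica in self.replicas:
            if replica.applied >= self.last_write_offset:
                return replica.read(key)
        return self.leader.read(key)

leader, replica = Leader(), Replica()
session = Session(leader, [replica])
session.put("profile:42", "Ada")     # the replica hasn't applied this yet
print(session.get("profile:42"))     # -> "Ada", served from the leader
```

The point of the sketch is only the routing rule; it says nothing about stronger guarantees across different clients’ sessions.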

api design

  • create_or_update(key, value)
  • get_value(key)
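
A minimal in-memory stand-in for this API surface, just to pin down the shape of the two operations. The credential parameter and the ACL helper methods are assumptions for illustration, not part of the agreed design.

```python
# In-memory stand-in for the API above; credential checks are stubbed
# placeholders, and everything other than the two operation names is assumed.

class KeyValueStore:
    def __init__(self):
        self._data = {}

    def create_or_update(self, key, value, credentials=None):
        if not self._can_write(key, credentials):
            raise PermissionError(f"no write access to {key!r}")
        self._data[key] = value

    def get_value(self, key, credentials=None):
        if not self._can_read(key, credentials):
            raise PermissionError(f"no read access to {key!r}")
        return self._data.get(key)

    def _can_write(self, key, credentials):
        return True  # placeholder for a real ACL check

    def _can_read(self, key, credentials):
        return True  # placeholder for a real ACL check

store = KeyValueStore()
store.create_or_update("config:theme", "dark")
print(store.get_value("config:theme"))  # -> "dark"
```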

design specifics

  • How do we avoid conflicts from concurrent writes to the same key?
    • Using a single-leader model for each partition, we can use a simple LWW (last-write-wins) heuristic: the leader serializes writes to the same key, so concurrent requests are resolved by keeping the most recent one.
      • Because there is a single leader per partition, we can maintain a total ordering of requests within each partition without sacrificing latency on something like two-phase commit, which we would need if we had multiple leaders.
  • How do we construct keys? Do we just use the ones that the user provides?1
    • I think this is fine? We can partition the keys by alphabetical grouping.
  • Partitioning details:
    • If keys are not distributed evenly by alphabetical order, we could consider hashing to spread keys evenly across partitions (we could also consider consistent hashing if we need to frequently change the number of active partitions; see the sketch below).
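
Here is a rough sketch of the consistent-hashing option. The virtual-node count, hash function, and partition names are assumptions for illustration only.

```python
import bisect
import hashlib

# Rough sketch of a consistent-hash ring for assigning keys to partitions.
# Each partition owns several virtual nodes so keys spread out more evenly.

class ConsistentHashRing:
    def __init__(self, partitions, vnodes=64):
        self._ring = []  # sorted list of (hash, partition)
        for partition in partitions:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{partition}#{i}"), partition))
        self._ring.sort()

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def partition_for(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]

ring = ConsistentHashRing(["partition-a", "partition-b", "partition-c"])
print(ring.partition_for("user:1234"))
```

The benefit over plain modulo hashing is that adding or removing a partition only remaps the keys on the neighbouring arcs of the ring, rather than reshuffling almost every key.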

positive feedback

  • Single leader replication with partitioning was good for a more consistent approach.
  • Consistent hashing was a good shout.

critique

meta

  • Should have clarified first whether we want to optimize for availability or consistency. These two goals lead to different designs.

    • An availability-focused approach likely uses multi-leader replication or leaderless replication, giving us higher write and read availability at the expense of consistency.
    • A consistency-focused approach would use something less available, like single-leader replication, or slower algorithms like 2PC or quorum reads/writes, to achieve stronger consistency (see the quorum sketch after this list).
  • Could have gone into more detail about low-level components and the data flow for PUT and GET, especially things like:

    • Specifics about single-leader replication.
    • Dealing with hot nodes, and how to split up existing partitions.
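
For the quorum option mentioned above, here is a tiny sketch of the usual overlap condition (R + W > N). The replica layout, versioning scheme, and values are assumptions for illustration only.

```python
# Quorum reads/writes over N replicas: with W write acks and R read responses,
# choosing R + W > N guarantees the read set overlaps the write set.

N, W, R = 3, 2, 2
assert R + W > N, "read and write quorums must overlap"

replicas = [{} for _ in range(N)]  # each replica maps key -> (version, value)

def quorum_write(key, value, version):
    acks = 0
    for replica in replicas:
        replica[key] = (version, value)  # a real system would tolerate failed replicas
        acks += 1
        if acks >= W:                    # stop once W replicas have acknowledged
            return True
    return False

def quorum_read(key):
    # Read from any R replicas (here, the last R); because the quorums overlap,
    # at least one response holds the latest version.
    responses = [replica.get(key, (0, None)) for replica in replicas[-R:]]
    return max(responses, key=lambda r: r[0])[1]

quorum_write("config:theme", "dark", version=1)
print(quorum_read("config:theme"))  # -> "dark"
```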

api design

  • Didn’t consider breaking up the key-value store by table.
  • Didn’t consider other operations like deleting a key, or retrieving a list of keys.

design

  • Instead of using LWW, we could utilize a unique monotonically increasing ID to handle conflicts/versioning (see the sketch after this list).
  • Missing some sort of high-level service/coordinator that manages partitioning groups, as well as leader assignment/failover.
    • Heartbeats with the distributed manager to ensure that leaders are still up.
    • Can be aware of hot spots and potentially split partitions into smaller partitions. (Consistent hashing would be useful here).
    • e.g. ZooKeeper or Redis (coordination/consensus services).
  • Didn’t talk about backups for fault tolerance.
  • Could’ve discussed trade-offs for database technology decisions (B-tree vs. LSM-tree with an SSTable index).
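
A sketch of the monotonic-ID idea from the critique: the leader stamps every accepted write with an increasing version, and a writer can pass the version it last read so a stale update is detected rather than silently lost. The conditional-write behaviour and all names here are assumptions.

```python
import itertools

class ConflictError(Exception):
    pass

class VersionedStore:
    def __init__(self):
        self._data = {}                  # key -> (version, value)
        self._ids = itertools.count(1)   # monotonically increasing IDs

    def get(self, key):
        return self._data.get(key, (0, None))   # (version, value)

    def put(self, key, value, expected_version=None):
        current_version, _ = self.get(key)
        if expected_version is not None and expected_version != current_version:
            raise ConflictError(f"{key!r} changed since version {expected_version}")
        new_version = next(self._ids)
        self._data[key] = (new_version, value)
        return new_version

store = VersionedStore()
store.put("profile:42", {"name": "Ada"})   # accepted, becomes version 1
try:
    # A concurrent writer holding a stale version (0) gets a conflict instead
    # of silently overwriting the first write, as plain LWW would.
    store.put("profile:42", {"name": "Bob"}, expected_version=0)
except ConflictError as err:
    print(err)
```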

high consistency design

  • We can utilize both synchronous and asynchronous replication within a cluster, directing queries that require high consistency to the synchronously replicated nodes and queries that prioritize low latency to the asynchronously replicated ones.
  • We can have a maximum replication-lag cut-off for followers, and stop reading from a follower that falls too far behind until it catches up (sketched below).
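
A minimal sketch of the lag cut-off rule above; the threshold, lag numbers, and node representation are made up for illustration.

```python
# Only serve reads from followers whose replication lag is under a threshold;
# otherwise fall back to the leader until they catch up.

MAX_LAG_SECONDS = 5.0

def pick_read_node(leader, followers):
    """Return a fresh-enough follower, or the leader as a fallback."""
    fresh = [f for f in followers if f["lag_seconds"] <= MAX_LAG_SECONDS]
    # Here we just take the least-lagged follower; a real router would also
    # spread load across all fresh followers.
    return min(fresh, key=lambda f: f["lag_seconds"]) if fresh else leader

leader = {"name": "leader", "lag_seconds": 0.0}
followers = [
    {"name": "follower-1", "lag_seconds": 1.2},
    {"name": "follower-2", "lag_seconds": 42.0},   # too far behind, skipped
]
print(pick_read_node(leader, followers)["name"])   # -> "follower-1"
```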

follow-ups

  • Research the Raft consensus protocol for replication.


Footnotes

  1. I would ask the interviewer what the expected use case for this application/key-value store is. If it is for a single system to use, we could continue with allowing the application to figure out how to keep its own keys from conflicting. However, if it’s for users anywhere to store a value in a global key-value store, we could use some unique user identifier appended to the user’s key. We could then partition by user locations or IP addresses. ↩