my solution
functional requirements
- Mitigate DDoS attacks and other abusive traffic.
- Given a request, determine if the specific request should be throttled, and either let it through or deny it.
non-functional requirements
- Available.
- If the rate limiter itself is unavailable, we fail open and let all requests through, meaning we’re temporarily exposed to DDoS attacks and other abuse.
- Low latency.
- Because this sits in front of every service, it must add minimal latency so it doesn’t slow down operations and user requests.
- Scalable.
- Should scale along with the rest of the application as request volume increases.
api design
shouldAllowRequest(request)
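A minimal Python sketch of this single call (the `Request` shape and its field names are illustrative assumptions, not part of the original design):

```python
from dataclasses import dataclass

@dataclass
class Request:
    user_id: str   # or client IP; whichever identifier we rate limit on
    endpoint: str  # optionally allows per-endpoint rules

def should_allow_request(request: Request) -> bool:
    """Return True to let the request through, False to throttle it."""
    raise NotImplementedError  # body depends on the chosen algorithm
```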
high level design
- how do we identify users?
- a unique identifier such as an IP address or user ID.
- what logic do we actually use to rate limit?
- tokens in a bucket.
- have tokens regenerate for each user per time unit (up to a maximum number of tokens).
- each time a request comes in, use a token if it’s available.
- drop the request if there are no tokens currently available.
- fixed window: maximum number of requests per time window.
- can be bad if many requests arrive at the start of the window, because no further requests are allowed for the rest of it; bursts straddling a window boundary can also briefly let through up to double the limit.
- components.
- rule engine to specify tunable parameters.
- high-throughput cache to store each user’s request counts or token state.
- logging service.
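The token-bucket logic above, wired to the `shouldAllowRequest` API, might look like this sketch (the capacity and refill rate are illustrative assumptions, and a local dict stands in for the high-throughput cache):

```python
import time

class TokenBucket:
    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # max tokens a user can accumulate
        self.refill_rate = refill_rate  # tokens regenerated per second
        self.tokens = capacity          # start with a full bucket
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Regenerate tokens for elapsed time, capped at the maximum.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # use a token for this request
            return True
        return False          # no tokens available: drop the request

buckets = {}  # one bucket per user; in-memory stand-in for the cache

def should_allow_request(user_id: str) -> bool:
    bucket = buckets.setdefault(user_id, TokenBucket(capacity=5, refill_rate=1.0))
    return bucket.allow()
```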
scaling
- Considerations:
- How to share rate limiting state between distributed rate limiting nodes?
- Nodes could talk to each other.
- Host discovery. When new nodes are added or removed, how do the other nodes know about it? Could use gossip protocol.
- Centralized master node that facilitates coordination.
- Distributed cache that all the nodes read/write from.
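The distributed-cache option can be sketched with a fixed-window counter: every node derives the same `(user, window)` key and increments it in the shared store, so the count is consistent across nodes. Here a locked in-process dict stands in for the cache (a real deployment might use an atomic server-side increment such as Redis `INCR`); the limit and window size are illustrative:

```python
import threading
import time

class SharedCounterStore:
    """In-process stand-in for a distributed cache that all
    rate-limiting nodes read/write instead of keeping local state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = {}

    def incr(self, key: str) -> int:
        # Atomic increment; a real cache provides this server-side.
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]

store = SharedCounterStore()

def should_allow(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    # Fixed-window counter keyed by (user, current window); every node
    # computes the same key, so state is shared via the store.
    window = int(time.time() // window_s)
    return store.incr(f"{user_id}:{window}") <= limit
```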
positive feedback
critique
references