Rate Limiting - Developer Roadmap

To implement rate limiting, you have to keep the following points in mind:

Define your limits. Define exactly the amount of requests a client can make. This can be measured in requests per minute, per day, or per second.
Choose a limiting strategy. Pick a rate-limiting algorithm, like the fixed window counter, sliding log window, token bucket, or leaky bucket. You can read more about these algorithms here.
Store your counters somewhere. Use a fast data store (like Redis) to keep track of the number of requests or timestamps for each client.
Once the limit is reached, try to respond with a standard status code, such as 429 which indicates there have been “Too Many Requests”.

If you want to take this further, you can look into using an existing API Gateway that already provides this functionality or look into adding support for sudden bursts of traffic to avoid penalizing clients that are slightly above the limits every once in a while.