How to Rate Limit HTTP Requests
If you're running an HTTP server and want to rate limit user requests, the go-to package to use is probably Tollbooth by Didip Kerabat. It's well maintained, has a good range of features, and a clean and clear API.
But if you want something simple and lightweight – or just want to learn – it's not too difficult to roll your own middleware to handle rate limiting. In this post I'll run through the essentials of how to do that by using the
x/time/rate package, which provides a token bucket rate-limiter algorithm (note: this is also used by Tollbooth behind the scenes).
If you would like to follow along, create a demo directory containing two files, limit.go and main.go, and initialize a new Go module. Like so:
Let's start by making a global rate limiter which acts on all the requests that an HTTP server receives.
Open up the limit.go file and add the following code:
In this code we've used the rate.NewLimiter() function to initialize and return a new rate limiter. Its signature looks like this:
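```go
func NewLimiter(r Limit, b int) *Limiter
```

Here r is the refill rate in tokens per second, and b is the maximum bucket (burst) size.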
From the documentation:
A Limiter controls how frequently events are allowed to happen. It implements a "token bucket" of size b, initially full and refilled at rate r tokens per second.
Or to describe it another way – the limiter permits you to consume an average of r tokens per second, with a maximum of b tokens in any single 'burst'. So in the code above our limiter allows 1 token to be consumed per second, with a maximum burst size of 3.
In the limit middleware function we call the global limiter's Allow() method each time the middleware receives an HTTP request. If there are no tokens left in the bucket, Allow() will return false and we send the user a 429 Too Many Requests response. Otherwise, calling Allow() will consume exactly one token from the bucket and we pass control to the next handler in the chain.
It's important to note that the code behind the Allow() method is protected by a mutex and is safe for concurrent use.
Let's put this to use. Open up the main.go file and set up a simple web server which uses the limit middleware like so:
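For example (a sketch: this relies on the limit middleware defined in limit.go above, and the :4000 address is an arbitrary choice):

```go
package main

import (
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", okHandler)

	log.Println("Listening on :4000...")

	// Wrap the servemux with the limit middleware from limit.go.
	log.Fatal(http.ListenAndServe(":4000", limit(mux)))
}

func okHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("OK"))
}
```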
Go ahead and run the application…
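```shell
go run .
```

(On Go versions without module support you may need go run *.go instead.)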
And if you make enough requests in quick succession, you should eventually get a response which looks like this:
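Something like this (the exact headers will vary):

```
$ curl -i http://localhost:4000
HTTP/1.1 429 Too Many Requests
Content-Type: text/plain; charset=utf-8
X-Content-Type-Options: nosniff

Too Many Requests
```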
Rate limiting per user
While having a single, global rate limiter is useful in some cases, another common scenario is to implement a rate limiter per user, based on an identifier like an IP address or API key. In this post we'll use IP address as the identifier.
A conceptually straightforward way to do this is to create a map of rate limiters, using the identifier for each user as the map key.
At this point you might think to reach for the sync.Map type that was introduced in Go 1.9. This essentially provides a concurrency-safe map, designed to be accessed from multiple goroutines without the risk of race conditions. But it comes with a note of caution:
It is optimized for use in concurrent loops with keys that are stable over time, and either few steady-state stores, or stores localized to one goroutine per key.
For use cases that do not share these attributes, it will likely have comparable or worse performance and worse type safety than an ordinary map paired with a read-write mutex.
In our particular use-case the map keys will be the IP address of users, and so new keys will be added to the map each time a new user visits our application. We'll also want to prevent undue memory consumption by removing old entries from the map when a user hasn't been seen for a long period of time.
So in our case the map keys won't be stable and it's likely that an ordinary map protected by a mutex will perform better. (If you're not familiar with the idea of mutexes or how to use them in Go, then this post has an explanation which you might want to read before continuing).
Let's update the limit.go file to contain a basic implementation. I'll keep the code structure deliberately simple.
Removing old entries from the map
There's one problem with this: as long as the application is running, the visitors map will continue to grow unbounded.
We can fix this fairly simply by recording the last seen time for each visitor and running a background goroutine to delete old entries from the map (and therefore free up memory as we go).
Some more improvements…
For simple applications this code will work fine as-is, but you may want to adapt it further depending on your needs. For example, it might make sense to:
- Check the X-Real-IP header for the IP address, if you are running your server behind a reverse proxy.
- Port the code to a standalone package.
- Make the rate limiter and cleanup settings configurable at runtime.
- Remove the reliance on global variables, so that different rate limiters can be created with different settings.
- Switch to a sync.RWMutex to help reduce contention on the map.