Configuring sql.DB for Better Performance
There are a lot of good tutorials which talk about Go's
sql.DB type and how to use it to execute SQL database queries and statements. But most of them gloss over the
SetConnMaxLifetime() methods — which you can use to configure the behavior of
sql.DB and alter its performance.
In this post I'd like to explain exactly what these settings do and demonstrate the (positive and negative) impact that they can have.
Open and idle connections
I'll begin with a little background.
sql.DB object is a pool of many database connections which contains both 'open' and 'idle' connections. A connection is marked open when you are using it to perform a database task, such as executing a SQL statement or querying rows. When the task is complete the connection becomes idle.
When you instruct
sql.DB to perform a database task, it will first check if any idle connections are already available in the pool. If one is available then Go will reuse the existing connection and mark it as open for the duration of the task. If there are no idle connections in the pool when you need one then Go will create a new additional connection and 'open' it.
The SetMaxOpenConns method
By default there's no limit on the number of connections that can be open at the same time. But you can implement your own limit via the
SetMaxOpenConns() method like so:
In this example code the pool now has a maximum limit of 5 concurrently open connections. If 5 connections are already open and another new connection is needed, then the application will be forced to wait until one of the 5 open connections is freed up and becomes idle.
To illustrate the impact of changing
MaxOpenConns I ran a benchmark test with the maximum open connections set to 1, 2, 5, 10 and unlimited. The benchmark executes parallel
INSERT statements on a PostgreSQL database and you can find the code in this gist. Here's the results:
For this benchmark we can see that the more open connections that are allowed, the less time is taken to perform the
INSERT on the database (3129633 ns/op with 1 open connection compared to 531030 ns/op for unlimited connections — about 6 times quicker). This is because the more open connections that there are, the less time (on average) the benchmark code needs to wait for an open connection to be freed and made idle (ready for use) again.
The SetMaxIdleConns method
sql.DB allows a maximum of 2 idle connections to be retained in the connection pool. You can change this via the
SetMaxIdleConns() method like so:
In theory, allowing a higher number of idle connections in the pool will improve performance because it makes it less likely that a new connection will need to be established from scratch — therefore helping to save resources.
Lets take a look at the same benchmark with the maximum idle connections is set to none, 1, 2, 5 and 10 (and the number of open connections is unlimited):
MaxIdleConns is set to none, a new connection has to be created from scratch for each
INSERT and we can see from the benchmarks that the average runtime and memory usage is comparatively high.
Allowing just 1 idle connection to be retained and reused makes a massive difference to this particular benchmark — it cuts the average runtime by about 8 times and reduces memory usage by about 20 times. Going on to increase the size of the idle connection pool makes the performance even better, although the improvements are less pronounced.
So we should you maintain a large idle connection pool? The answer is it depends on the application.
It's important to realise that keeping an idle connection alive comes at a cost — it takes up memory which can otherwise be used for both your application and the database.
It's also possible that if a connection is idle for too long then it may become unusable. For example, MySQL's
wait_timeout setting will automatically close any connections that haven't been used for 8 hours (by default).
When this happens
sql.DB handles it gracefully. Bad connections will automatically be retried twice before giving up, at which point Go will remove the connection from the pool and create a new one. So setting
MaxIdleConns too high may actually result in connections becoming unusable and more resources being used than if you had a smaller idle connection pool (with fewer connections that are used more frequently). So really you only want to keep a connection idle if you're likely to be using it again soon.
One last thing to point out is that
MaxIdleConns should always be less than or equal to
MaxOpenConns. Go enforces this and will automatically reduce
MaxIdleConns if necessary. This Stack Overflow post nicely describes why:
There is no point in ever having any more idle connections than the maximum allowed open connections, because if you could instantaneously grab all the allowed open connections, the remain idle connections would always remain idle. It's like having a bridge with four lanes, but only ever allowing three vehicles to drive across it at once.
The SetConnMaxLifetime method
Let's now take a look at the
SetConnMaxLifetime() method which sets the maximum length of time that a connection can be reused for. This can be useful if your SQL database also implements a maximum connection lifetime or if — for example — you want to facilitate gracefully swapping databases behind a load balancer.
You use it like this:
In this example all our connections will 'expire' 1 hour after they were first created, and cannot be reused after they've expired. But note:
- This doesn't guarantee that a connection will exist in the pool for a whole hour; it's quite possible that the connection will have become unusable for some reason and been automatically closed before then.
- A connection can still be in use more than one hour after being created — it just cannot start to be reused after that time.
- This isn't an idle timeout. The connection will expire 1 hour after it was first created — not 1 hour after it last became idle.
- Once every second a cleanup operation is automatically run to remove 'expired' connections from the pool.
In theory, the shorter
ConnMaxLifetime is the more often connections will expire — and consequently — the more often they will need to be created from scratch.
To illustrate this I ran the benchmarks with
ConnMaxLifetime set to 100ms, 200ms, 500ms, 1000ms and unlimited (reused forever), with the default settings of unlimited open connections and 2 idle connections. These time periods are obviously much, much shorter than you'd use in most applications but they help illustrate the behaviour well.
In these particular benchmarks we can see that memory usage was more than 3 times greater with a 100ms lifetime compared to an unlimited lifetime, and the average runtime for each
INSERT was also slightly longer.
Exceeding connection limits
Lastly, this article wouldn't be complete without mentioning what happens if you exceed a hard limit on the number of database connections.
As an illustration, I'll change my
postgresql.conf file so only a total of 5 connections are permitted (the default is 100)...
And then rerun the benchmark test with unlimited open connections...
As soon as the hard limit of 5 connections is hit my database driver (pq) immediately returns a
sorry, too many clients already error message instead of completing the
To prevent this error we need to set the total maximum of open and idle connections in
sql.DB to comfortably below 5. Like so:
Now there will only ever be a maximum of 3 connections created by
sql.DB and the benchmark should run without any errors.
But doing this comes with a big caveat: when the open connection limit is reached, any new database tasks that your application needs to execute will be forced to wait until a connection becomes free.
For some applications that behavior might be fine, but for others it might not. For example, in a web application it could arguably be better to immediately log the error message and send a
500 Internal Server Error to the user, rather than having their HTTP request hang and potentially timeout while waiting for a free connection.
For most applications using
SetMaxOpenConns()to limit the maximum of open connections will have a negative impact on performance, but it may make sense to do so if your database is operating in a particularly resource-constrained environment.
If your application is bursty or regularly executing more than 2 database tasks concurrently then there will probably be a positive performance impact from increasing the size of the idle connection pool via
SetMaxIdleConns(). But be aware that making it too large may be counter-productive. Don't go crazy with it, and benchmarking real-world load is probably a good idea.
For most applications implementing a connection lifetime limit via
SetConnMaxLifetime()will have a negative impact on performance. But if your database itself enforces a short connection lifetime then it makes sense to match this in
sql.DBto avoid the overhead of trying-and-retrying bad connections.
If you prefer for your application to wait for a connection to be freed up (rather than return an error) when your database's hard limit of connections is reached then you should explicitly set both