Introducing Postgres Hibernator
As must have been obvious from my last post, I wasn’t really pleased with the amount of work needed to implement hibernation of Postgres shared-buffers, so I set out to implement a seamless Postgres hibernation solution.
A couple of hours ago I published the Postgres/EDB extension I had been working on in my spare time for the last 10 days or so. The following are the contents of the README file from the extension.
Postgres Hibernator
This Postgres extension is a set-it-and-forget-it solution to save and restore the Postgres shared-buffers contents, across Postgres server restarts.
For some details on the internals of this extension, also see the proposal email to Postgres hackers’ mailing list.
Why
When a database server is shut down, for any reason (say, to apply patches, for scheduled maintenance, etc.), the active data-set that is cached in memory by the database server is lost. Upon starting up the server again, the database server’s cache is empty, and hence almost all application queries respond slowly because the server has to fetch the relevant data from the disk. It takes quite a while for the server to bring the cache back to a state similar to the one before the shutdown.
The period during which the server is rebuilding its cache and working toward its optimal cache performance is called the ramp-up time.
This extension is aimed at reducing the ramp-up time of Postgres servers.
How
Compile and install the extension (of course, you’d need a Postgres installation or its source code):
$ make -C pg_hibernate/ install
Then:
- Add pg_hibernate to the shared_preload_libraries variable in the postgresql.conf file.
- Restart the Postgres server.
- You are done.
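The configuration step can also be scripted. The following is a minimal sketch, assuming that postgresql.conf lives in $PGDATA and does not already set shared_preload_libraries. The helper name enable_pg_hibernate is hypothetical and is not part of the extension.

```shell
# Hypothetical helper (not shipped with the extension): append the
# shared_preload_libraries setting to the given postgresql.conf.
# It refuses to touch a file that already sets the variable, because the
# existing library list would then have to be merged by hand.
enable_pg_hibernate() {
    conf="$1"
    if grep -Eq "^[[:space:]]*shared_preload_libraries[[:space:]]*=" "$conf"; then
        echo "shared_preload_libraries already set in $conf; edit it by hand" >&2
        return 1
    fi
    # Leading newline guards against a config file with no trailing newline.
    printf "\nshared_preload_libraries = 'pg_hibernate'\n" >> "$conf"
}

# Typical use, assuming $PGDATA points at the data directory:
#   enable_pg_hibernate "$PGDATA/postgresql.conf" && pg_ctl -D "$PGDATA" restart
```

If the variable already lists other libraries, add pg_hibernate to that comma-separated list by hand instead; the sketch deliberately does not try to merge values.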
How it works
This extension uses the Background Worker infrastructure of Postgres, which was introduced in Postgres 9.3. When the server starts, this extension registers background workers: one for saving the buffers (called Buffer Saver) when the server shuts down, and one for each database in the cluster (called Block Readers) for restoring the buffers saved during the previous shutdown.
When the Postgres server is being stopped or shut down, the Buffer Saver scans the shared-buffers of Postgres and stores the unique block identifiers of each cached block to the disk (with some optimizations). This information is saved under the $PGDATA/pg_hibernate/ directory. For each database whose blocks are resident in shared buffers, one file is created; for example: $PGDATA/pg_hibernate/2.postgres.save.
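To check which databases have a save-file after a shutdown, the directory can simply be listed. The sketch below assumes that save-files are named <number>.<database>.save, which is inferred only from the single example file name given here; the function name list_saved_databases is hypothetical.

```shell
# Hypothetical helper: print the database name embedded in each save-file
# found in the given directory, one name per line.
# ASSUMPTION: files are named "<number>.<database>.save".
list_saved_databases() {
    dir="$1"
    for f in "$dir"/*.save; do
        [ -e "$f" ] || continue   # glob matched nothing
        name=${f##*/}             # strip the directory part
        name=${name%.save}        # strip the ".save" suffix
        name=${name#*.}           # strip the leading "<number>."
        printf '%s\n' "$name"
    done
}

# Typical use, assuming $PGDATA points at the data directory:
#   list_saved_databases "$PGDATA/pg_hibernate"
```

An empty result after a clean shutdown would suggest that the extension was not loaded, or that no save-files were written.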
During the next startup sequence, the Block Reader processes are registered, one for each file present under the $PGDATA/pg_hibernate/ directory. When the Postgres server has reached a stable state (that is, it’s ready for database connections), these Block Reader processes are launched. Each Block Reader process reads its save-file, looking for block-ids to restore. It then connects to the respective database and requests Postgres to fetch the blocks into shared-buffers.
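One way to observe the effect of the Block Readers is to count the cached buffers per database before a shutdown and again shortly after the restart. This is a sketch that assumes the pg_buffercache contrib extension is installed and that psql can connect with its default settings. It is not part of pg_hibernate, and the function name and the PSQL override variable are hypothetical.

```shell
# Hypothetical helper: print how many shared buffers currently hold blocks
# of each database, using the pg_buffercache contrib view.
# Set PSQL to override the client command (default: psql).
buffers_per_database() {
    ${PSQL:-psql} -X -A -t -c "
        SELECT d.datname, count(*)
        FROM pg_buffercache b
        JOIN pg_database d ON d.oid = b.reldatabase
        GROUP BY d.datname
        ORDER BY count(*) DESC;"
}

# Run once before shutdown and once after restart, then compare:
#   psql -c 'CREATE EXTENSION IF NOT EXISTS pg_buffercache'
#   buffers_per_database
```

Similar counts before and after the restart would indicate that the saved blocks were fetched back into shared-buffers.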
Caveats
- It saves the buffer information only when the Postgres server is shut down in normal mode.
- It doesn’t save/restore the filesystem/kernel’s disk cache.