PostgreSQL: improving pg_dump, pg_restore performance

When I began, I used pg_dump with the default plain format. I was unenlightened.

Research revealed to me time and file size improvements with pg_dump -Fc | gzip -9 -c > dumpfile.gz. I was enlightened.

When it came time to create the database anew,

# create tablespace dbname location '/SAN/dbname';
# create database dbname tablespace dbname;
# alter database dbname set temp_tablespaces = dbname;

% gunzip dumpfile.gz              # to evaluate restore time without a piped uncompression
% pg_restore -d dbname dumpfile   # into a new, empty database defined above

I felt unenlightened: the restore took 12 hours to create the database that’s only a fraction of what it will become:

# select pg_size_pretty(pg_database_size('dbname'));
47 GB

Because there are predictions this database will be a few terabytes, I need to look at improving performance now.

Please, enlighten me.

Answer

First check that you are getting reasonable IO performance from your disk setup. Then check that you PostgreSQL installation is appropriately tuned. In particular shared_buffers should be set correctly, maintenance_work_mem should be increased during the restore, full_page_writes should be off during the restore, wal_buffers should be increased to 16MB during the restore, checkpoint_segments should be increased to something like 16 during the restore, you shouldn’t have any unreasonable logging on (like logging every statement executed), auto_vacuum should be disabled during the restore.

If you are on 8.4 also experiment with parallel restore, the –jobs option for pg_restore.

Attribution
Source : Link , Question Author : Joe Creighton , Answer Author : Ants Aasma

Leave a Comment