We recently upgraded our network to 10 Gbit and were really hoping to see monumental speed increases in our ceph cluster. One of our benchmarks was pgbench, and to say we were sad would be an understatement…
I created a database and ran tests like so…
createdb -U postgres bench
pgbench -U postgres -i -d bench
pgbench -U postgres -d bench -c 70 2> /dev/null
And here's a typical run. Latency varied a lot but never less than 300ms.
latency average = 951.736 ms
tps = 73.549842 (including connections establishing)
Yeah, so a four-node ceph cluster with 12 OSDs was getting 73 TPS with nearly a second of latency. Did somebody swap out my drives with a floppy disk?!?
This was so horrifically bad it put an entire 5-year forecast of our tech stack in jeopardy…
Now I don't claim to be a ceph or postgres expert but here's what I tried.
# fsync = off
latency average = 50.641 ms
tps = 1382.270643 (including connections establishing)
So that's groovy and all but it's totally unsafe and not appropriate for production. The good news was that this proved that the ceph cluster itself wasn't utterly broken.
So then I noticed the next line in the config file:
# synchronous_commit = on # synchronization level;
So I tried turning that off and…
latency average = 48.036 ms
tps = 1457.230219 (including connections establishing)
Basically equivalent speeds to fsync = off. Pretty strange when there is no replication…
I also tried with synchronous_commit = local and got
latency average = 225.714 ms
tps = 310.127180 (including connections establishing)
Which is great compared to the original results but still dismal.
For our use case, maybe losing a couple of transactions is worth a 10x speed improvement.
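Worth noting: synchronous_commit doesn't have to be flipped globally. It can be relaxed per session or even per transaction, so only the loss-tolerant work gives up durable commits. A sketch (pgbench_history is from pgbench's own schema):

```
BEGIN;
SET LOCAL synchronous_commit = off;  -- async commit for this transaction only
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
  VALUES (1, 1, 1, 0, now());
COMMIT;
```

Everything outside that transaction still waits for the WAL flush as usual.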
The really strange bit is that we aren't running replication on our test cluster, so synchronous_commit shouldn't affect anything as per my understanding. But as I said, I'm not a postgresql expert. All this really raises the question: wtf is ceph doing that causes fsyncs to be so slow? Too bad I'm not a ceph expert either…
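Not being an expert doesn't stop us from measuring, though. A first-order check is to time synchronous writes on the ceph-backed volume directly; a minimal sketch using GNU dd, where oflag=dsync forces every 8 kB block to stable storage, similar in spirit to a WAL flush (the target path is an assumption, point it at the mount backing your data directory):

```shell
# Time 100 x 8 kB synchronous writes and print dd's summary line
# (elapsed time and throughput), then clean up the probe file.
sync_probe() {
  dir="$1"
  dd if=/dev/zero of="$dir/syncprobe" bs=8k count=100 oflag=dsync 2>&1 | tail -1
  rm -f "$dir/syncprobe"
}
# Example (hypothetical mount point, adjust to your setup):
# sync_probe /var/lib/postgresql
```

Comparing the result against the same probe on a local disk shows how much of the commit latency is ceph's round trips rather than postgres itself.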
Here's a good page explaining synchronous_commit in depth.
tl;dr: set synchronous_commit = off.
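If you do go that route, it's a one-line change in postgresql.conf, picked up with a reload (pg_ctl reload or SELECT pg_reload_conf()), no restart needed:

```
# postgresql.conf
synchronous_commit = off    # commits return before the WAL flush;
                            # a crash can lose the last few commits,
                            # but (unlike fsync = off) cannot corrupt
                            # the database
```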