We recently upgraded our network to 10 Gbit and were really hoping to see monumental speed increases in our ceph cluster.
One of our benchmarks was pgbench, and to say we were sad would be an understatement…
I created a database and ran tests like so…
createdb -U postgres bench
pgbench -U postgres -i -d bench
pgbench -U postgres -d bench -c 70 2> /dev/null
And here's a typical run. Latency varied a lot but was never less than 300 ms.
latency average = 951.736 ms
tps = 73.549842 (including connections establishing)
Yeah, so a four-node ceph cluster with 12 OSDs was getting 73 TPS with a second of latency. Did somebody swap out my drives with a floppy disk?!?
This was so horrifically bad it put an entire 5-year forecast of our tech stack in jeopardy…
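A quick sanity check on those two numbers, as a rough sketch rather than anything official from pgbench: with a fixed number of clients each waiting on its own transaction, throughput should be about clients divided by average latency. The function name is made up for illustration; the inputs are the -c 70 flag and the latency from the run above.

```python
# Sanity check: with a fixed pool of clients, each blocked on its own
# transaction, throughput is roughly clients / average latency.
CLIENTS = 70  # from the -c 70 flag in the pgbench command


def expected_tps(clients: int, latency_ms: float) -> float:
    """Throughput implied by `clients` concurrent sessions at `latency_ms` each."""
    return clients / (latency_ms / 1000.0)


baseline = expected_tps(CLIENTS, 951.736)
print(f"expected tps = {baseline:.1f}")
```

The result lands right on the reported 73.5 tps, so the terrible throughput is just the per-commit latency showing up 70 clients at a time.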
Now I don't claim to be a ceph or postgres expert but here's what I tried.
# fsync = off
latency average = 50.641 ms
tps = 1382.270643 (including connections establishing)
So that's groovy and all but it's totally unsafe and not appropriate for production. The good news was that this proved that the ceph cluster itself wasn't utterly broken.
So then I noticed the next line in postgresql.conf:
# synchronous_commit = on # synchronization level;
So I tried turning that off and…
latency average = 48.036 ms
tps = 1457.230219 (including connections establishing)
wut! wut!
Basically equivalent speeds to fsync = off. Pretty strange when there is no replication…
I also tried with synchronous_commit = local and got
latency average = 225.714 ms
tps = 310.127180 (including connections establishing)
Which is great compared to the original results but still dismal.
For our use case, maybe losing a couple of transactions is worth the roughly 20x speed improvement we measured (73 to 1457 TPS).
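To put the trade-off in numbers, here is a small sketch that compares the four runs above and estimates how many commits could be in flight if the server crashed. The tps values are copied from this post. The risk window is an assumption on my part: it takes the default wal_writer_delay of 200 ms and the worst case of three times that delay described in the PostgreSQL docs on asynchronous commit. Your settings may differ.

```python
# tps figures copied from the pgbench runs in this post.
RUNS = {
    "defaults": 73.549842,
    "fsync = off": 1382.270643,
    "synchronous_commit = off": 1457.230219,
    "synchronous_commit = local": 310.127180,
}

baseline = RUNS["defaults"]
speedups = {name: tps / baseline for name, tps in RUNS.items()}
for name, factor in speedups.items():
    print(f"{name:28s} {RUNS[name]:9.1f} tps  {factor:5.1f}x")

# Assumption: default wal_writer_delay (200 ms) and a worst-case risk
# window of three times that delay. Adjust if your config differs.
WAL_WRITER_DELAY_S = 0.2
risk_window_s = 3 * WAL_WRITER_DELAY_S
at_risk = RUNS["synchronous_commit = off"] * risk_window_s
print(f"commits at risk in a crash: up to ~{at_risk:.0f}")
```

So under those assumptions "a couple of transactions" could be several hundred at full benchmark load, which is worth knowing before flipping the switch.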
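The really strange bit is that we aren't running replication on our test machine, so synchronous_commit shouldn't affect anything as per my understanding… (As far as I can tell from the docs, though, the setting also controls whether a commit waits for the local WAL flush to disk, which would explain why off behaves like fsync = off here.)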
But as I said, I'm not a postgresql expert. But all this really begs the question, wtf is ceph doing that causes fsyncs to be so slow? Too bad I'm not a ceph expert either…
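One way to start poking at that question is to time fsync directly on the ceph-backed mount, with postgres out of the picture. This is a minimal sketch, not a proper benchmark: the function name, round count and block size are my own choices, and the temp directory is a placeholder for wherever your ceph volume is mounted. PostgreSQL also ships a pg_test_fsync tool that does this more thoroughly.

```python
import os
import statistics
import tempfile
import time


def time_fsync(directory, rounds=50, block_size=8192):
    """Return fsync latencies in ms, one per round, each after writing one block."""
    payload = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(rounds):
            # Overwrite the same block so the file does not grow.
            os.lseek(fd, 0, os.SEEK_SET)
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # only the flush is timed, not the write
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


# Placeholder path: point this at a directory on the ceph-backed volume.
results = time_fsync(tempfile.gettempdir(), rounds=20)
print(f"median fsync: {statistics.median(results):.3f} ms")
print(f"max fsync:    {max(results):.3f} ms")
```

Running it once on local disk and once on the ceph mount should show whether the per-flush cost alone is enough to account for the slow commits.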
Here's a good page explaining synchronous_commit: https://www.tutorialdba.com/2018/04/how-to-improve-performance-of.html
tl;dr - set synchronous_commit = off