How to use the “jobs” and “clients” parameters in pgbench without going crazy.
pgbench parameters for concurrency control
pgbench offers two parameters for controlling the concurrency in the benchmark. Namely:
- -j for ‘jobs’. The number of pgbench threads to run.
-j, --jobs=NUM number of threads (default: 1)
- -c for ‘clients’. The number of concurrent client sessions pgbench opens, each served by its own “postgres” backend process.
-c, --client=NUM number of concurrent database clients (default: 1)
Here is the TPS delivered for each combination of -j (1, 10) and -c (1, 10), using a very small (cached) database.
The machine is a GCP instance (e2-standard-8: 8 vCPUs, 32 GB memory). The database size is tiny (Scale Factor 100).
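To make the test grid concrete, here is a minimal sketch that builds the pgbench command lines for each -j/-c combination. The flags used (-i, -s, -j, -c, -T, -S) are standard pgbench options, but the 60-second duration and the database name `pgbench_test` are assumptions for illustration; the original runs may have used different values.

```python
import itertools
import shlex


def pgbench_command(jobs, clients, select_only=True,
                    duration_s=60, dbname="pgbench_test"):
    """Build the argv for one pgbench run.

    duration_s and dbname are hypothetical placeholders, not values
    taken from the original experiment.
    """
    argv = ["pgbench", "-j", str(jobs), "-c", str(clients),
            "-T", str(duration_s)]
    if select_only:
        argv.append("-S")  # built-in select-only script
    argv.append(dbname)
    return argv


def grid(values=(1, 10), **kwargs):
    """All (jobs, clients) combinations for the given values."""
    return [pgbench_command(j, c, **kwargs)
            for j, c in itertools.product(values, values)]


if __name__ == "__main__":
    # Initialise the database at scale factor 100 first.
    print(shlex.join(["pgbench", "-i", "-s", "100", "pgbench_test"]))
    for argv in grid():
        print(shlex.join(argv))
```

Passing `select_only=False` drops the -S flag, which gives the default read/write workload instead.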
pgbench read-only test.
Firstly I ran pgbench with the -S flag “Select Only” to avoid having the disk be a bottleneck. In this experiment we are mainly interested in the concurrency options.
| | -c=1 | -c=10 |
|---|---|---|
| -j=1 | 11,385 | 61,267 |
| -j=10 | 15,416 | 75,106 |
The result shows that the number of “clients” (and therefore postgres backend processes) is the clear dominant factor: going from -c=1 to -c=10 raises TPS roughly fivefold, while going from -j=1 to -j=10 adds only about 23% to 35%. With the tiny DB and 8 cores, a single pgbench thread (-j=1) is almost able to saturate the 8 cores. With -j=1 and -c=10 there was about 20% idle across all the cores.
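As a quick check on which parameter matters more, the ratios can be computed directly from the read-only table above. This is just arithmetic on the published numbers; the dictionary layout and function name are my own.

```python
# Read-only TPS from the table above, keyed by (jobs, clients).
READ_ONLY_TPS = {
    (1, 1): 11385,
    (1, 10): 61267,
    (10, 1): 15416,
    (10, 10): 75106,
}


def speedup(tps, base, other):
    """TPS ratio when moving from the `base` setting to `other`."""
    return tps[other] / tps[base]


if __name__ == "__main__":
    for jobs in (1, 10):
        gain = speedup(READ_ONLY_TPS, (jobs, 1), (jobs, 10))
        print(f"-j={jobs}: -c 1 -> 10 gives {gain:.2f}x")
    for clients in (1, 10):
        gain = speedup(READ_ONLY_TPS, (1, clients), (10, clients))
        print(f"-c={clients}: -j 1 -> 10 gives {gain:.2f}x")
```

Raising -c gives roughly 5.4x and 4.9x, while raising -j gives only about 1.35x and 1.23x.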
With 10 pgbench threads and 10 clients (-j=10 -c=10), all 8 cores were 100% saturated.
pgbench read/write test.
For completeness I re-ran the experiment without the “-S” option. The GCP instance had a single disk and was easily overwhelmed by the amount of IO generated by 8 cores at full blast. At any rate, the number of clients (-c=10) is again the clear dominant factor, albeit at a much lower TPS rate because so much time is spent waiting on disk.
| | -c=1 | -c=10 |
|---|---|---|
| -j=1 | 683 | 3,111 |
| -j=10 | 713 | 3,251 |
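One way to read these numbers is as per-transaction latency. The sketch below assumes a closed loop, where each client always has exactly one transaction in flight (which is how pgbench behaves when no rate limit is set). Under that assumption, Little's law gives mean latency as clients divided by TPS. The function name is my own, and the result includes client-side and network overhead, not just disk time.

```python
# Read/write TPS from the table above, keyed by (jobs, clients).
READ_WRITE_TPS = {
    (1, 1): 683,
    (1, 10): 3111,
    (10, 1): 713,
    (10, 10): 3251,
}


def mean_latency_ms(clients, tps):
    """Little's law for a closed loop: latency = clients / throughput."""
    return 1000.0 * clients / tps


if __name__ == "__main__":
    for (jobs, clients), tps in sorted(READ_WRITE_TPS.items()):
        latency = mean_latency_ms(clients, tps)
        print(f"-j={jobs} -c={clients}: {latency:.2f} ms per transaction")
```

With one client each transaction takes about 1.4 to 1.5 ms, and with ten clients about 3.1 to 3.2 ms, so each transaction takes roughly twice as long even though total throughput rises about 4.5x.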
What’s really interesting here is that most of the cores are showing “idle” rather than IO wait. I believe that the postgres backend processes must be waiting on a single writer to finish disk IO before they can continue (via a lock or cv_wait). So in reality all of the CPUs are blocked on IO, but only indirectly, so the kernel does not know to show that the CPUs could be doing more work if the IO were faster.