Do you have billions of rows with only a very small % in use/locked, or millions of rows that are JSONB blobs with a relatively high % in use/locked?
Is your workload mostly write-once, read-many, or is it an even read/write split?
How fast is your IO? Are you using network storage (EBS, for example) or local NVMe?
How much other load are you contending with on the system?
I have a JSONB-heavy workload with lots of updates. The JSONB blobs average around 100KB but can go up to 20MB. We see anywhere from fewer than 10 updates on a blob to thousands of updates on the larger ones.
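To put rough numbers on that kind of churn, here is a back-of-envelope sketch in Python. It assumes the usual PostgreSQL MVCC behavior where each update that changes the blob leaves the previous version behind until vacuum reclaims it, and it deliberately ignores TOAST compression and any vacuum runs between updates, so treat the result as a pessimistic upper bound rather than a measurement. The function name and the example sizes are made up for illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def dead_version_bytes(blob_bytes: int, updates: int) -> int:
    """Upper bound on bytes of superseded blob versions.

    Assumes every update rewrites the whole blob and that nothing is
    vacuumed in between. Compression is ignored.
    """
    if blob_bytes < 0 or updates < 0:
        raise ValueError("blob_bytes and updates must be non-negative")
    return blob_bytes * updates


if __name__ == "__main__":
    # A typical blob with a handful of updates vs. a large, hot blob.
    typical = dead_version_bytes(100 * KB, 10)
    large = dead_version_bytes(20 * MB, 1000)
    print(f"typical blob: {typical / MB:.2f} MB of dead versions")
    print(f"large blob:   {large / GB:.2f} GB of dead versions")
```

Under these assumptions the spread between the two cases is several orders of magnitude, which is why the answers to the questions above matter so much.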
We use Citus for this workload and can move shards around. That operation uses logical replication to another host, which effectively cleans up the bloat along the way.
We also have some wide multi-column indexes over text and date fields that see a fair bit of rewrite activity. Those indexes get bloated too, and we reindex every few months (it used to be every 6-12 months, but as the workload gets busier we're reindexing a bit more often).
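For anyone wanting to script a periodic reindex, a minimal sketch might look like the following. It only builds the SQL text and does not connect to anything. It assumes PostgreSQL 12 or later, where `REINDEX INDEX CONCURRENTLY` is available, and the index and schema names are hypothetical. How this should be driven across shards in a Citus cluster is not covered here.

```python
from typing import Iterable, List


def quote_ident(name: str) -> str:
    """Quote a SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def reindex_statements(index_names: Iterable[str], schema: str = "public") -> List[str]:
    """Build one REINDEX INDEX CONCURRENTLY statement per index.

    Each statement has to be run on its own, outside a transaction
    block, because REINDEX CONCURRENTLY cannot run inside one.
    """
    return [
        f"REINDEX INDEX CONCURRENTLY {quote_ident(schema)}.{quote_ident(name)};"
        for name in index_names
    ]


if __name__ == "__main__":
    # Hypothetical index names, for illustration only.
    for stmt in reindex_statements(["docs_title_created_idx", "docs_owner_updated_idx"]):
        print(stmt)
```

The concurrent form takes longer and needs extra disk space for the new copy of the index, but it avoids blocking writes for the duration of the rebuild.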
The index bloat is far more of a problem than the raw table bloat.
In the past, when we were on a single RDS instance, I would use pg_repack.