Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Partitioning Child Partitions by Different Columns #731

Open
nathanmascitelli opened this issue Jan 9, 2025 · 5 comments
Open
Assignees

Comments

@nathanmascitelli
Copy link

Our application handles deletes 'softly', that is when a user indicates something should be deleted it is marked as deleted but not actually removed from the database until some retention peroid has passed.

As an example consider a table with the following columns

  • A timestamp representing the time of the record (record_timestamp)
  • A boolean indicating if the record has been deleted (deleted_bool)
  • A timestamp indicating when the record was deleted (deleted_timestamp)

Currently the table is partitioned by record_timestamp for performance reasons. However, we now want to split this table into a set of deleted records and a set of live records. The idea is to partition the table first based on deleted_bool and then partition the live child partition by record_timestamp and the deleted child partition by deleted_timestamp. This would simplify removing old records as we could just drop the oldest partition when it got too old on the deleted side.

Splitting the table into a set of live and dead records seems like it can be done with partman today if we change the deleted_bool column from a bool to an integer and just constrain the range of partitions to 0 and 1. However I cannot see a way to partition the live side by one column and the deleted side by another column. Is this possible? If not, is there any interest in adding it? I understand its a very niche request.

@keithf4
Copy link
Collaborator

keithf4 commented Jan 9, 2025

So the bool based partitioning would be a one time thing, correct?

If so, I'd say manually do that partitioning. Then use pg_partman to further partition each of those two partitions by the relevant time column.

@keithf4 keithf4 self-assigned this Jan 9, 2025
@nathanmascitelli
Copy link
Author

@keithf4 thanks for the quick response! Ya the bool partitioning is a one time thing.

I hadn't considered pointing pg_partman at the partitions directly. Any gotchas with doing that? One thing that jumps to mind is we would likely need to setup our own template table to make sure indexes and whatnot are the same everywhere.

@keithf4
Copy link
Collaborator

keithf4 commented Jan 9, 2025

I think it should work. Sub-partitioning in PG is just partitioning a child table the same way you did the parent. That's all that pg_partman is doing. You shouldn't need to use pg_partman's subpartitioning to be clear. Just run create_parent() on each of the child tables for the live and dead rows.
You can create your own template table and try having both of the live and dead table use that. It's a parameter you pass to create_parent().

Recommend testing all this before going live with it of course

@nathanmascitelli
Copy link
Author

@keithf4 yup understood, thank you again for the quick response and the idea. Feel free to close this issue, I will run with the ideas you've suggested.

@keithf4
Copy link
Collaborator

keithf4 commented Jan 10, 2025

I'll leave this open for now. I'd really be interested to know if this works successfully for you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants