
Consider adding compression by default, or mention it in README #214

Open

hnykda opened this issue Oct 14, 2024 · 0 comments

hnykda commented Oct 14, 2024

Hello,

I know there have been some discussions about related topics, but I would propose turning compression on by default, or at least educating users about it in the README. It took me about three hours to find out that this is all it takes:

-- Enable native compression on the hypertable: order rows by time within
-- each compressed batch and segment them by entity_id
ALTER TABLE ltss SET (
  timescaledb.compress,
  timescaledb.compress_orderby = 'time DESC',
  timescaledb.compress_segmentby = 'entity_id'
);


-- Schedule automatic compression of chunks older than 30 days
SELECT add_compression_policy('ltss', INTERVAL '30 days');
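To confirm the policy was actually registered, a quick sanity check (a sketch, assuming TimescaleDB 2.x where the jobs and job_stats views exist):

-- Show the compression policy job for ltss and its last run status
SELECT j.job_id, j.schedule_interval, js.last_run_status, js.next_start
FROM timescaledb_information.jobs j
LEFT JOIN timescaledb_information.job_stats js USING (job_id)
WHERE j.proc_name = 'policy_compression'
  AND j.hypertable_name = 'ltss';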

SELECT *
FROM timescaledb_information.hypertables
WHERE hypertable_name = 'ltss';

-- compressing the existing chunks takes time proportional to the amount of
-- data; it took a couple of minutes for 30 GB
-- it seems you don't need to run compress_chunk manually on old chunks
-- as I had thought; the policy picks them up (see the sketch below)
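That said, if you would rather compress the backlog immediately instead of waiting for the policy, something like this sketch should work (the 30-day cutoff just mirrors the policy interval; if_not_compressed skips chunks that are already compressed):

-- Compress all existing chunks older than 30 days right away
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('ltss', older_than => INTERVAL '30 days') AS c;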

-- this simple thing achieved an incredible compression of at least 98%, based on:

SELECT
    chunk_schema,
    chunk_name,
    compression_status,
    pg_size_pretty(before_compression_total_bytes) AS before_size,
    pg_size_pretty(after_compression_total_bytes) AS after_size,
    CASE
        WHEN compression_status = 'Compressed' AND before_compression_total_bytes > 0 THEN
            ROUND(100 * (1 - after_compression_total_bytes::numeric
                             / before_compression_total_bytes::numeric), 2)
        ELSE
            NULL
    END AS compression_ratio_percent,
    before_compression_total_bytes,
    after_compression_total_bytes
FROM chunk_compression_stats('ltss')
ORDER BY chunk_name;
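And for a single table-wide number instead of per-chunk rows (again a sketch; hypertable_compression_stats returns NULL sizes until at least one chunk is compressed):

-- Overall before/after sizes and compression ratio for the whole hypertable
SELECT
    pg_size_pretty(before_compression_total_bytes) AS before_total,
    pg_size_pretty(after_compression_total_bytes) AS after_total,
    ROUND(100 * (1 - after_compression_total_bytes::numeric
                     / before_compression_total_bytes::numeric), 2) AS ratio_percent
FROM hypertable_compression_stats('ltss');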

achieves a crazy compression ratio of something like 98% (is it even real? am I making some dumb error here?), bringing my 30 GB down to a fraction of that without any noticeable effects for my use case. It also doesn't significantly change the table in any way (no additional columns), nor does it create new tables, views, or triggers.

I totally understand this is very use-case dependent, but I would argue that for many users something like this would be a better and more maintainable solution than no compression 🤷. I don't hold this view strongly, though, and I can imagine that this tool is mostly used by pros who can do this anyway, and that for noobs like me it's too dangerous.

I totally understand if you don't want to do that, but I'm at least raising it in case there are more people like me.
