Optimizing storage for historic time-series data

Hi @vinayak.shukre,

You can run ALTER TABLE table PARTITION (partition_key = partition_value) SET (codec = 'best_compression'); on an already existing partition. However, you will need to close the partition before doing that. While a partition is closed, its rows won’t be included in the output of any SELECT queries:

Example:

ALTER TABLE table PARTITION (ts_month = 1648771200000) CLOSE;
ALTER TABLE table PARTITION (ts_month = 1648771200000) SET (codec = 'best_compression');
ALTER TABLE table PARTITION (ts_month = 1648771200000) OPEN;

Whether time_bucket in the examples above qualifies as a replacement for partitioning depends on the data volume. As per Sharding and Partitioning Guide for Time Series Data, if the size of a shard exceeds 70 GB, it is recommended to use partitioning. Therefore, it can still make sense to use partitioning to avoid degrading index performance. For tables where performance plays a less critical role, you might choose to go for a higher shard size, like 100 GB instead of 70 GB.