Hello, I’ve read “Compare CrateDB | CrateDB vs TimescaleDB | Crate.io” and “Comparing databases for an Industrial IoT use-case: MongoDB, TimescaleDB, InfluxDB and CrateDB - CrateDB”, but I’m still a little confused about how CrateDB’s compression performance and characteristics compare to TimescaleDB’s.
We’re evaluating our on-premise 3-node (multi-node) TimescaleDB cluster for our time series use cases. Our data sets hold hundreds of millions of rows covering a few months of data, with many sensor values stored per device per second. After enabling a compression policy in TimescaleDB, we’ve seen 7x to 10x compression, which is a remarkable disk space saving! As far as I understand, TimescaleDB achieves this by splitting the data into chunks that each cover a user-defined time window, then applying compression algorithms chosen according to the data type of each column.
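To make my understanding concrete, here is a minimal Python sketch of the idea: group rows into fixed time windows, then encode each column on its own. It is only an illustration and not TimescaleDB's actual implementation. The function names are hypothetical, and delta plus run-length encoding are simple stand-ins I picked for the example.

```python
from collections import defaultdict


def chunk_rows(rows, window_seconds):
    """Group (timestamp, value) rows by the start of their time window."""
    chunks = defaultdict(list)
    for ts, value in rows:
        chunks[ts - ts % window_seconds].append((ts, value))
    return dict(chunks)


def delta_encode(values):
    """Keep the first value, then store each difference to the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


def compress_chunk(chunk):
    """Encode each column separately (assumed stand-in encodings)."""
    timestamps = [ts for ts, _ in chunk]
    values = [v for _, v in chunk]
    return {
        "ts": run_length_encode(delta_encode(timestamps)),
        "values": run_length_encode(values),
    }


# One reading per second for two minutes, with a constant sensor value.
rows = [(t, 20) for t in range(120)]
chunks = chunk_rows(rows, window_seconds=60)
compressed = {start: compress_chunk(c) for start, c in chunks.items()}
```

With regularly spaced timestamps and slowly changing values, each 60-row chunk collapses to a handful of pairs, which is roughly why I would expect sensor data like ours to compress so well.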
Is there more detailed information about the compression performance of CrateDB? Is it possible to have CrateDB automatically compress data older than a user-defined period?
And if the answer is yes, are there things developers should be careful about when inserting time series data into older data ranges? (E.g. because the data is compressed, inserts, updates, etc. might require more work.)