Evaluating on-disk storage space reduction with CrateDB 5.10

Hi,

This is regarding the storage space reduction documented for version 5.10. It is mentioned that, with the new table storage format, we can expect around a 50% reduction in storage size on disk.

We created a sample table as shown below and inserted around 10 million rows through our test Java application.

CREATE TABLE uu.records (
  id LONG,
  uid STRING,
  recordid LONG,
  rday TIMESTAMP WITH TIME ZONE,
  recordname STRING,
  recorddesc STRING,
  PRIMARY KEY (rday, recordid, id)
) PARTITIONED BY (rday)
WITH (
  "translog.durability" = 'ASYNC'
);

However, we could not see any major reduction in size (sum of primary shards), nor in per-partition size, as shown in the CrateDB Admin UI console.

Do we need to set some configuration flags or any other settings (e.g. related to _source/_doc) to get the benefit of the disk size reduction?

Appreciate any help/suggestions.

Thanks & regards,
Amod

Hi @Amod,

The recovery source is only removed—and the resulting ~50% storage savings realized—under specific conditions, such as when segments are merged and retention leases are updated (typically every 5 minutes). If you continuously ingest and query data, you should see these savings occur automatically in many cases. However, in artificial tests the process may not work as expected; the table can become idle, leading to fewer refreshes, larger segments, and therefore fewer segment merges when data is flushed to disk. We acknowledge that this situation is not optimal from a UX perspective.

You can force this with an OPTIMIZE call that performs a force merge, e.g.:

OPTIMIZE TABLE uu.records WITH (max_num_segments = 1);
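With max_num_segments = 1, the force merge rewrites each shard down to a single segment, so the recovery source can be dropped right away instead of waiting for natural merges.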

Thanks @proddata for the reply.

With our test program, we kept continuously ingesting data into our test table. Still, we couldn’t see any disk space reduction.

Only after executing the OPTIMIZE command could we see a storage space reduction.

Is there something else we need to do, apart from ingesting data, to make sure segments get merged and thereby reduce disk space?

Thanks & regards,
Amod

Is the test program only writing data, or is it also continuously reading data during ingestion? I noticed that your table does not have an explicit refresh interval set. This means that if there are no active search queries (e.g., SELECT statements), CrateDB will consider the table idle. In an idle state, fewer but larger segments are flushed to disk, and larger segments are less likely to be merged.
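If you want the table to keep refreshing regardless of concurrent reads, you can set an explicit refresh interval. A minimal sketch (the 1000 ms value is purely illustrative, not a recommendation):

-- Refresh on a fixed schedule instead of relying on the idle
-- heuristic; the interval is specified in milliseconds.
ALTER TABLE uu.records SET ("refresh_interval" = 1000);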

For the recovery source to be removed, three conditions must be met:

  1. Data segments must be flushed to disk. (This only happens after the translog.flush_threshold has been reached, which defaults to 500 MB per shard, or when another condition leads to a flush.)
  2. Retention leases must progress for the written data (typically with a ~5-minute delay from initial ingestion).
  3. Previously flushed segments must be merged. (This typically only happens when roughly 10 segments of similar size are available to be merged into one bigger one; see the query sketch after this list for checking segment counts.)
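To see whether merges can actually happen, you can inspect segment counts per shard via the sys.segments table. A minimal sketch (assuming the sys.segments columns documented for recent CrateDB versions):

-- Number and total size of segments per primary shard of uu.records
SELECT shard_id, COUNT(*) AS num_segments, SUM(size) AS total_size
FROM sys.segments
WHERE table_name = 'records' AND primary = true
GROUP BY shard_id
ORDER BY shard_id;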

If you write, e.g., 10 GiB of data across 4 shards and don’t query the data during ingestion, you might end up with 20 segments (5 per shard) of roughly 500 MB each. Those are unlikely to get merged automatically unless more new data is written to the table (shards).
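To check the resulting shard sizes during such a test, a query against sys.shards along these lines should work (sketch):

-- Size and document count of each primary shard, per partition
SELECT id AS shard_id, partition_ident, num_docs, size
FROM sys.shards
WHERE schema_name = 'uu' AND table_name = 'records' AND primary = true
ORDER BY partition_ident, id;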

Thanks @proddata for the reply.

We kept ingesting data and also simultaneously started a thread that read the same data repeatedly. Still, we couldn’t see any disk space reduction.

1. Does the shard size need to reach 500 MB for segments to be merged, leading to a storage space reduction?
2. We have observed that if we execute "OPTIMIZE TABLE table_name WITH (max_num_segments = 1)" on version 5.9.9, the space reduction also occurs. So I was wondering what specific improvement has been made in 5.10.1. I am assuming that with 5.10.1 the space reduction happens automatically. Is this assumption correct?
3. To reduce space for a specific partition, we executed the following command: OPTIMIZE TABLE uu.records PARTITION (rday = xxx); But we could not get a space reduction for that partition. Again, does this depend on shard size?
Update: We executed the above command as follows and it worked:
OPTIMIZE TABLE uu.records PARTITION (rday = xxx) WITH (max_num_segments = 1);
So please ignore this question.

Thanks & regards,
Amod