I’m in the process of looking to resize our table shards from 9 to a lower number.
I read in the documentation that this needs to be a factor ?
It is necessary to use a factor of the current number of primary shards as the target number of shards. For example, a table with 8 shards can be shrunk into 4, 2 or 1 primary shards.
So I was hoping to resize this downwards to 5 ( as we’ve downsized our CPUs from 9 to 5 ) for newer shards going forwards.
Would this be OK , or would another number like 6 be better ?
i.e. the shard factoring is more of a guide than a hard rule
Trying to alter a table from
number_of_shards 9 to 5 will results in
SQLParseException[Requested number of shards: <5> needs to be a factor of the current one: <9>]
Only 3 and 1 would be accepted in this case.
Take a look at How to decrease your number of shards in your cluster - Tutorials - CrateDB Community
for more details.
I was not sure if it was something that changed all the current shards or just from the next allocation.
I might be wrong in even attempting this as there is high expectation that the data will increase as well as the CPUs.
I don’t know if this is fully clear: for a partitioned table you can change the number of shards to an almost arbitrary value for new partitions with:
ALTER TABLE ONLY <table_name> SET ("number_of_shards" = 5);
This doesn’t have an effect on the existing shards / partitions.
Also the number of nodes in your cluster should have a higher priority in the right amount of shards, then the amount of allocated cpus, to equally distribute the data across the cluster and therfore also get optimal performance.
i.e. with 3 nodes →
number_of_shards should most likely should be a multiple of 3 (i.e. 3,6,9,12, …)