As a general reference, oversharding / overpartitioning is considered bad practice. Typically, a single shard can easily hold 10 to 50 GiB of data without any significant performance impact (see also the Sharding and Partitioning Guide for Time Series Data), so a partition of 6 shards is fine for holding ~300 GiB of data.
Also, specifying CLUSTERED BY (hod) probably prevents the data from being properly distributed across the shards. If fewer shards end up being used, performance is probably worse as well.
I would suggest the following:
CLUSTERED INTO 10 SHARDS PARTITIONED BY (year)
That is, use 10 shards, so that each node is ideally assigned 2, and partition only by year.
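To make the suggestion concrete, here is a minimal sketch of what the full statement could look like. The table name `sensor_readings` and the columns `ts` and `reading` are made up for illustration; only `hod` and `year` come from this thread. It assumes `year` is a plain column filled in by the client, and that the default routing is acceptable once the explicit CLUSTERED BY (hod) is dropped. The second statement is one way to check afterwards how evenly the data ended up across shards.

```sql
-- Hypothetical table: the name and the ts/reading columns are assumptions.
CREATE TABLE IF NOT EXISTS sensor_readings (
    ts TIMESTAMP WITH TIME ZONE NOT NULL,
    hod INTEGER,             -- kept as a regular column, no longer used for routing
    year INTEGER NOT NULL,   -- partition column, assumed to be set by the client
    reading DOUBLE PRECISION
) CLUSTERED INTO 10 SHARDS PARTITIONED BY (year);

-- List document count and size per primary shard, grouped by partition,
-- to see whether the data is spread evenly.
SELECT partition_ident, id AS shard_id, num_docs, size
FROM sys.shards
WHERE table_name = 'sensor_readings' AND "primary" = true
ORDER BY partition_ident, shard_id;
```

If the shard sizes within one partition differ a lot, the routing column is likely the cause. If each shard grows well beyond the 10 to 50 GiB range mentioned above, the shard count may need to go up for future partitions.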