Hello all, me again
OK at the moment I’ve got partitioning structure where a the month number is stored in the tables and this the partitioning column.
CREATE TABLE IF NOT EXISTS "doc"."v3_safe_1526595589" (
"ts" TIMESTAMP WITH TIME ZONE,
...
...
"dmax3" REAL,
"id" VARCHAR(40) NOT NULL,
"roundts" TIMESTAMP WITH TIME ZONE NOT NULL,
"roundts_month" TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', "roundts") NOT NULL,
PRIMARY KEY ("roundts", "roundts_month", "id")
)
CLUSTERED BY ("id") INTO 3 SHARDS
PARTITIONED BY ("roundts_month")
WITH (
"allocation.max_retries" = 5,
"blocks.metadata" = false,
"blocks.read" = false,
"blocks.read_only" = false,
"blocks.read_only_allow_delete" = false,
"blocks.write" = false,
codec = 'best_compression',
column_policy = 'dynamic',
"mapping.total_fields.limit" = 1000,
max_ngram_diff = 1,
max_shingle_diff = 3,
number_of_replicas = '0-1',
"routing.allocation.enable" = 'all',
"routing.allocation.total_shards_per_node" = -1,
"store.type" = 'fs',
"translog.durability" = 'REQUEST',
"translog.flush_threshold_size" = 536870912,
"translog.sync_interval" = 5000,
"unassigned.node_left.delayed_timeout" = 60000,
"write.wait_for_active_shards" = '1'
)
The only problem is the data is projected to be at least 1.4TB a month in size and this will steadily increase.
I’m looking to break this down more and so was wondering if there is the option of bi-weekly, as I think week number could cause more problems ( 3 shards per table ) .
I know the interval for date_trunc is limited to “weekly”, but is there some sorcery I can use to do bi-weekly ?
Of course, if I’m being ridiculous, please suggest away…
Many thanks
David.