I already mentioned this in another topic and thought it was solved, but the issue is still there, so I will try to describe it more precisely.
I have a table with the following schema:
CREATE TABLE doc.odp_ts (
   entity_id text, -- primary key
   entity_type text,
   time_index timestamp with time zone,
   source text,
   value text,
   value_num double precision,
   category text,
   location_index text,
   location geo_shape,
   location_centroid geo_point,
   parent text,
   content object(ignored),
   year integer GENERATED ALWAYS AS extract(year FROM coalesce(time_index, CURRENT_TIMESTAMP))
)
CLUSTERED INTO 10 SHARDS
PARTITIONED BY (year);
In location there is a polygon representation of an Uber H3 hexagon, so always only 7 coordinate pairs (6 distinct vertices plus the closing point of the ring).
In content there is a JSON representation of the row's data. It is a little more comprehensive, but EVERY (!) row's JSON is nearly the same size; there isn't much variation between them.
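For illustration, a single row looks roughly like this (all values, including the hexagon coordinates and the content keys, are made up; the real content object just has a few more keys of similar size):

INSERT INTO doc.odp_ts (entity_id, entity_type, time_index, source, value_num, location, content)
VALUES (
   'sensor-0001',
   'AirQualityObserved',
   '2023-05-01T12:00:00+00:00',
   'bbb',
   42.0,
   -- closed hexagon ring: 7 coordinate pairs, first = last
   'POLYGON ((8.6810 50.1109, 8.6832 50.1121, 8.6861 50.1116, 8.6868 50.1099, 8.6846 50.1087, 8.6817 50.1092, 8.6810 50.1109))',
   {entity_id = 'sensor-0001', value_num = 42.0, unit = 'ppm'}
);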
In source there is either "aaa" or "bbb" … you guessed it: it tells how the rows were imported (sketched below):
- The "aaa" rows were imported via COPY FROM ... and then cleaned and INSERTed into this table through some ELT queries
- The "bbb" rows were INSERTed directly via the HTTP endpoint's bulk_copy
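A rough sketch of the two import paths (the staging table name, file path, and shortened column list are placeholders):

-- "aaa": file import into a staging table, then cleaned over by ELT queries
COPY doc.odp_ts_staging FROM 'file:///data/raw/*.json';
INSERT INTO doc.odp_ts (entity_id, entity_type, time_index, source, value_num, location, content)
SELECT entity_id, entity_type, time_index, 'aaa', value_num, location, content
FROM doc.odp_ts_staging;

-- "bbb": the same parameterized statement, POSTed to the HTTP endpoint
-- (/_sql) together with a bulk_args array holding one inner array per row
INSERT INTO doc.odp_ts (entity_id, entity_type, time_index, source, value_num, location, content)
VALUES (?, ?, ?, 'bbb', ?, ?, ?);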
If I delete all "bbb" rows and keep only the roughly 8.8 million "aaa" rows, the table has a size of 1.5 GB.
If I delete all "aaa" rows and keep only the roughly 9 million "bbb" rows, the table has a size between 30 and 50 GB (I got different outcomes in my experiments).
These sizes have had plenty of time to settle, so I doubt there are any table optimizations, merges, and so on left that would bring them further down. The problem is not only size but also query time: queries get slow whenever the result set touches any of these heavy-weight rows.
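For context, I checked the sizes and forced merges roughly like this (sizes read from sys.shards, merges forced via OPTIMIZE):

-- per-partition size and row count of the primary shards
SELECT partition_ident,
       sum(num_docs) AS docs,
       sum(size) / 1024.0 / 1024 / 1024 AS size_gb
FROM sys.shards
WHERE schema_name = 'doc' AND table_name = 'odp_ts' AND "primary" = true
GROUP BY partition_ident;

-- force merging down to one segment per shard
OPTIMIZE TABLE doc.odp_ts WITH (max_num_segments = 1);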
I am assuming this has something to do with how the index was built, as the index is the only thing I couldn't compare so far (the rows hold fully comparable data), and this would also explain why queries are significantly slower on these rows.
Is there any possibility to check/repair/recreate the index? Just copying the rows to a new table didn't work so far.
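For completeness, the copy attempt looked roughly like this (doc.odp_ts_copy was created with the same CREATE TABLE statement as above; year is left out of the column list because it is a generated column):

INSERT INTO doc.odp_ts_copy (entity_id, entity_type, time_index, source, value, value_num,
                             category, location_index, location, location_centroid, parent, content)
SELECT entity_id, entity_type, time_index, source, value, value_num,
       category, location_index, location, location_centroid, parent, content
FROM doc.odp_ts
WHERE source = 'bbb';

REFRESH TABLE doc.odp_ts_copy;
OPTIMIZE TABLE doc.odp_ts_copy WITH (max_num_segments = 1);

Afterwards the copy ended up just as large, so the bloat travels with the rows.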