I already mentioned this in another topic and then thought it was solved, but the issue is still present, so I have tried to describe it more precisely here.
I have a table with the following schema:

```sql
CREATE TABLE doc.odp_ts (
    entity_id text,              -- primary key
    entity_type text,
    time_index timestamp with time zone,
    source text,
    value text,
    value_num double precision,
    category text,
    location_index text,
    location geo_shape,
    location_centroid geo_point,
    parent text,
    content object(ignored),
    year integer GENERATED ALWAYS AS extract(year from coalesce(time_index, CURRENT_TIMESTAMP))
)
CLUSTERED INTO 10 SHARDS
PARTITIONED BY (year);
```
`location` holds a polygon representation of an Uber H3 hexagon, so always only 7 vertices.
`content` holds a JSON representation of this row's data. It's a little more comprehensive, but every (!) row's JSON is nearly the same size; there isn't much variation between them.
`source` is either 'aaa' or 'bbb'… you guessed it: it tells how the rows were imported:
- The 'aaa' rows were imported via `COPY FROM` and then cleaned and `INSERT`ed into this table through some ELT queries.
- The 'bbb' rows were `INSERT`ed directly via the HTTP endpoint's bulk_copy.
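For context, the 'aaa' path looks roughly like this; the table name, file path, and column mappings below are made up for illustration, the real ELT queries are more involved:

```sql
-- Hypothetical sketch of the 'aaa' import path (names are illustrative).
CREATE TABLE doc.odp_staging (raw object(dynamic));

-- Bulk-load the raw files from disk.
COPY doc.odp_staging FROM 'file:///data/import/*.json';

-- Clean and move the rows into the target table via an ELT query.
INSERT INTO doc.odp_ts (entity_id, entity_type, time_index, source, value, content)
SELECT raw['id'], raw['type'], raw['ts']::timestamp with time zone, 'aaa', raw['value'], raw
FROM doc.odp_staging;
```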
If I delete all 'bbb' rows and keep only the roughly 8.8 million 'aaa' rows, the table has a size of 1.5 GB.
If I delete all 'aaa' rows and keep only the roughly 9 million 'bbb' rows, the table has a size between 30 and 50 GB (I got different outcomes in my experiments).
These sizes have had plenty of time to settle, so I doubt there are any table optimizations, merges, and so on left that would bring them further down. The problem is not only size but also query time whenever the result set intersects with any of these heavy-weight rows.
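To show how I compared the two variants: assuming CrateDB's `sys.shards` and `sys.segments` system tables (the latter may not exist in older versions), a per-shard and per-segment breakdown can be queried along these lines:

```sql
-- Size and document count per shard of the table.
SELECT id AS shard, size, num_docs
FROM sys.shards
WHERE table_name = 'odp_ts';

-- Per-segment breakdown, if sys.segments is available in your version.
SELECT shard_id, segment_name, size, num_docs
FROM sys.segments
WHERE table_name = 'odp_ts'
ORDER BY size DESC;
```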
I am assuming this has something to do with how the index was built, as the index is the only thing I couldn't compare so far (the rows hold entirely comparable data), and it would also explain why queries are significantly slower on these rows.
Is there any way to check/repair/recreate the index? Just copying the rows to a new table hasn't worked so far.
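For completeness, what I tried was along these lines (the partition value is illustrative, and `CREATE TABLE ... AS` does not carry over the sharding/partitioning of the original):

```sql
-- Force a full merge of one partition down to a single segment.
OPTIMIZE TABLE doc.odp_ts PARTITION (year = 2023) WITH (max_num_segments = 1);

-- Rewrite the suspect rows into a fresh table.
CREATE TABLE doc.odp_ts_copy AS
SELECT * FROM doc.odp_ts WHERE source = 'bbb';
```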