Poor performance with five tables using join

Jun_Zhou · November 29, 2023, 8:44am

We’re using CrateDB 5.5 on a single node and testing the SQL below, which takes about 75 seconds. Did I configure something wrong? Any suggestions would be appreciated.

Server info:

//CPU
processor       : 79
model name      : Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz

//Memory
$ free -g
              total        used        free      shared  buff/cache   available
Mem:            755          92          61           1         600         658
Swap:            15           0          15

Table size:

SQL:

select count(1)
    from pgdb_scm.inv_Inv_Post_Detail pd 
    join pgdb_scm.inv_inv_post_line pl on pl.post_line_id = pd.post_line_id
    join pgdb_scm.inv_inv_shipping_dlvy_l dl on pl.dlvy_line_id = dl.line_id
    join pgdb_scm.inv_inv_shipping_notice_l nl on nl.line_id = dl.shipping_line_id
    join pgdb_scm.inv_inv_shipping_notice_h nh on nh.header_id = nl.header_id;

sql_plan
crate.yml

hernanc · November 29, 2023, 8:50am

Hi,
To best review this, we would need to see the definitions of all these tables.
One factor here is probably also the high number of shards relative to the size of the tables. Please take a look at the "Sharding and partitioning guide for time-series data" tutorial in the CrateDB Community.
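To check how shard count relates to table size, a query against `sys.shards` along these lines may help. It is only a sketch: it assumes the five tables all live in the `pgdb_scm` schema, as in the query above, and it counts primary shards only.

```sql
-- Per-table primary shard count, document count, and on-disk size.
-- Assumption: all five tables are in the pgdb_scm schema.
SELECT schema_name,
       table_name,
       count(*)                      AS primary_shards,
       sum(num_docs)                 AS total_docs,
       sum(size) / 1024.0 / 1024.0   AS total_size_mb
FROM sys.shards
WHERE schema_name = 'pgdb_scm'
  AND "primary" = true
GROUP BY schema_name, table_name
ORDER BY table_name;
```

If a table shows many shards that each hold only a few megabytes, the per-shard overhead is likely to be high relative to the data.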

Jun_Zhou · November 29, 2023, 9:33am

Hi @hernanc, I changed the number of shards for each of the five tables to 4, but the query is still slow. Please see the attachment, which includes the definitions of all the tables.

table definition

hernanc · November 29, 2023, 9:55am

Hi, access to the file seems restricted. If you do not feel comfortable discussing the details publicly here in the community, maybe you would like to schedule a call with a CrateDB Customer Engineer?

Jun_Zhou · November 29, 2023, 10:13am

I’m sorry, I forgot to grant permission to access the definition file. You can download it now. I’m comfortable discussing the issue in the community, and I will provide any details you need.

Jun_Zhou · November 30, 2023, 7:38am

@hernanc Hi, do you have any suggestions on this issue?

Jun_Zhou · December 1, 2023, 3:24am

I re-uploaded the related files.
table_ddl.sql (7.5 KB)
crate.yml (25.5 KB)
sql_plan.json (167.0 KB)

hernanc · December 19, 2023, 4:17pm

Hi,
Is this data coming from some other system where FK constraints are enforced?
Are the rows created in a certain order?
If so, perhaps the count that you are looking for is actually the number of distinct shipping_line_id values (excluding null)?
If the result you are actually looking for is not a count but some other aggregation, then we could maybe write the query in a different way. Happy to look at this if you want to share further details.
Thank you.
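As a rough sketch of the distinct-count idea, the query below reads a single table instead of joining five. It assumes that `shipping_line_id` is the column on `pgdb_scm.inv_inv_shipping_dlvy_l` used in the original join condition. The result matches the join count only if the relationships between the tables hold as described above, so it is worth comparing the two results on your data first.

```sql
-- count(DISTINCT ...) ignores NULL values, so no extra filter is needed.
-- Assumption: shipping_line_id is the dl column from the original join.
SELECT count(DISTINCT shipping_line_id) AS distinct_shipping_lines
FROM pgdb_scm.inv_inv_shipping_dlvy_l;
```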
