Why Crate Sql is slower than elasticsearch http API

sunny · July 3, 2019, 7:09am

I am testing the crateDB performance.
There are 78164052 docs in CrateDB on 1 node server.

select “year”,count(hotel_code) as count from f_and_b group by “year” limit 100;
will take SELECT OK, 18 rows in set (6.020 sec)

but when I query same data from es api in the same crateDB:

{
“size”: 0,
“aggs”: {
“group_by_year”: {
“terms”: {
“field”: “year”
}
}
}
}
will response
Jietu20190703-150800

smu · July 30, 2019, 8:29am

CrateDB uses its very own execution engine so the requests will be executed completely different. In order to make that more clear and avoid confusions (CrateDB is not on top of ES), the ES API shipped with CrateDB was deprecated with v3.1.0 and removed by v4.0.0.
Specially on how to execute aggregations, ES is very different from what CrateDB does.

Which makes the main difference in your case it that the CrateDB SQL is:

doing a additional aggregation on the hotel_code field while the ES request isn’t
is returning all rows (limit 100) while the ES request is returning 0 (size: 0).
it will aggregate all found year terms while ES’s terms aggregation will return only the top ten terms ordered by doc_count.

Beside of that, CrateDB’s execution engine it optimized for distributed execution, so adding nodes will increase performance a lot (which may not completely the same for ES aggregations).

Topic		Replies	Views
Similar elasticsearch commands CrateDB sql	3	392	April 20, 2023
PostgreSQL time-related queries faster than the same queries in CrateDB SQL	1	901	December 1, 2021
Can underlying elasticsearch be accessed directly for reading only Configuration	3	463	October 18, 2022
Access crate via HTTP or via PostgreSQL CrateDB	1	1121	May 31, 2020
Client interface to use for faster data ingestion CrateDB	1	661	May 13, 2020

Why Crate Sql is slower than elasticsearch http API

Related topics