Unix socket instead of port for http/rest transport (and other performance suggestions)

paulocoghi · September 26, 2023, 11:50am

Hello Crate community. First of all, congratulations on CrateDB. It's unique.

I would like to know whether it's possible to use a Unix socket path instead of an HTTP port for the REST interface. According to the benchmarks linked below, a Unix socket can deliver roughly half the latency and twice the performance, with zero changes on either the server or the client. [1][2]
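For anyone who wants to check the latency claim on their own machine, here is a minimal sketch that times small round trips against an echo server, once over a Unix domain socket and once over TCP loopback. It assumes Linux or macOS and only measures raw socket round trips, not CrateDB or HTTP; the function names (`mean_round_trip`, `compare`) and the 64-byte payload are made up for illustration.

```python
import os
import socket
import tempfile
import threading
import time


def _echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


def mean_round_trip(family: int, address, rounds: int = 2000) -> float:
    """Return the mean round-trip time in seconds for a 64-byte payload."""
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.bind(address)
    listener.listen(1)
    thread = threading.Thread(target=_echo_server, args=(listener,), daemon=True)
    thread.start()

    client = socket.socket(family, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    payload = b"x" * 64
    start = time.perf_counter()
    for _ in range(rounds):
        client.sendall(payload)
        received = 0
        while received < len(payload):
            chunk = client.recv(1024)
            if not chunk:
                raise ConnectionError("echo server closed the connection")
            received += len(chunk)
    elapsed = time.perf_counter() - start

    client.close()   # lets the echo server see EOF and exit
    thread.join()
    listener.close()
    return elapsed / rounds


def compare(rounds: int = 2000) -> dict:
    """Measure both transports and return the mean round-trip times."""
    with tempfile.TemporaryDirectory() as tmp:
        unix_path = os.path.join(tmp, "echo.sock")  # hypothetical socket path
        unix = mean_round_trip(socket.AF_UNIX, unix_path, rounds)
    # Port 0 asks the OS for any free TCP port on loopback.
    tcp = mean_round_trip(socket.AF_INET, ("127.0.0.1", 0), rounds)
    return {"unix": unix, "tcp": tcp}


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1e6:.1f} µs per round trip")
```

The absolute numbers depend heavily on the machine and kernel, so treat the result as a rough comparison rather than a benchmark.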

I would also like to ask your opinion on how the byte-serialized POJOs used internally in CrateDB compare with the different data serialization strategies presented in the FlatBuffers benchmark.

If they do not fall under the “raw structs” strategy, it seems that CrateDB could benefit from using FlatBuffers, since, according to that benchmark, it is orders of magnitude faster than the other strategies.

The FlatBuffers C++ library takes up just 15 kB, and there are implementations in 14 languages, including Java.

[1] blog.myhro.info/2017/01/how-fast-are-unix-domain-sockets
[2] redis.io/docs/management/optimization/benchmarks/

Since I’m a new user, I cannot create a topic with more than 2 links

amotl · October 11, 2023, 8:18pm

Dear Paulo,

thank you for writing in, welcome to the community, and apologies for the late reply.

Thank you for the kind words. ^[1]

UNIX sockets

I’ve consulted with the database team, and @Baur came back with this response:

ES issue for unix sockets and comment why they don’t want it.

Those two tickets provide further discussions about why Elasticsearch and OpenSearch do not support UNIX sockets. The same holds true for their sister CrateDB.

Support File Sockets rather than just Ethernet Interfaces · Issue #21377 · elastic/elasticsearch · GitHub
Add support for unix sockets · Issue #5531 · opensearch-project/OpenSearch · GitHub

FlatBuffers

This sounds interesting, but I don’t know if that or something similar has been considered by the database team, or if it would be up for consideration. Maybe @smu or @matriv are able to answer this?

With kind regards,
Andreas.

^[1] We are always happy to learn about where and how CrateDB is used, specifically by people who value its uniqueness. So, if you can share your use case or application, we will be all ears.

matriv · October 24, 2023, 7:58am

FlatBuffers sounds interesting! Up to now, we haven’t considered changing the serialization, but I’ll bring it up for discussion with the team.

Thank you @paulocoghi!

matriv · October 31, 2023, 2:01pm

@paulocoghi First of all, thank you again for your suggestions!
After some discussion with the team, we decided not to invest time in investigating serialization/deserialization improvements in the near future.

The reason is that, so far, we haven’t noticed a bottleneck in this area that would justify the time investment. In terms of the FlatBuffers benchmarks you’ve posted, CrateDB is closest to the raw struct case. From investigations of slow queries and inserts/updates, we see that the time spent on serialization/deserialization is on the order of microseconds, whereas the bottlenecks in other areas, like the query execution engine or Lucene, are on the order of seconds. We do have thoughts, though, about optimizing the content that we send around in certain cases, which could reduce both the time spent on serialization/deserialization and the network bandwidth required.
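To give a rough feel for what a "raw struct" style encoding looks like and what it costs, here is a minimal sketch using Python's standard `struct` module. The record layout (an int64 id, a float64 value, and a boolean flag) and the function names are purely hypothetical; this is not CrateDB's wire format, only an illustration of fixed-layout packing and how one might time it.

```python
import struct
import timeit

# Hypothetical fixed layout: little-endian int64, float64, bool (no padding).
RECORD = struct.Struct("<qd?")


def encode(record_id: int, value: float, flag: bool) -> bytes:
    """Pack the fields into a fixed-size byte string."""
    return RECORD.pack(record_id, value, flag)


def decode(buffer: bytes) -> tuple:
    """Unpack a byte string produced by encode() back into its fields."""
    return RECORD.unpack(buffer)


def mean_round_trip_seconds(rounds: int = 100_000) -> float:
    """Return the mean time in seconds for one encode + decode cycle."""
    total = timeit.timeit(lambda: decode(encode(42, 3.5, True)), number=rounds)
    return total / rounds


if __name__ == "__main__":
    print(f"record size: {RECORD.size} bytes")
    print(f"encode + decode: {mean_round_trip_seconds() * 1e6:.3f} µs")
```

Timings from a sketch like this say nothing about CrateDB itself; profiling the actual workload is what shows where the time goes.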
