SQL Freeze after sync done

ilbee · October 18, 2021, 11:33am

Hello,

We have a problem with CrateDB in our production environment.

We have 3 nodes in the cluster on 3 different servers. All 3 nodes are eligible to become master.
Each server runs Linux (Red Hat 8.3) and has 60 GB of memory. The service is started with a 30 GB heap, using the following command:
ulimit -u 4096 && CRATE_HEAP_SIZE='30g' && CRATE_JAVA_OPTS='-Xms30g -Xmx30g' /apps/crate-4.6.4/bin/crate

We are currently using version 4.6.4 of Crate.
The problem has occurred ever since we switched to version 4.3.

When we start the cluster, SQL over HTTP works.
As soon as synchronization reaches 100%, SQL is no longer accessible.

We have enabled debug-level logging.
We see some activity, but the HTTP service does not respond.
No errors appear.

The configuration is quite simple:

cluster.name: clustername
node.name: "node1"

path.data: /data/crate/clustername/

gateway.expected_nodes: 3
gateway.recover_after_nodes: 2

network.host: 10.135.x.y
network.bind_host: "dns_alias_for_10.135.x.y"

node.master: true
discovery.seed_hosts:
    - "10.135.x.x"
    - "10.135.x.y"
    - "10.135.x.z"
cluster.initial_master_nodes:
    - "10.135.x.x"
    - "10.135.x.y"
    - "10.135.x.z"

Do you have any idea how we could solve this problem?
In the meantime, to keep the nodes up, we delete the /data/crate/clustername/ directory to force a resynchronization.
This merely keeps SQL over HTTP active.

Thanks for your help!

proddata · October 19, 2021, 6:34am

By SQL via HTTP do you mean the Admin UI or the http-interface (i.e. /_sql endpoint)?
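
To tell the two apart, the /_sql endpoint can be queried directly, bypassing the Admin UI. Below is a minimal Python sketch, assuming CrateDB's default HTTP port 4200; the helper names `build_sql_request` and `probe_sql` and the host passed in are illustrative assumptions, not part of any CrateDB client.

```python
import json
import urllib.error
import urllib.request


def build_sql_request(host, stmt, port=4200):
    """Build a POST request for the /_sql HTTP endpoint (port 4200 is assumed)."""
    url = f"http://{host}:{port}/_sql"
    body = json.dumps({"stmt": stmt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def probe_sql(host, stmt="SELECT 1", port=4200, timeout=5.0):
    """Return the decoded JSON answer, or None if the node gives no answer in time."""
    request = build_sql_request(host, stmt, port)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        # The node did answer, but with an error status (e.g. a SQL error).
        return {"http_status": err.code}
    except OSError:
        # Connection refused, timeout or DNS failure: no usable answer.
        return None
```

Calling `probe_sql("<node address>")` against each node in turn would show whether a single node or the whole cluster stops answering; a `None` result after the timeout corresponds to the freeze described above.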

ilbee · October 19, 2021, 8:24am

Hello, I mean both.

Neither the Admin UI nor the HTTP interface responds.

Last night we tried to create a new cluster, moving the tables over with COPY TO / COPY FROM.
With one node, it’s ok.
When we start a second one and the synchronization is done… No more answer.

Maybe it is related to a table parameter?
We recreated the tables from the existing SHOW CREATE TABLE output, with identical parameters.

Thank you!

proddata · October 20, 2021, 6:54am

"With one node, it’s ok.
When we start a second one and the synchronization is done… No more answer."

Do you start one node, write data to it and then add a 2nd one?
Do you still see shards being moved from one node to another?
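
One way to check the second point is to query the `sys.shards` table on a node that still answers and count shards per `routing_state`. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the response is the decoded JSON from the /_sql endpoint, with its `cols` and `rows` fields.

```python
import json
from collections import Counter

# Shards whose routing_state is not STARTED are still initializing,
# relocating or unassigned.
SHARD_STATE_STMT = (
    "SELECT table_name, id, routing_state, node['name'] "
    "FROM sys.shards ORDER BY table_name, id"
)


def shard_payload():
    """JSON body to POST to the /_sql endpoint of a responsive node."""
    return json.dumps({"stmt": SHARD_STATE_STMT})


def summarize_routing_states(response):
    """Count shards per routing_state in a decoded /_sql response."""
    index = response["cols"].index("routing_state")
    return Counter(row[index] for row in response["rows"])
```

If the summary still shows shards in `INITIALIZING` or `RELOCATING` when the freeze begins, shard movement is ongoing; if every shard is `STARTED`, the problem lies elsewhere.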
