Separating node-to-node communication

Is there a way to communicate between nodes over a different interface? I am planning to separate application-to-DB communication from DB-to-DB communication to speed up replication, since we are storing vast amounts of data inside CrateDB, but I could not find any related topic. Or, is there a way to increase the performance of internal traffic? I see that shard replication and similar operations can be pretty slow when the data is huge. A single-interface design can also be a problem when virtualization is used, because every virtual machine on the same host as CrateDB shares the same bandwidth, and that bandwidth can become saturated. I have reviewed the specs and found I could do something like the following, but I am not sure whether it will work, or whether I will benefit if it does. The 192.168.x network is application traffic; the 10.x network is node-to-node communication.

network.host: [192.168.1.11, 10.0.0.11]
http.port: 4200
transport.tcp.port: 4300
discovery.seed_hosts:
  - 10.0.0.11:4300
  - 10.0.0.12:4300
  - 10.0.0.13:4300

Hi there,

There is currently no way of modifying how the nodes communicate without heavily deviating from the intended implementation.

There are many ways of tweaking the database to improve your use case, but first it’d be helpful to understand it better: how much data is ‘vast amounts of data’?

This could be the case, but re-implementing the node-to-node communication might not be a realistic solution. Have you already been hitting performance issues with replication?


Hello surister, thank you for replying. My use case: I have two tables, 1 TB each. The second table is a copy of the first and can contain multiple copies of the first table’s rows; we are denormalizing the data in order to get faster query speeds for the business. Each table is partitioned by month, and each partition has 4 shards with 1 primary and 1 replica; each shard is almost 27 GB. We query the partitioned data and write the results into the second table.

My issue is that when there is a performance problem, I sometimes have to restart one of the problematic nodes. Sometimes the shards become ready very fast, but sometimes some of them turn red and I have to wait 4–5 hours for the data to recover. Normally, both insert and select queries take milliseconds to finish, but while table shards are in the recovering state it is a hell of a performance bottleneck and insert queries take far too long. I do not know how to approach this. Plugging in another 10G interface and increasing shard replication parallelism and max bytes was my idea to get past this bottleneck as soon as possible. But as you said, it is not doable, and maybe not that clever either; I am just brainstorming, to be honest :smiley:
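As a side note on diagnosing this: recovery progress of individual shards can be inspected via CrateDB’s sys.shards system table. A sketch (the exact keys of the recovery object are assumed from CrateDB’s sys tables documentation; verify against your version):

```sql
-- List shards that are not fully started, with their recovery progress.
-- The recovery[...] subscript keys are assumed from the sys.shards docs.
SELECT schema_name,
       table_name,
       id,
       state,
       recovery['stage']           AS recovery_stage,
       recovery['size']['percent'] AS recovered_percent
FROM sys.shards
WHERE state != 'STARTED'
ORDER BY table_name, id;
```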

Hi @Yusuf_Cansiz,

replication is (potentially) throttled by indices.recovery.max_bytes_per_sec (https://cratedb.com/docs/crate/reference/en/latest/config/cluster.html#recovery) which defaults to 40 MB/s for each index (each partition is its own index).

If you are ingesting at very significant volumes and replication cannot catch up, it may be throttled by that parameter and increasing it can help.
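For instance, that throttle can be raised at runtime with a SET GLOBAL statement (the '200mb' value here is purely illustrative, not a recommendation; tune it to your network and disks):

```sql
-- Raise the per-index recovery throttle cluster-wide and persist it
-- across restarts. '200mb' is an example value, not a recommendation.
SET GLOBAL PERSISTENT "indices.recovery.max_bytes_per_sec" = '200mb';

-- Check the currently applied value:
SELECT settings['indices']['recovery']['max_bytes_per_sec']
FROM sys.cluster;
```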

Beyond that, what type of disks are you using? Especially on cloud setups one can easily run out of provisioned IOPS or throughput which then has a negative effect on replication and overall performance.

And what CrateDB version are you running?

Best
Niklas


My cluster version is 5.10.4. I am running the nodes on NVMe disks, connected to each other over a 1G network, but this 1G line is also shared by the other VMs on my Proxmox hosts (the other VMs belong to the same application stack, nothing different). I was actually planning to increase that parameter after upgrading my network, because 1G can be saturated pretty easily with a high max_bytes_per_sec. Is there a way to keep up insert performance during such a replication period? Or maybe what I am seeing is a rare occurrence; I am not sure whether insert queries are affected during recovery in all cases. I do not really care whether recovering the data takes 4 or 5 hours if I can keep inserting, since this is the core functionality of our app. If you have suggestions for this, I am very open to them.

Is there a way to communicate between nodes over a different interface?

Yes, it is possible to configure a CrateDB node to use a different interface for its transport protocol, which is used for internal node-to-node communication. Unfortunately, we missed documenting the related config option transport.host, which is what needs to be adjusted to achieve this. By default, transport.host uses the value of network.host.

Example:

network.host:    _eth0_
transport.host:  _eth1_

With this, all node-to-node communication goes over the eth1 network interface, while any client communication (HTTP/4200 and PostgreSQL/5432) is only possible over the eth0 interface.

See https://cratedb.com/docs/crate/reference/en/latest/config/node.html#hosts for all allowed values for the *.host settings.
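Applied to the addresses from the original question, a crate.yml sketch could look like this (IPs taken from the question; untested, adjust interface/addresses to your environment):

```yaml
# crate.yml - client traffic on the 192.168.1.x network,
# node-to-node (transport) traffic on the dedicated 10.0.0.x network.
network.host: 192.168.1.11    # HTTP (4200) and PostgreSQL (5432) clients
transport.host: 10.0.0.11     # internal node-to-node communication
transport.tcp.port: 4300

# Seed hosts must be reachable on the transport network:
discovery.seed_hosts:
  - 10.0.0.11:4300
  - 10.0.0.12:4300
  - 10.0.0.13:4300
```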


Related to the slow replica recovery issue, please increase indices.recovery.max_bytes_per_sec first, like @hammerhead mentioned: the default value of 40 MB/s is very conservative to account for minimal setups, and is mostly way too low on production setups.


Hello smu, thank you for your replies. If I use separate interfaces, do you think this will increase the speed of the cluster? The node-to-node interface will be dedicated to the CrateDB nodes only, and it will be a high-speed 10G connection. As I said earlier, I am also going to change indices.recovery.max_bytes_per_sec for the cluster. One last question: in case of any recoveries, what kind of effect does node_concurrent_recoveries have when combined with indices.recovery.max_bytes_per_sec?

If I use separate interfaces, do you think this will increase the speed of the cluster? The node-to-node interface will be dedicated to the CrateDB nodes only.

I’d guess so, but it all depends on your setup/use case.

One last question: in case of any recoveries, what kind of effect does node_concurrent_recoveries have when combined with indices.recovery.max_bytes_per_sec?

Higher CPU and IO load, but again, it all depends on your setup/use case/workload.
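For completeness, both knobs discussed in this thread can be adjusted together at runtime. A hedged sketch (the values are illustrative only; cluster.routing.allocation.node_concurrent_recoveries is the full name of the concurrency setting mentioned above, assumed from the cluster settings documentation):

```sql
-- Allow more parallel shard recoveries per node (example value) ...
SET GLOBAL PERSISTENT "cluster.routing.allocation.node_concurrent_recoveries" = 4;

-- ... and let each recovery move data faster (example value).
-- Both settings together increase CPU, disk IO and network load.
SET GLOBAL PERSISTENT "indices.recovery.max_bytes_per_sec" = '200mb';
```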
