Hi,
If the path for snapshot repositories is configured, then after running CREATE REPOSITORY you should use CREATE SNAPSHOT to back up the desired tables, not COPY TO. Another thing to check: with the Docker Compose file you shared, I think each node may be seeing a separate /backup directory, but for snapshots to succeed all nodes must be able to reach the same location.
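For reference, a minimal sketch of that workflow (the repository and snapshot names are placeholders, and /backup must also be listed in the `path.repo` setting of every node):

```sql
-- Sketch only: 'backup_repo' and 'snapshot1' are placeholder names.
-- /backup must be registered in path.repo on every node and must
-- resolve to the same shared location on all of them.
CREATE REPOSITORY backup_repo TYPE fs WITH (location = '/backup');

-- Snapshot a single table instead of exporting it with COPY TO:
CREATE SNAPSHOT backup_repo.snapshot1 TABLE t01
WITH (wait_for_completion = true);
```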
The problem is that I can’t even create the REPOSITORY.
COPY TO: I tried running it to verify that CrateDB was able to write to a host directory.
Neither one worked.
So I end up with a CrateDB cluster, made with Docker, that can’t access the host disk.
I can confirm that writing from the container to the mounted directory works, and the same goes for writing from the host into the container. Everything is OK at the Docker level.
So I think the problem is with CrateDB.
Georg asked for the entire docker-compose.yml file and I posted it.
Do you think there is something misconfigured in the compose file?
CrateDB runs as user crate in group crate inside the container.
Can you confirm that the crate user or group can write to the volume from inside the container?
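One way to check this is to probe the directory from inside the container as the crate user. A sketch (the container name `cratedb01` is an assumption; adjust it to your compose file):

```shell
# Probe whether a directory is writable by the current user.
# From the host you would run this inside the container as crate, e.g.:
#   docker exec -u crate cratedb01 sh -c '... this script ...'
dir="${1:-/backup}"
if touch "$dir/.write_probe" 2>/dev/null; then
  rm -f "$dir/.write_probe"
  echo "writable: $dir"
else
  echo "not writable: $dir"
fi
```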
In the host directory I find the file hello_by_crate_user.txt.
So, from the container, as user crate, it is possible to write and read files.
The same goes from the host.
— then
CONNECT OK
cr> create table t01 as select 1 a;
CREATE OK, 1 row affected (0.190 sec)
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_5_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_1_.json (No such file or directory)']
cr>
Every node accesses its own host directory /backup, as shown in the docker-compose.yml I sent previously.
volumes:
- /data/cratedb_local/backup:/backup
So, correct me if I’m wrong: in order to back up, or simply export, a table, does the cluster need access to a directory shared between the hosts?
So, if I connect remotely with ‘crash’ to node one of a three-node CrateDB cluster, I can’t copy the contents of the tables or even create repositories and snapshots for backup?
For a local COPY TO it is enough that every node has access to a separate local disk. Every node will just copy the shards that reside on that node.
For a local CREATE REPOSITORY all nodes need access to the same directory.
If a cluster has multiple nodes, you must use a shared data storage volume mounted locally on all master nodes and data nodes.
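To illustrate, assuming a network filesystem (NFS, GlusterFS, Ceph, etc.) mounted at the same path on every host, each node’s service in docker-compose.yml would mount that one shared path. The paths here are examples only:

```yaml
# Sketch: every node mounts the SAME shared directory, not a
# host-local one. /mnt/shared/cratedb-backup is assumed to be a
# network filesystem mounted identically on all hosts.
services:
  cratedb01:
    volumes:
      - /mnt/shared/cratedb-backup:/backup
  cratedb02:
    volumes:
      - /mnt/shared/cratedb-backup:/backup
```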
So, if I connect remotely with ‘crash’ to node one of a three-node CrateDB cluster, I can’t copy the contents of the tables or even create repositories and snapshots for backup
It does not matter at all which node you connect to.
Could you confirm the following observations, so that we can proceed correctly?
To be definitively sure: if I want to back up CrateDB running as a cluster of x nodes, each host machine running a container (with one CrateDB node of the cluster) must be equipped with a shared file system that makes the ‘/backup’ directory visible to all containers.
So, to use the ‘fs’ option in ‘CREATE REPOSITORY’, the destination directory must be shared among all hosts (for example by installing GlusterFS, Ceph, or another distributed file system on each node of the cluster).
The same would be true if I want to do a ‘COPY TO’ with CrateDB in cluster mode.
Thank you for asking. Sharing filesystems between multiple containers sounds really dangerous to me.
I am not an expert on this matter, but I think creating fs-type repositories is not well suited for container use. Instead, I believe the recommended way to create repositories in advanced setups is to use remote object storage.
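For example, CrateDB supports s3-type repositories. A hedged sketch (bucket name, credentials, and repository name are placeholders):

```sql
-- Sketch only: bucket, keys, and repository name are placeholders.
CREATE REPOSITORY s3_backups TYPE s3
WITH (
  bucket = 'my-cratedb-backups',
  access_key = '...',
  secret_key = '...'
);
```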
I have been able to discover the following options for that: