Unable to create a repository (CrateDB cluster running with docker containers) [SOLVED]

Hello,

I created a directory shared between the host filesystem and the Docker container.

The local directory is mounted into the container at /backup (using docker compose):

volumes:
   - /data/cratedb_local/backup:/backup

And I added path.repo to the command:

command: ["crate", ... , ..., "-Cpath.repo=/backup", ...]

Using crash (remotely), I connect to the node and run:

cr> CREATE REPOSITORY myrepo TYPE fs
    WITH (location='/backup');

But I get this error:

RepositoryException[[myrepo] location [/backup] doesn't match any of the locations specified by path.repo because this setting is empty]

Where am I wrong?

Thx.

I just tried to replicate this, but it works as expected.
Are you using multiple nodes? Did you recreate the nodes after adjusting this?

Hello,

Yes, I’m using multiple nodes. Every node runs on a different host. The cluster is running and it seems to work as expected.

I changed the docker compose file to include -Cpath.repo=/backup, then ran docker compose stop, docker compose down and docker compose up -d.

I don’t know if this is useful to you:
the file ./config/crate.yml lives outside the container and is mounted as

volumes:
  - ./config/crate.yml:/crate/config/crate.yml

The config directory is in the same directory as docker-compose.yml.

I tried setting path.repo there too, but the result is the same.

Thx.

The error indicates that path.repo is not being applied. Why that is the case is a wild guess without the complete docker-compose.yml.

RepositoryException[[myrepo] location [/backup] doesn't match any of the locations specified by path.repo because this setting is empty]

Hello, here is the compose file.

services:
  crate:
    image: crate:5.3
    container_name: node-2
    environment:
      - CRATE_HEAP_SIZE=4g
    volumes:
      - cratedb-vol:/data
      - ./config/crate.yml:/crate/config/crate.yml
      - /data/cratedb_local/backup:/backup

    command: ["crate",
     "-Ccluster.name=testcluster",
     "-Cnode.name=crate2",
     "-Cdiscovery.seed_hosts=10.10.1.49,10.10.1.52",
     "-Ccluster.initial_master_nodes=10.10.1.49,10.10.1.50,10.10.1.52",
     "-Cgateway.expected_data_nodes=3",
     "-Cgateway.recover_after_data_nodes=2",
     "-Cauth.host_based.enabled=true",
     "-Cpath.repo=/backup",
     "-Cauth.host_based.config.0.method=trust",
     "-Cauth.host_based.config.0.address=_local_",
     "-Cauth.host_based.config.0.user=crate",
     "-Cauth.host_based.config.99.method=password"]

    restart: always
    network_mode: "host"

volumes:
  cratedb-vol:
    external: true

UPDATE:
I’ve set the permissions on the host directory to 777 (chmod 777 backup).

Crash, connected from a remote terminal, is unable to perform a copy.

copy Doc.tst  to directory '/backup/' ;

returns

NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/tst_1_.json (No such file or directory)']

Hi Georg,

Do you have any idea why CrateDB raises these errors with the docker compose file I’m using?

Is this a Docker or CrateDB configuration issue?

Thank you for your support.

Hi,
If we configure the path to be used for snapshot repositories, after running CREATE REPOSITORY we should use CREATE SNAPSHOT to take a backup of the desired tables instead of COPY TO. Another thing to check is that with the Docker compose file you shared I think each node may be seeing a separate /backup directory, but for snapshots to succeed all nodes should be able to reach the same location.
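For illustration, the sequence described above might look like the following. This is only a sketch: the snapshot name snapshot_1 and the table doc.t01 are placeholders, and it assumes /backup is listed in path.repo on every node and points at the same shared storage for all of them.

```sql
-- Register a filesystem repository. '/backup' must be allowed via path.repo
-- on every node and must resolve to the same shared location for all nodes.
CREATE REPOSITORY myrepo TYPE fs
    WITH (location = '/backup');

-- Snapshot one table (doc.t01 is a placeholder) and wait until it finishes.
CREATE SNAPSHOT myrepo.snapshot_1 TABLE doc.t01
    WITH (wait_for_completion = true);

-- Check the outcome of the snapshot.
SELECT repository, name, state FROM sys.snapshots;
```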

Hello Hernan,

The problem is that I can’t even create the REPOSITORY.

As for COPY TO: I ran it only to verify that CrateDB was able to write to a host directory.

Neither one nor the other worked.

So I end up with a CrateDB cluster made using docker that can’t access the host disk.

I can confirm that writing to the mounted directory from inside the container works, and that files written on the host are visible in the container. Everything is OK at the Docker level.

So I think the problem is with CrateDB.

Georg asked for the entire docker-compose.yml file and I posted it.

Do you think there is something misconfigured in the compose file?

Thank you for your support.

So I think the problem is with CrateDB.

CrateDB runs as user crate within the group crate within the container.
Can you confirm that the user or group crate can write to the volume from inside the container?


version: '3.8'
services:
  cratedb:
    image: crate:latest
    ports:
      - "4200:4200"
    volumes:
      - /tmp/crate/01:/data
      - /Users/georg/Sandbox/docker/repo:/repo
    command: ["crate",
              "-Cnode.name=cratedb01",
              "-Cauth.host_based.enabled=true",
              "-Cauth.host_based.config.0.method=trust",
              "-Cauth.host_based.config.0.user=crate",
              "-Cauth.host_based.config.99.method=password",
              "-Cpath.repo=/repo"
             ]
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
    environment:
      - CRATE_HEAP_SIZE=2g

CONNECT OK
cr> CREATE TABLE t01 AS SELECT 1 a;
CREATE OK, 1 row affected (0.274 sec)

cr> COPY t01 TO DIRECTORY ('/repo');
COPY OK, 1 row affected (0.065 sec)

Hello Georg,

I accessed the Docker container, then ran su crate.

[crate@test backup]$ touch hello_by_crate_user.txt

In the host directory I find the file hello_by_crate_user.txt.

So, from inside the container, as user crate, it is possible to write and read files.

The same is true from the host.

Then:

CONNECT OK
cr> create table t01 as select 1 a;
CREATE OK, 1 row affected (0.190 sec)
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_5_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_1_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_5_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_1_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_5_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_1_.json (No such file or directory)']
cr> copy t01 to directory ('/backup');
NotSerializableExceptionWrapper[unhandled_server_exception: Failed to open output: '/backup/t01_1_.json (No such file or directory)']
cr>

Files created by CrateDB in the /backup dir:

ls backup (from host)

-rw-r--r-- 1 crate crate    0 Sep 15 17:46 t01_1_.json
-rw-r--r-- 1 crate crate    8 Sep 15 17:46 t01_2_.json
-rw-r--r-- 1 crate crate    0 Sep 15 17:46 t01_3_.json
-rw-r--r-- 1 crate crate    0 Sep 15 17:46 t01_4_.json

Only t01_2_.json contains data: {"a":1}

How many nodes does this cluster have?
Do all nodes have access to that folder?

SELECT count(*) from sys.nodes

Hi,

Cluster has 3 nodes.

No, there is a ‘/backup’ folder on each node, but it is a local folder, not shared between hosts.

If you see some exported json on one host, I suspect that you did not set up the volume correctly on all nodes.

Hello,

Every node has access to its own host dir /backup, as shown in the docker-compose.yml I sent previously.

volumes:
  - /data/cratedb_local/backup:/backup

So, correct me if I’m wrong: in order to back up or simply export a table, does the cluster need access to a directory shared between hosts?

So, if I connect to node one of a CrateDB cluster made up of three nodes using ‘crash’ remotely I can’t copy the contents of the tables or even create repositories and snapshots for backup ?

Thank you.

For a local COPY TO it is enough that every node has access to a separate local disk. Every node will just copy the shards that reside on that node.
For a local CREATE REPOSITORY all nodes need access to the same directory.

If a cluster has multiple nodes, you must use a shared data storage volume mounted locally on all master nodes and data nodes.
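To see which node holds which shard of a table, and therefore which node would write which file during a COPY TO, a query along these lines could help. It is a minimal sketch that assumes the table doc.t01 from earlier in this thread and uses the sys.shards system table.

```sql
-- List the primary shards of doc.t01 and the node each one lives on.
-- COPY TO writes one file per shard (for example t01_2_.json for shard 2)
-- on the node that holds that shard.
SELECT id AS shard_id,
       node['name'] AS node_name,
       num_docs
FROM sys.shards
WHERE schema_name = 'doc'
  AND table_name = 't01'
  AND "primary" = true
ORDER BY id;
```

A shard with num_docs = 0 still produces an (empty) file, which would be consistent with the zero-byte files listed above.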

So, if I connect to node one of a CrateDB cluster made up of three nodes using ‘crash’ remotely I can’t copy the contents of the tables or even create repositories and snapshots for backup

It does not matter at all which node you connect to.

Hi, some more concerns.

Could you please confirm the following observations, so that we can proceed correctly :)

  1. To be definitively sure: if I want to back up a CrateDB cluster consisting of x nodes, every host machine that runs a node’s container must have a shared file system, so that the ‘/backup’ directory is shared between all containers.

  2. So to use the ‘fs’ option in ‘CREATE REPOSITORY’, the destination directory must be shared among all hosts (for example by installing GlusterFS, Ceph or another shared file system on each node in the cluster).

  3. The same is true if I want to do a ‘COPY TO’ with CrateDB in cluster mode.

Thank you.

Dear Luca,

Thank you for asking. Sharing filesystems between multiple containers sounds really dangerous to me.

I am not an expert on this matter, but I think creating fs-type repositories is not well suited for container use. Instead, I believe the recommended way to create repositories in advanced setups is to use remote object storage.

I have been able to discover the following options for that:

As I see you are running CrateDB on your own premises, I think the third option would be the best fit for your setup.

With kind regards,
Andreas.

Dear Andreas,

Thanks for your suggestion.
Installed MinIO.
Now COPY TO, CREATE REPOSITORY and CREATE SNAPSHOT are possible in every node of the cluster…
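For anyone landing here later, registering a MinIO bucket as an S3-type repository might look roughly like this. It is only a sketch: the repository name, endpoint, bucket and credentials below are placeholders I made up, not the values from my setup.

```sql
-- Register an S3-compatible (MinIO) bucket as a snapshot repository.
-- All values are hypothetical placeholders; replace them with your own.
CREATE REPOSITORY minio_repo TYPE s3
    WITH (
        endpoint = 'minio.example.internal:9000',  -- assumed MinIO address
        protocol = 'http',                         -- assumes MinIO without TLS
        bucket = 'cratedb-backups',                -- assumed, pre-created bucket
        access_key = 'MINIO_ACCESS_KEY',
        secret_key = 'MINIO_SECRET_KEY'
    );

-- Confirm that the repository is registered.
SELECT name, type FROM sys.repositories;
```

Since every node reaches the bucket over the network, no directory has to be shared between the hosts.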

So, without a file system shared between all hosts, the ‘fs’ (filesystem) option only makes sense for a CrateDB that runs as a single instance, not as a cluster.

Luca
