Panic: restarting the Kubernetes cluster loses all blobs but retains the doc tables

Hi all,

In my home lab I'm running CrateDB on Kubernetes, installed with the crate-operator.

I created a blob table and uploaded three files:

sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ crash --verbose --hosts "192.168.151.21" -U system
+----------------------------+------------+---------+-----------+---------+
| server_url                 | node_name  | version | connected | message |
+----------------------------+------------+---------+-----------+---------+
| http://192.168.151.21:4200 | data-hot-1 | 5.8.1   | TRUE      | OK      |
+----------------------------+------------+---------+-----------+---------+
CONNECT OK
CLUSTER CHECK OK
TYPES OF NODE CHECK OK
cr> select * from blob.attachments;
+--------+---------------+
| digest | last_modified |
+--------+---------------+
+--------+---------------+
SELECT 0 rows in set (0.021 sec)
cr> select * from blob.attachments;
+------------------------------------------+---------------+
| digest                                   | last_modified |
+------------------------------------------+---------------+
| acfa51b963b5a1fb9eecaf7954ae39c9bea736cc | 1735549075725 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735549089520 |
| 135b6d193c9acba8ea180dea7424b863ce52858b | 1735549054635 |
+------------------------------------------+---------------+
SELECT 3 rows in set (0.016 sec)
cr> \q
Bye!

Then I stopped the Kubernetes cluster (it's a home lab; it is powered off every night and powered on every morning).

To my surprise, when I started the cluster this morning all blobs were gone.

To be sure, I uploaded the three files again, stopped the cluster, and started it again: all blobs were gone:

sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ crash --verbose --hosts "192.168.151.21" -U system
+----------------------------+------------+---------+-----------+---------+
| server_url                 | node_name  | version | connected | message |
+----------------------------+------------+---------+-----------+---------+
| http://192.168.151.21:4200 | data-hot-1 | 5.8.1   | TRUE      | OK      |
+----------------------------+------------+---------+-----------+---------+
CONNECT OK
CLUSTER CHECK OK
TYPES OF NODE CHECK OK
cr> select * from blob.attachments;
+--------+---------------+
| digest | last_modified |
+--------+---------------+
+--------+---------------+
SELECT 0 rows in set (0.019 sec)
cr> 

What is happening?

Hi,
This is likely related to the reclaimPolicy of the storageClass you specified in the definition of your CrateDB resource; it is probably set to Delete. Try changing it to Retain.
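
As far as I know the reclaimPolicy of an existing StorageClass cannot be changed (it only applies to newly provisioned volumes), so for the PVs that already exist it can be patched on each PV directly, along these lines (<pv-name> is a placeholder for the volume name shown by kubectl get pv):

# list the PersistentVolumes and note the one backing the CrateDB PVC
kubectl get pv
# switch its reclaim policy to Retain so it survives the claim being deleted
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'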

You are right, the reclaimPolicy is Delete:

sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ kubectl describe storageclasses longhorn
Name:            longhorn
IsDefaultClass:  Yes
Annotations:     longhorn.io/last-applied-configmap=kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: "Delete"
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "30"
  fromBackup: ""
  fsType: "ext4"
  dataLocality: "disabled"
  unmapMarkSnapChainRemoved: "ignored"
,storageclass.kubernetes.io/is-default-class=true
Provisioner:           driver.longhorn.io
Parameters:            dataLocality=disabled,fromBackup=,fsType=ext4,numberOfReplicas=3,staleReplicaTimeout=30,unmapMarkSnapChainRemoved=ignored
AllowVolumeExpansion:  True
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
sysop@h5a-dev:~/h5a/software/pcams/storehouse/cratedb/attachments$ 

Now I’ll try to change this and will report here.
Thanks

I see that the Longhorn chart says:

 # -- Reclaim policy that provides instructions for handling of a volume after its claim is released. (Options: "Retain", "Delete")
  reclaimPolicy: Delete

So, if I understand correctly, when CrateDB shuts down the PVC is released?

Is this correct?

Hi @hernanc, I'm confused.

It seems that Longhorn cannot change the reclaimPolicy without reinstalling everything.

But if I look at the PVCs I see that the CrateDB volumes were created 25 days ago.
If I understand correctly, the PVs were not deleted and recreated at every startup:

NAMESPACE     NAME                                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
kube-system   docker-registry                     Bound    pvc-b375793c-ed9c-4335-9605-4e490e1e5956   10Gi       RWO            longhorn       <unset>                 238d
apisix        data-apisix-etcd-1                  Bound    pvc-9433785a-2c42-41fa-8be4-5e3cadb5123d   8Gi        RWO            longhorn       <unset>                 170d
apisix        data-apisix-etcd-2                  Bound    pvc-0279c77c-73df-4ece-a461-a3d091dd7353   8Gi        RWO            longhorn       <unset>                 170d
apisix        data-apisix-etcd-0                  Bound    pvc-c4571652-2a78-4f1c-aa12-c08432a20805   8Gi        RWO            longhorn       <unset>                 170d
for-crate     debug-crate-data-hot-my-cluster-2   Bound    pvc-8b74ff39-d2bc-448f-a069-16639314ff2c   4Gi        RWO            longhorn       <unset>                 25d
for-crate     debug-crate-data-hot-my-cluster-0   Bound    pvc-505b1dbd-65fb-4405-933f-1b0a3119e38c   4Gi        RWO            longhorn       <unset>                 25d
for-crate     data0-crate-data-hot-my-cluster-2   Bound    pvc-aa3b74a4-ccd0-45cb-bd41-9908c46e45dc   16Gi       RWO            longhorn       <unset>                 25d
for-crate     data0-crate-data-hot-my-cluster-1   Bound    pvc-a6b82e5a-c1fb-4db0-b3bb-062cf125fe94   16Gi       RWO            longhorn       <unset>                 25d
for-crate     data0-crate-data-hot-my-cluster-0   Bound    pvc-4ebd3a0f-1249-4448-b5d8-ef130c661bda   16Gi       RWO            longhorn       <unset>                 25d
for-crate     debug-crate-data-hot-my-cluster-1   Bound    pvc-43445265-3ce8-4142-8803-00d4434d43e8   4Gi        RWO            longhorn       <unset>                 25d
sysop@h5a-dev:~$ 

Am I wrong?

BTW, the etcd and docker-registry volumes survive every restart, and for all of them the reclaimPolicy is set to Delete.

Hi @hernanc, the situation is stranger than I supposed: only the blobs are lost. The doc tables are still there.

I'll also change the thread title.

Before the restart

cr> SELECT count(*) from community_areas;
+----------+
| count(*) |
+----------+
|       77 |
+----------+
SELECT 1 row in set (0.003 sec)
cr> select * from blob.attachments;
+------------------------------------------+---------------+
| digest                                   | last_modified |
+------------------------------------------+---------------+
| acfa51b963b5a1fb9eecaf7954ae39c9bea736cc | 1735582735276 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735582803476 |
+------------------------------------------+---------------+
SELECT 2 rows in set (0.012 sec)
cr>

After the restart

cr> SELECT count(*) from community_areas;
+----------+
| count(*) |
+----------+
|       77 |
+----------+
SELECT 1 row in set (0.013 sec)
cr> select * from blob.attachments;
+--------+---------------+
| digest | last_modified |
+--------+---------------+
+--------+---------------+
SELECT 0 rows in set (0.020 sec)
cr>

Completely lost. I don't know what to do.

Hi again @MirtoBusico, you are right, thank you for testing further and coming back to us. It seems this may indeed not be handled well in the operator; I have raised it with the relevant team. In the meantime, I think you will need to manually provision a location to store the blobs and then configure blobs.path.
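
blobs.path can be set as a node-level setting or per blob table. Roughly (untested with the operator; the path and table name are only examples):

# node-level setting, passed on the crate command line (or set in crate.yml)
crate -Cblobs.path=/data/data0/blobs

# or per blob table, via crash
crash --hosts "192.168.151.21" -U system -c \
  "CREATE BLOB TABLE attachments WITH (blobs_path = '/data/data0/blobs/attachments')"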

Hi @hernanc, I'm trying to find a path usable for blobs, but I'm running into some difficulties.

In the documentation I find:

blobs.path
Runtime: no

Path to a filesystem directory where to store blob data allocated for this node.

By default blobs will be stored under the same path as normal data. A relative path value is interpreted as relative to CRATE_HOME.

But, looking into the console of one of the crate pods, I see that the CRATE_HOME variable is not set:

/ $ echo $CRATE_HOME

/ $

Also, I was not able to find where the PV is mounted:

/ $ mount
overlay on / type overlay (rw,relatime,lowerdir=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2061/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2060/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2059/fs,upperdir=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2785/fs,workdir=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2785/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup type cgroup2 (ro,nosuid,nodev,noexec,relatime)
/dev/vda1 on /config type ext4 (ro,relatime,errors=remount-ro)
/dev/vda1 on /etc/hosts type ext4 (rw,relatime,errors=remount-ro)
/dev/vda1 on /dev/termination-log type ext4 (rw,relatime,errors=remount-ro)
/dev/vda1 on /etc/hostname type ext4 (rw,relatime,errors=remount-ro)
/dev/vda1 on /etc/resolv.conf type ext4 (rw,relatime,errors=remount-ro)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k,inode64)
tmpfs on /var/run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime,size=8130820k,inode64)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/asound type tmpfs (ro,relatime,inode64)
tmpfs on /proc/acpi type tmpfs (ro,relatime,inode64)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /sys/firmware type tmpfs (ro,relatime,inode64)
/ $

The env command says:

/ $ env
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.43.0.1:443
HOSTNAME=crate-data-hot-my-cluster-0
SHLVL=1
HOME=/home
CRATE_MY_CLUSTER_PORT_4200_TCP_ADDR=10.43.47.153
CRATE_MY_CLUSTER_SERVICE_PORT_HTTP=4200
CRATE_MY_CLUSTER_SERVICE_PORT_PSQL=5432
CRATE_MY_CLUSTER_PORT_4200_TCP_PORT=4200
CRATE_MY_CLUSTER_PORT_4200_TCP_PROTO=tcp
TERM=xterm
KUBERNETES_PORT_443_TCP_ADDR=10.43.0.1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
CRATE_MY_CLUSTER_PORT_5432_TCP_ADDR=10.43.47.153
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
CRATE_MY_CLUSTER_SERVICE_HOST=10.43.47.153
CRATE_MY_CLUSTER_PORT_5432_TCP_PORT=5432
CRATE_MY_CLUSTER_PORT_5432_TCP_PROTO=tcp
CRATE_MY_CLUSTER_PORT_4200_TCP=tcp://10.43.47.153:4200
KUBERNETES_SERVICE_PORT_HTTPS=443
CRATE_MY_CLUSTER_SERVICE_PORT=4200
CRATE_MY_CLUSTER_PORT=tcp://10.43.47.153:4200
KUBERNETES_PORT_443_TCP=tcp://10.43.0.1:443
KUBERNETES_SERVICE_HOST=10.43.0.1
PWD=/
CRATE_MY_CLUSTER_PORT_5432_TCP=tcp://10.43.47.153:5432

So my questions:

  1. What is the path used for the doc tables?
  2. Can this path also be used for blobs?
  3. Where is the PVC mounted?

Sorry, but I'm lost.

UPDATE
I discovered that the crate node pod has two containers: sql-exporter and crate.
The previous output is from the sql-exporter container.

Looking at the crate container, I see that mount reports:

[root@crate-data-hot-my-cluster-0 data]# mount
overlay on / type overlay (rw,relatime,lowerdir=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2057/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2056/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2055/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2054/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2053/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2052/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2051/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2050/fs:/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2049/fs,upperdir=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2786/fs,workdir=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/2786/work)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (ro,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup type cgroup2 (ro,nosuid,nodev,noexec,relatime)
/dev/longhorn/pvc-505b1dbd-65fb-4405-933f-1b0a3119e38c on /resource type ext4 (rw,relatime)
/dev/vda1 on /data type ext4 (rw,relatime,errors=remount-ro)
/dev/longhorn/pvc-4ebd3a0f-1249-4448-b5d8-ef130c661bda on /data/data0 type ext4 (rw,relatime)
/dev/vda1 on /etc/hosts type ext4 (rw,relatime,errors=remount-ro)
/dev/vda1 on /dev/termination-log type ext4 (rw,relatime,errors=remount-ro)
/dev/vda1 on /etc/hostname type ext4 (rw,relatime,errors=remount-ro)
/dev/vda1 on /etc/resolv.conf type ext4 (rw,relatime,errors=remount-ro)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=65536k,inode64)
/dev/vda1 on /var/lib/crate/crate-jmx-exporter-1.2.0.jar type ext4 (rw,relatime,errors=remount-ro)
tmpfs on /run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime,size=8130820k,inode64)
proc on /proc/bus type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/fs type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/irq type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sys type proc (ro,nosuid,nodev,noexec,relatime)
proc on /proc/sysrq-trigger type proc (ro,nosuid,nodev,noexec,relatime)
tmpfs on /proc/asound type tmpfs (ro,relatime,inode64)
tmpfs on /proc/acpi type tmpfs (ro,relatime,inode64)
tmpfs on /proc/kcore type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/keys type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /proc/timer_list type tmpfs (rw,nosuid,size=65536k,mode=755,inode64)
tmpfs on /sys/firmware type tmpfs (ro,relatime,inode64)
[root@crate-data-hot-my-cluster-0 data]# 

It seems that the data PV is mounted at “/data/data0/”:

/dev/longhorn/pvc-4ebd3a0f-1249-4448-b5d8-ef130c661bda on /data/data0 type ext4 (rw,relatime)
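
The same mount point can also be checked from outside the pod with kubectl (assuming the container is really named crate, as it appears in the pod):

kubectl -n for-crate get pod crate-data-hot-my-cluster-0 \
  -o jsonpath='{.spec.containers[?(@.name=="crate")].volumeMounts}'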

I’ll try to create the blobs under this path and I’ll report here.

Partial success

I recreated the blob table, specifying the path:

cr> drop blob table attachments;
DROP OK, 1 row affected (0.158 sec)
cr> create blob table attachments with (blobs_path='/data/data0/myblobs/attachments');
CREATE OK, 1 row affected (0.298 sec)
cr>
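
For reference, this is roughly how I upload a blob through the HTTP blob API (the host is my node; the file name is just an example):

# the blob API expects the SHA-1 digest of the content in the URL
digest=$(sha1sum attachment1.pdf | awk '{print $1}')
curl -isS -X PUT "http://192.168.151.21:4200/_blobs/attachments/${digest}" --data-binary @attachment1.pdf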

Then I added two blobs, and I see:

cr> select count(*) from community_areas;
+----------+
| count(*) |
+----------+
|       77 |
+----------+
SELECT 1 row in set (0.018 sec)
cr> select * from blob.attachments;
+------------------------------------------+---------------+
| digest                                   | last_modified |
+------------------------------------------+---------------+
| 135b6d193c9acba8ea180dea7424b863ce52858b | 1735648816097 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735648828370 |
+------------------------------------------+---------------+
SELECT 2 rows in set (0.012 sec)
cr>

Then I stopped the cluster, restarted it, and I still see:

cr> select count(*) from community_areas;
+----------+
| count(*) |
+----------+
|       77 |
+----------+
SELECT 1 row in set (0.015 sec)
cr> select * from blob.attachments;
+------------------------------------------+---------------+
| digest                                   | last_modified |
+------------------------------------------+---------------+
| 135b6d193c9acba8ea180dea7424b863ce52858b | 1735648816097 |
| 9a3993e2ecbbb3b1ea1435650cedd7fec2454e73 | 1735648828370 |
+------------------------------------------+---------------+
SELECT 2 rows in set (0.017 sec)

So it seems to work.

Thanks for your time.

P.S. I still occasionally get an “HTTP/1.1 307 Temporary Redirect” error when uploading a blob, but for that I'll open a new thread.