I have been trying to find information about how to back up CrateDB when it runs in a Kubernetes cluster using the tool Velero for backups.
I am concerned about how to ensure that you have a consistent backup of CrateDB if you do snapshots of volumes where the database files reside.
Does anyone have an example of backing up CrateDB with Velero?
Thank you for reaching out to us!
Snapshot backups of the underlying PV (disk snapshotting) are problematic, since they are taken per volume without any coordination across the CrateDB nodes.
We recommend using the built-in backup, which takes care of cluster coordination and supports full backups, differentials at the repository level, single-table restore, and restore to the same cluster or to another cluster.
There are several options for the backup repository (e.g. an S3 bucket), depending on your requirements.
Please let me know if there are any specific questions we can help you with.
I will look into the built-in backup some more.
Let me know if there is anything around this we can help you with. It should be quite straightforward.
What you could do:
- Set up a k8s CronJob.
- For the backup itself, start by adding a backup repository. On k8s the best option is to write to an S3-compatible endpoint outside the k8s cluster, or alternatively to Azure Blob Storage. MinIO is also an option, and you could run it inside k8s.
- Next comes CREATE SNAPSHOT (this is probably the statement the CronJob runs).
- To monitor it, you could get fancy and use sql_exporter to expose the time of the last backup as a metric, e.g. for Prometheus (burningalchemist/sql_exporter on GitHub, a database-agnostic SQL exporter for Prometheus).
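
As a rough sketch of the statements involved, the small Python helper below only builds the SQL strings; it does not connect to anything. The repository name, bucket, endpoint, and credentials are placeholders, and the `snap_YYYYMMDD_HHMMSS` naming scheme is just an assumption for illustration. Please check the S3 repository parameters against the CrateDB documentation for your version before using them.

```python
from datetime import datetime, timezone


def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_repository_sql(repo: str, bucket: str, endpoint: str,
                          access_key: str, secret_key: str) -> str:
    """One-time setup: register an S3-compatible bucket as a backup repository."""
    return (
        f"CREATE REPOSITORY {repo} TYPE s3 WITH ("
        f"bucket = {sql_literal(bucket)}, "
        f"endpoint = {sql_literal(endpoint)}, "
        f"access_key = {sql_literal(access_key)}, "
        f"secret_key = {sql_literal(secret_key)})"
    )


def create_snapshot_sql(repo: str, now: datetime) -> str:
    """Statement for the CronJob: snapshot all tables under a timestamped name."""
    name = now.strftime("snap_%Y%m%d_%H%M%S")  # hypothetical naming scheme
    return f"CREATE SNAPSHOT {repo}.{name} ALL WITH (wait_for_completion = true)"


# Query a metrics exporter could run to expose the time of the last good backup.
LAST_BACKUP_SQL = (
    "SELECT max(finished) AS last_backup "
    "FROM sys.snapshots WHERE state = 'SUCCESS'"
)


if __name__ == "__main__":
    # All values below are placeholders, not real endpoints or credentials.
    print(create_repository_sql("backup_repo", "my-bucket",
                                "minio.example.internal:9000",
                                "ACCESS_KEY", "SECRET_KEY"))
    print(create_snapshot_sql("backup_repo", datetime.now(timezone.utc)))
    print(LAST_BACKUP_SQL)
```

The CronJob container would then send the generated CREATE SNAPSHOT statement to CrateDB with whatever client you prefer (crash, the HTTP endpoint, or a driver).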
The first CREATE SNAPSHOT triggers a full backup, and subsequent snapshots are incremental. By deleting the oldest snapshot you can maintain a fixed number of backup versions. Think of each snapshot as a virtual full backup. I am getting too fancy now.
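
To illustrate the retention idea, here is a minimal sketch that decides which snapshots to drop so that only the newest few remain. It assumes you have already fetched `(name, started)` pairs, for example with `SELECT name, started FROM sys.snapshots`; the function names and the `keep` parameter are made up for this example.

```python
def snapshots_to_drop(rows: list[tuple[str, int]], keep: int) -> list[str]:
    """Return the names of the oldest snapshots beyond the newest `keep`.

    `rows` holds (snapshot_name, started_timestamp) pairs.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(rows, key=lambda row: row[1])  # oldest first
    surplus = max(0, len(ordered) - keep)
    return [name for name, _ in ordered[:surplus]]


def drop_snapshot_sql(repo: str, name: str) -> str:
    """Build the statement that removes one snapshot from a repository."""
    return f"DROP SNAPSHOT {repo}.{name}"


if __name__ == "__main__":
    # Example rows with placeholder names and timestamps.
    rows = [("snap_c", 300), ("snap_a", 100), ("snap_b", 200)]
    for name in snapshots_to_drop(rows, keep=2):
        print(drop_snapshot_sql("backup_repo", name))
```

Running this after each successful CREATE SNAPSHOT in the same CronJob keeps the number of versions constant.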
Hope that helps.