mirror of
https://github.com/ente-io/ente.git
synced 2025-07-16 03:32:58 +00:00
173 lines
5.3 KiB
Markdown
# Copycat DB

Copycat DB is a [service](../services/README.md) to take a backup of our
database. It uses the Scaleway CLI to take backups of the database, and uploads
them to an offsite bucket.

This bucket has an object lock configured, so backups cannot be deleted before
expiry. Conversely, the service also deletes backups older than some threshold
when it creates a new one to avoid indefinite retention.

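
The pruning rule mentioned above boils down to an age comparison. The following
is a minimal sketch of that comparison only, not the service's actual
implementation: the function name `is_expired`, its epoch-second arguments and
the 30 day threshold used in the example are all assumptions made for
illustration.

```sh
# Hypothetical helper: succeeds (exit 0) when a backup is older than the
# retention threshold. All arguments are assumptions for illustration.
#   $1 = backup creation time (seconds since the epoch)
#   $2 = current time (seconds since the epoch)
#   $3 = retention threshold in days
is_expired() {
    age_seconds=$(( $2 - $1 ))
    threshold_seconds=$(( $3 * 86400 ))
    [ "$age_seconds" -gt "$threshold_seconds" ]
}

# Example: a 45 day old backup checked against a 30 day threshold.
now=$(date +%s)
created=$(( now - 45 * 86400 ))
if is_expired "$created" "$now" 30; then
    echo "backup is past the threshold and may be pruned"
fi
```
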
In production the service runs as a cron job, scheduled using a systemd timer.

> These backups are in addition to the regular snapshots that we take, and are
> meant as a second layer of replication. For more details, see our
> [Reliability and Replication Specification](https://ente.io/reliability).

## Quick help

View service status (it gets invoked as a timer automatically, doesn't need to
be started/stopped manually):

```sh
sudo systemctl status copycat-db
```

View logs locally (they'll also be available on Grafana):

```sh
sudo tail /root/var/logs/copycat-db.log
```

## Name

The name copycat-db is a riff on "copycat", which is what we call our museum
instance that does the object replication. This one replicates the DB, so,
copycat-db.

## Required environment variables

##### SCW_CONFIG_PATH

Path to the `config.yaml` used by Scaleway CLI.

This contains the credentials and the default region to use when trying to
create and download the database dump.

If needed, this config file can be generated by running the following commands
on a shell prompt in the container (using `./test.sh sh`)

    scw init
    scw config dump

##### SCW_RDB_INSTANCE_ID

The UUID of the Scaleway RDB instance that we wish to back up. If this is
missing, then the Docker image falls back to using `pg_dump` (as outlined next).

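
The fallback described for `SCW_RDB_INSTANCE_ID` amounts to a branch on whether
the variable is set. A minimal sketch, assuming a hypothetical function name
`backup_method`: it only reports which path would be taken, and is not the
actual script inside the image.

```sh
# Hypothetical helper: prints which backup path would be chosen.
backup_method() {
    if [ -n "${SCW_RDB_INSTANCE_ID:-}" ]; then
        echo "scw"      # production path: backup taken via the Scaleway CLI
    else
        echo "pg_dump"  # fallback path: relies on the PG* variables
    fi
}

backup_method
```
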
##### PGUSER, PGPASSWORD, PGHOST

Not needed in production when taking a backup (since we use the Scaleway CLI to
take backups in production).

These are used when testing a backup using `pg_dump`, and when restoring
backups.

##### RCLONE_CONFIG

Location of the config file that contains the destination bucket where the
backups are saved, and the credentials to access it.

Specifically, the config file contains two remotes:

- The bucket itself, where data will be stored.

- A "crypt" remote that wraps the bucket by applying client side encryption.

The configuration file will contain (lightly) obfuscated versions of the
password, and as long as we have the configuration file we can continue using
rclone to download and decrypt the backups. Still, it is helpful to retain the
original password separately so that the file can be recreated if needed.

A config file can be generated using `./test.sh sh`

    rclone config
    rclone config show

When generating the config, we keep file (and directory) name encryption off.

Note that rclone creates a backup of the config file, so Docker needs to have
write access to the directory where it is mounted.

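
To make the two-remote layout concrete, here is a sketch of what such a config
file might look like, written to a temporary file so that nothing real is
overwritten. The remote name `db-backup`, the S3-compatible backend, and every
`<...>` placeholder are assumptions; `db-backup-crypt` is borrowed from the
`RCLONE_DESTINATION` example in this document. The real file should be
produced with `rclone config` as described above.

```sh
# Write an illustrative rclone config to a temporary file.
# Remote names, bucket name and all credential values are placeholders.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[db-backup]
type = s3
provider = Other
access_key_id = <access-key>
secret_access_key = <secret-key>
endpoint = <endpoint-url>

[db-backup-crypt]
type = crypt
remote = db-backup:<bucket-name>
filename_encryption = off
directory_name_encryption = false
password = <obscured-password>
EOF

echo "wrote example config to $conf"
```
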
##### RCLONE_DESTINATION

Name of the (crypt) remote to which the dump should be saved. Example:
`db-backup-crypt:`.

Note that this will not include the bucket - the bucket name will be part of the
remote that the crypt remote wraps.

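
Since a missing variable only surfaces when the job runs, a small pre-flight
check can be handy when preparing the env file. This is a sketch under
assumptions: the helper `missing_vars` is hypothetical and is not part of the
image.

```sh
# Hypothetical pre-flight check: prints the name of each variable that is
# unset or empty, one per line.
missing_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

# Variables needed for a production backup, per the sections above.
missing_vars SCW_CONFIG_PATH SCW_RDB_INSTANCE_ID RCLONE_CONFIG RCLONE_DESTINATION
```

An empty output means all four variables are set.
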
##### Logging

The service logs to its standard out/error. The systemd unit is configured to
route these to `/root/var/logs/copycat-db.log`.

## Local testing

The provided `test.sh` script can be used to do a smoke test for building and
running the image. For example,

    ./test.sh bin/bash

gives us a shell prompt inside the built and running container.

For more thorough testing, run this service as part of a local test-cluster.

## Restoring

The service also knows how to restore the latest backup into a Postgres
instance. This functionality is used by a separate service (Phoenix) to
periodically verify that the backups are restorable.

To invoke this, use `./restore.sh` as the command when running the container
(e.g. `./test.sh ./restore.sh`). This will restore the latest backup into the
Postgres instance whose credentials are provided via the various `PG*`
environment variables.

## Preparing the bucket

The database dumps are stored in a bucket that has object lock enabled
(compliance mode), and has a default bucket level retention time of 30 days.

## Deploying

Ensure that promtail is running, and is configured to scrape
`/root/var/logs/copycat-db.log`.

Create the config and log destination directories

    sudo mkdir -p /root/var/config/scw
    sudo mkdir -p /root/var/config/rclone
    sudo mkdir -p /root/var/logs

Create the env, scw and rclone configuration files

    sudo tee /root/copycat-db.env
    sudo tee /root/var/config/scw/copycat-db-config.yaml
    sudo tee /root/var/config/rclone/copycat-db-rclone.conf

Add the service definition, and start the service

    scp copycat-db.{service,timer} instance:

    sudo mv copycat-db.{service,timer} /etc/systemd/system
    sudo systemctl daemon-reload

To start the cron job

    sudo systemctl start copycat-db.timer

The timer will trigger the service on the specified schedule. In addition, if
you wish to force the job to run immediately

    sudo systemctl start copycat-db.service

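
For readers unfamiliar with systemd timers, the sketch below writes an
illustrative timer unit to a temporary directory. It is not the
`copycat-db.timer` shipped with this service: the daily schedule and the
description text are assumptions. By default a timer activates the service
unit that shares its name, which is why the timer and service files are
deployed as a pair.

```sh
# Write an illustrative timer unit to a temporary directory.
# The schedule below is an assumption, not the production schedule.
unit_dir="$(mktemp -d)"
cat > "$unit_dir/copycat-db.timer" <<'EOF'
[Unit]
Description=Schedule copycat-db (illustrative)

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "wrote example timer to $unit_dir/copycat-db.timer"
```
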
## Updating

To update, run the
[GitHub workflow](../../.github/workflows/copycat-db-release.yaml) to build and
push the latest image to our Docker Registry, then restart the systemd service
on the instance

    sudo systemctl restart copycat-db