Since July 2019, we have used borg to back up most of our servers. Before that, we used custom scripts running on the server "cleve". This page describes the setup and explains how to add a new server to the backup process and how to restore data if necessary.
The principle is simple: each server takes care of its own backups and can only access its own backups. Every night, a cron job triggers a wrapper script (borg-wrapper), which runs borg to back up to a remote storage. The server authenticates to the remote storage with an SSH key. With this key, the server can only modify its own backups.
A log file for each backup cron run is sent to the FSFE admin+backup@ address. It contains the --stats --verbose output of the borg create and prune commands, as well as the output of any pre- or post-backup commands, by default a list of all current borg archives.
We automatically take regular snapshots of the complete remote storage, and the individual servers cannot modify these snapshots. We also use a sub-account so that the servers cannot access other directories on the remote storage. This way, we have some safety if a server is compromised.
Add new server
If you have created a new server, please configure its backup immediately! To do so, clone the git repo fsfe-backup (access restricted to System Hackers). It contains an Ansible playbook which takes care of the following steps:
- install borg and an SMTP server
- generate an SSH key and store it on the remote storage
- create a borg backup repository on the remote storage
- set up the wrapper script and cron job on the server
To add the new server, add the server's FQDN (e.g. server.fsfe.org) to the file inventory in the client section.
Afterwards, run ansible-vault create host_vars/server.fsfe.org to create the configuration for the new server. In the new file, insert the new password used to encrypt backups:
    ---
    borgbackup_passphrase: aLongAndSecurePassword
Note that the password for these encrypted files is stored in vaultpw.pgp, a GPG-encrypted file which only a few people have access to. Please only add new passwords (and servers) if you can open this file. In the new host_vars file, you can also override default variables.
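The value of borgbackup_passphrase should be long and random. The following is a minimal sketch for generating one, assuming a Linux system that provides /dev/urandom and the base64 utility. The function name gen_passphrase and the default length of 40 characters are illustrative choices and not part of the fsfe-backup repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of fsfe-backup): print a random passphrase.
# Usage: gen_passphrase [length]   (the default of 40 is an arbitrary choice)
gen_passphrase() {
    length=${1:-40}
    # Read $length random bytes, encode them as base64 text on one line,
    # and keep the first $length characters of that text.
    head -c "$length" /dev/urandom | base64 | tr -d '\n' | cut -c "1-$length"
}
```

The output of gen_passphrase can then be pasted as the passphrase value into the file opened by ansible-vault create.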
Now you can run the playbook: ansible-playbook -i inventory -l server.fsfe.org backup.yml. If you run this for multiple servers at once, please add -f 1 as a parameter. Otherwise, several processes may rewrite the authorized_keys file on the remote storage at the same time, which can cause issues.
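As an illustration of the -f 1 rule, the sketch below assembles the playbook command for one or several hosts and only prints it. The function name playbook_cmd is hypothetical. It relies on the assumption that -l accepts a comma-separated list of hosts, as Ansible host patterns generally do.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the ansible-playbook command line
# for the given hosts, adding "-f 1" when more than one host is given.
playbook_cmd() {
    # Join all arguments with commas, e.g. "a.fsfe.org,b.fsfe.org".
    limit=$(printf '%s,' "$@")
    limit=${limit%,}
    forks=""
    if [ "$#" -gt 1 ]; then
        forks=" -f 1"
    fi
    printf 'ansible-playbook -i inventory -l %s%s backup.yml\n' "$limit" "$forks"
}
```

Printing the command first makes it easy to review the host list before running it by hand.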
If everything went fine, please commit and push your changes to Git!
Run scripts before backup
It may be useful to run commands right before the backup, e.g. to dump databases.
By default, the wrapper script runs /root/bin/backup.sh before the backup. Add your commands in this script file and make it executable. For example, it could look like this:
    echo "Dump Postgres databases"
    sudo -u fsfe -- sh -c "cd / ; pg_dump > /home/fsfe/postgres_dump.sql"
The output will be mailed with the borg logs to the specified email address.
To avoid having to type the remote storage's password for some operations, add your SSH key to the authorized_keys file. Please note that keys have to be in RFC4716 format, see this page.
The playbook configures the cron job to send an email. In order to test whether your server actually sends emails correctly, run ls | sendmail email@example.com. If nothing arrives in your mail account (also check spam!), please debug.
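A test message is easier to find in the inbox (or spam folder) when it carries a subject line. The following is a minimal sketch of such a message; the function name test_mail and the wording of the message are illustrative, and the recipient address is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal test message with a Subject header.
test_mail() {
    # Header line, then an empty line that separates headers from the body.
    printf 'Subject: borg backup mail test from %s\n\n' "$(uname -n)"
    printf 'Sent on %s to check mail delivery for the backup cron job.\n' "$(date)"
}

# Usage (the address is a placeholder):
#   test_mail | sendmail email@example.com
```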
Restore from backup
In order to restore content of a specific borg archive, you can mount it using the wrapper script:
- Create a mountpoint: mkdir /mnt/borg
- Find the archive you would like to mount: borg-wrapper list
- Mount the archive to the local mountpoint: borg-wrapper mount 20190716-1255 /mnt/borg
Now you can search for and copy the desired data. Afterwards, please unmount the mountpoint with umount /mnt/borg.
The wrapper script does not currently support extracting a complete archive. However, by inspecting the script at /usr/local/bin/borg-wrapper, it should be fairly easy to build the command for this.
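As a rough starting point, a full extraction with plain borg might look like the sketch below. It only prints the command. The function name extract_cmd is hypothetical, and the repository URL has to be taken from the wrapper script, since it is not documented on this page.

```shell
#!/bin/sh
# Sketch only: print (not run) a borg command that extracts a whole archive.
# The repository URL passed in is a placeholder; take the real one from
# /usr/local/bin/borg-wrapper.
extract_cmd() {
    repo=$1
    archive=$2
    # "borg extract" unpacks into the current directory, so the printed
    # command should be run from inside an empty restore directory.
    printf 'borg extract --list %s::%s\n' "$repo" "$archive"
}
```

For example, extract_cmd ssh://user@storage.example/./repo 20190716-1255 prints borg extract --list ssh://user@storage.example/./repo::20190716-1255. The passphrase and SSH key used by the wrapper would still have to be supplied, for instance through borg's BORG_PASSCOMMAND and BORG_RSH environment variables.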
To remove all files generated by the Ansible playbook, e.g. in order to disable backups of a client, follow these steps:
- On the client, remove the files /etc/cron.d/borg-backup, /root/.borg.passphrase, /root/.ssh/id_borg_rsa*, and /usr/local/bin/borg-wrapper.
- On the client, remove the part added by the playbook from the modified file /root/.ssh/config. If that is the only content of the file, you can remove the file entirely.
- On the remote storage, remove the client's SSH key from the .ssh/authorized_keys file. Make sure to edit the correct file: seen from the main account's root directory, it is /BorgBackup/.ssh/authorized_keys.
- On the remote storage, delete the client's directory if you want to remove all backups as well. Seen from the main account's root directory, it is /BorgBackup/servers/server.fsfe.org. However, if the server was used in production, it is recommended not to delete its old backups, in case someone needs some information after the deletion.
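The removal of the client-side files in the first step above could be scripted as in the following sketch. The function name remove_borg_client_files and its optional prefix argument are hypothetical. The prefix exists only so that the function can be tried against a scratch directory first. The /root/.ssh/config file is deliberately left alone, because it has to be edited by hand.

```shell
#!/bin/sh
# Hypothetical helper: remove the client-side files created by the playbook.
# Usage: remove_borg_client_files [root]
# "root" is a directory prefix for trying this on a scratch directory;
# leave it empty to act on the real filesystem.
remove_borg_client_files() {
    root=${1:-}
    rm -f "$root/etc/cron.d/borg-backup" \
          "$root/root/.borg.passphrase" \
          "$root/usr/local/bin/borg-wrapper"
    # The glob must stay outside the quotes so it matches both key files.
    rm -f "$root"/root/.ssh/id_borg_rsa*
}
```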
Mail from cron job not sent
This can happen both with exim4 and postfix:
- exim4: The log files show R=nonlocal: Mailing to remote domains not supported. To solve this, either configure exim to support remote domains (dc_eximconfig_configtype='internet'), use a relay server for them, or, if exim4 is not used for anything else, install nullmailer instead. nullmailer is much lighter and works by default.
- postfix: The log files show <firstname.lastname@example.org>: Sender address rejected: Domain not found, which is a sign that the receiving (or relaying) mail server blocked the email because the host's domain is invalid. You can solve this by manually setting the domain postfix should use: mydomain = fsfeurope.org (or fsfe.org, depending on the FQDN of the server).
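To check quickly which of the two cases applies, a log file can be searched for the error messages quoted above. The following is a minimal sketch. The function name diagnose_mail_log is hypothetical, and the log file location is passed in by the caller, because it depends on the system.

```shell
#!/bin/sh
# Hypothetical helper: look for the two error signatures in a mail log
# and print which fix applies. The log path is passed in by the caller.
diagnose_mail_log() {
    log=$1
    if grep -qF 'Mailing to remote domains not supported' "$log"; then
        echo "exim4: enable remote domains, use a relay, or install nullmailer"
    fi
    if grep -qF 'Sender address rejected: Domain not found' "$log"; then
        echo "postfix: set mydomain to the server's domain"
    fi
}
```

On a Debian system, the relevant logs are typically /var/log/exim4/mainlog for exim4 and /var/log/mail.log for postfix. These paths are assumptions and may differ on your server.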