Backup Infrastructure

From Tardis
Revision as of 17:40, 24 February 2006 by Pert (talk)

The other half of the RAID array for the homedirs in dalek is sitting in brigadier. We have no backups of homedirs in place, other than a copy from about a year ago.

The same applies to the webserver: although its RAID is in place, the disk has some very bizarre filesystem errors (re: ~kate/qcmweb).

Things to back up

User data

This is stuff we just can't get back/rebuild

LDAP
Currently backed up to /root/ldap/backup on baker (the LDAP server). This should be backed up elsewhere, for obvious reasons.
Home dirs
RAIDed again but no regular backups as yet.
Mail dirs
Similarly: RAIDed, but no regular backups as yet.
Web dirs
RAID-mirrored. Not backed up.
Databases
Not backed up. We should probably do SQL dumps, rather than dumps of the raw db backend.

Other stuff

System configurations
LCFG proved too much of a beast to handle. Any suggestions for the best way to do this?
Tarballing /etc would help in a lot of cases, but isn't a proper solution.
System logs
There's probably some sense in this, if we get haX0rised, but that's probably OTT.
Might be worth sending syslogs to a log server, which only l33t admins can access? - sjh
Seems such a facility already existed on dalek, it just hadn't been used since early 2004 - this was like 300MB of logs a day - would want to seriously bitshift this down to reasonable size if we want to consider storing them - Seth 18:28, 29 January 2006 (GMT)
The wiki
MediaWiki has facilities for this - probably want to automate... - From what I remember from playing with mediawiki, this is best accomplished with a mysqldump of the wikidb database - xhosa
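
A rough sketch of how the wikidb dump suggested above could be automated from cron. It only builds the mysqldump command line. The dated file name and the choice of /export/mysql as the target directory are assumptions for illustration, and credentials are assumed to come from a ~/.my.cnf file rather than the command line.

```python
from datetime import date
from pathlib import PurePosixPath


def mysqldump_command(database, out_dir, day):
    """Return (argv, output_path) for a dated SQL dump of one database.

    The naming scheme <database>-<YYYY-MM-DD>.sql is hypothetical.
    """
    out_file = PurePosixPath(out_dir) / f"{database}-{day.isoformat()}.sql"
    argv = [
        "mysqldump",
        # Gives a consistent snapshot for InnoDB tables only; MyISAM tables
        # would need locking instead.
        "--single-transaction",
        f"--result-file={out_file}",
        database,
    ]
    return argv, str(out_file)


if __name__ == "__main__":
    argv, out_file = mysqldump_command("wikidb", "/export/mysql", date(2006, 2, 24))
    # Printed rather than executed, so the sketch is safe to run anywhere.
    print(" ".join(argv))
```

The returned list could be handed to subprocess.run(argv, check=True) on a machine that has mysqldump installed. The same function would work for the other databases by passing a different name.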

Solution

Pert has started installing a new machine - Tennant (CNAME backup.tardis) which contains one 120GB disk in LVM.

This was going to run BackupPC, but that does not allow users to restore their own files and does not seem to play well with rsync. Therefore I've been testing a new solution, using rsync to create hard-link-based incremental backups. The script I've been writing is currently in /root/ on piper and so far has only been tested on the home directories.

Architecture

Backups will be pushed from the various servers (file, web, etc.) to backup.t using rsync, some scripts and cron. Rsync's '--link-dest' argument will be used to create hard-link-based incremental backups on the LVMed space in backup.t.
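
To make the '--link-dest' idea concrete, here is a minimal sketch of the rsync command a server might run each night. This is not the script in /root/ on piper. The function name, the /home source path and the use of yesterday's snapshot as the link target are all assumptions. Files unchanged since the previous snapshot are hard-linked on backup.t instead of being copied again, so each dated directory looks like a full backup but only changed files take new space.

```python
from datetime import date, timedelta


def rsync_push_command(source, host, area, day, previous_day=None):
    """Build the argv for pushing `source` to a dated snapshot on `host`.

    `area` is the directory under /export (e.g. "home" or "mail").
    """
    dest = f"{host}:/export/{area}/{day.isoformat()}/"
    argv = ["rsync", "-a"]  # -a: archive mode (permissions, times, symlinks)
    # If the job is re-run on the same day, drop files deleted since the
    # earlier run so the snapshot matches the source.
    argv.append("--delete")
    if previous_day is not None:
        # Evaluated on the receiving side: unchanged files become hard
        # links into the previous snapshot.
        argv.append(f"--link-dest=/export/{area}/{previous_day.isoformat()}")
    # Trailing slash on the source copies its contents, not the directory.
    argv += [source.rstrip("/") + "/", dest]
    return argv


if __name__ == "__main__":
    today = date(2006, 2, 24)
    cmd = rsync_push_command("/home", "backup.t", "home", today,
                             today - timedelta(days=1))
    print(" ".join(cmd))
```

On the first ever run there is no previous snapshot, so previous_day is left as None and rsync simply makes a full copy.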

As it would be tricky to monitor the backups on each of the servers using the backup service (file, web, etc.), monitoring will be done by a script on backup.t. It should mail daily reports to a specific person (WV me (pert)) who will notice and fix things if the mails stop. These mails should contain disk usage info from backup.t and warn about missed backups. This script will probably also be responsible for removing backups as they get old.
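
The monitoring script does not exist yet, so the following is only a sketch of the three jobs described above: spotting missed backups, reporting disk usage, and picking old snapshots to remove. All function names are made up. It assumes each area under the export root contains nothing but snapshot directories named by ISO date, and that "old" means "not among the newest N". Sending the mail and actually deleting anything are left out.

```python
import shutil
from datetime import date
from pathlib import Path


def missed_backups(export_root, areas, day):
    """Return the areas that have no snapshot directory for `day`."""
    root = Path(export_root)
    return [a for a in areas if not (root / a / day.isoformat()).is_dir()]


def expired_snapshots(export_root, area, keep):
    """Return snapshot directories older than the newest `keep` ones.

    ISO dates sort chronologically, so a plain sort orders them by age.
    """
    area_dir = Path(export_root) / area
    snapshots = sorted(p for p in area_dir.iterdir() if p.is_dir())
    return snapshots[:-keep] if keep > 0 else snapshots


def build_report(export_root, areas, day):
    """Return the text of the daily report as a single string."""
    usage = shutil.disk_usage(export_root)
    lines = [
        f"Backup report for {day.isoformat()}",
        f"Disk: {usage.used // 2**20} MiB used, {usage.free // 2**20} MiB free",
    ]
    for area in missed_backups(export_root, areas, day):
        lines.append(f"WARNING: no backup of {area} for {day.isoformat()}")
    return "\n".join(lines)
```

Run daily from cron on backup.t, something like build_report("/export", ["home", "mail"], date.today()) would produce the mail body, and the paths returned by expired_snapshots could be passed to shutil.rmtree once the retention period has been agreed.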

Backups of home directories and mail will be shared out to the shell server(s) using NFS so users can restore their own files.

Here's an example of the directory structure used on backup.t:

/export/
/export/home
/export/home/2006-02-23
/export/home/2006-02-23/newusers
...
/export/home/2006-02-24
/export/home/2006-02-24/newusers
...
/export/mail/2006-02-23
/export/mail/2006-02-23/pert
...
/export/mysql
...
/export/psql
...
...
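
Given a layout like the one above, a user on a shell server could find which snapshots hold a copy of a file before restoring it. The sketch below is one possible helper, not an existing tool. The function names and the example file are hypothetical. Because unchanged files are hard links to the same inode, comparing inode numbers between snapshots shows which dates hold a genuinely different version.

```python
from pathlib import Path


def snapshot_versions(export_root, area, relative_path):
    """Return (snapshot_name, path) pairs, oldest first, where the file exists.

    `relative_path` is relative to a snapshot directory, e.g. "pert/notes.txt".
    """
    versions = []
    for snapshot in sorted((Path(export_root) / area).iterdir()):
        candidate = snapshot / relative_path
        if candidate.exists():
            versions.append((snapshot.name, candidate))
    return versions


def distinct_versions(versions):
    """Drop snapshots whose copy is a hard link to the previous one."""
    distinct, last_id = [], None
    for name, path in versions:
        st = path.stat()
        file_id = (st.st_dev, st.st_ino)  # same inode means same content
        if file_id != last_id:
            distinct.append((name, path))
            last_id = file_id
    return distinct
```

For example, distinct_versions(snapshot_versions("/export", "home", "pert/notes.txt")) would list only the dates on which the file actually changed, and any of the returned paths can be copied back with cp.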