Immediate goals
- mirror each of our servers' backups on each of our servers;
- hand out customized Debian-Live CDs to our friends (including lost Windows-using souls), and instruct them to boot from these CDs on a regular basis, so that their data are at least moderately safe.
Operation
Two kinds of operations occur:
- the hosts which publish backup service requests have to create the corresponding backups and upload them somewhere where backup service is provided;
- those which provide backup service must be ready to accept new backups or updates, and spread those to their fellow service providers.
Note that creating and uploading backups happens at the same time, so that no disk space will be used on hosts which aren't configured to keep local copies. On the other hand, if a given host is configured to serve its own backups, "uploading" will be a simple local copy of the relevant files, and actual uploading will happen as the "spread newly updated" operation.
Live CD
The Live CD would first scan the LDAP directory to identify the host it's running on (for instance using the hardware address of its network interfaces). A new entry can be added automatically if no object is found, either by using credentials present on the Live CD (if we trust the people we hand them to enough), or by asking the user for them.
The local filesystem would be mounted under /media, and backups would be created and uploaded as specified by the configuration from the directory. The configured backup updates would be retrieved as well if the host is configured to service any backups.
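As a rough illustration of the host identification step, the sketch below builds an LDAP search filter from a list of hardware addresses. The attribute name macAddress (the one used by the ieee802Device class of RFC 2307) is an assumption, since the directory schema is not specified here, and the function names are hypothetical. No LDAP library is used: the code only produces the filter string.

```python
def format_mac(value: int) -> str:
    """Render a 48-bit integer as a colon-separated hardware address."""
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves in filter values."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def host_filter(macs: list[str], attribute: str = "macAddress") -> str:
    """Build a filter matching any entry that owns one of the addresses.

    The attribute name is an assumption; adapt it to the real schema.
    """
    if not macs:
        raise ValueError("at least one hardware address is required")
    clauses = [
        f"({attribute}={escape_filter_value(mac.lower())})" for mac in macs
    ]
    if len(clauses) == 1:
        return clauses[0]
    # Several interfaces: any one of them is enough to identify the host.
    return "(|" + "".join(clauses) + ")"
```

On a real system the address list could come from uuid.getnode() (one address, passed through format_mac) or from reading the network interfaces. If the search returns nothing, the Live CD would fall back to creating a new entry as described above.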
Program feature listing
Backup "find": a local find is launched with search restrictions to get the full list of files to back up. An rsync is then run in non-recursive mode to fetch only this list of files (whether this is possible remains to be checked).
Backup "rsync": the configuration lists the directories to fetch (with excluded files).
Backup "tar+gpg": the targeted files or directories are fetched (how is still undecided) and encrypted with gpg (on the client or the server side?) using one or more gpg public keys.
Backup "music": similar to the others, but used to store mp3, divx and the like; this kind of data is not really critical. These files will be marked as deletable if the backup service needs disk space.
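On the open question of the "find" backup: rsync can read an explicit file list through its --files-from option, and when that option is used the recursion normally implied by -a is not applied. The sketch below shows one possible shape for it. The function names and the suffix-based search restriction are illustrative assumptions, not part of the design.

```python
import os
import subprocess


def list_files(root: str, suffixes: tuple[str, ...]) -> list[str]:
    """Return the files under root whose names end with one of suffixes.

    Paths are relative to root, which is what --files-from expects.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, root))
    return sorted(found)


def rsync_files_from_command(root: str, dest: str) -> list[str]:
    """Build an rsync invocation that reads its file list from stdin."""
    # --files-from=- transfers only the listed paths, relative to root;
    # --from0 declares the list NUL-separated, safe for odd file names.
    return ["rsync", "-a", "--from0", "--files-from=-", root, dest]


def run_find_backup(root: str, suffixes: tuple[str, ...], dest: str) -> None:
    """Select the files locally, then let rsync copy exactly that list."""
    payload = b"\0".join(os.fsencode(p) for p in list_files(root, suffixes))
    subprocess.run(rsync_files_from_command(root, dest), input=payload, check=True)
```

A call such as run_find_backup("/media/home", (".odt", ".txt"), "backup.example.com:/srv/backup/docs") would then copy only the matching files; the host name and paths in that call are placeholders.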
Configuration
The creation and distribution of a backup is requested by creating an entry with the patnetBackup object class below the entry of the host which will create the backup. The backups which will be mirrored by a given host are configured with the auxiliary object class patnetBackupService.
On the publishing and mirroring systems, the backups are named after their distinguished name under /srv/backup, with a file extension depending on their patnetBackupKind attribute. For instance, a tarball backup named "important" from the host foo.example.com might be named:
/srv/backup/cn=important,cn=foo,ou=hosts,dc=example,dc=com.tar.bz2
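The naming rule can be sketched as a small function. Only the .tar.bz2 extension for tarball backups appears in this document; the empty extension used for rsync backups below is a guess made for the sake of the example, and backup_path is a hypothetical name.

```python
import posixpath

BACKUP_ROOT = "/srv/backup"

# Only the tarball extension is given in the text. The rsync entry
# (no extension, since the backup is a directory) is an assumption.
EXTENSIONS = {"tarball": ".tar.bz2", "rsync": ""}


def backup_path(dn: str, kind: str = "tarball") -> str:
    """Return where a backup with the given DN and kind is stored."""
    if kind not in EXTENSIONS:
        raise ValueError(f"unknown backup kind: {kind!r}")
    # The DN becomes a single file name, so it must not contain a path
    # separator or a NUL byte.
    if "/" in dn or "\0" in dn:
        raise ValueError("DN cannot be used as a file name")
    return posixpath.join(BACKUP_ROOT, dn + EXTENSIONS[kind])
```

With the example above, backup_path("cn=important,cn=foo,ou=hosts,dc=example,dc=com") gives the tarball path shown.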
Service request
patnetBackup objects must contain the following attributes:
- cn identifies the backup within the context of the publishing host;
- patnetBackupPath points to a directory on the publishing host which should be scanned for files to back up. The behavior when multiple patnetBackupPath attributes are specified is ill-defined and this should not be done for now.
They may also contain the following attributes:
- patnetBackupKind specifies the kind of backup which should be made, and can be one of the following strings:
  - tarball (the default) will request the creation of a tarball containing the backed up files; the tarball will either be redirected (> ${destpath}) when "uploaded" to the local system, or piped (| ssh ${desthost} /bin/sh -c 'cat > ${destpath}') when uploaded to a remote system.
  - rsync will use (recursive) rsync to move a whole directory to its destination (either local or remote).
- patnetBackupTarExclude values will be passed to tar as --exclude options in an unspecified order.
- patnetBackupRsyncInclude values will be passed to rsync as --include options in an unspecified order.
- patnetBackupRsyncExclude values will be passed to rsync as --exclude options, after the --include ones, in an otherwise unspecified order.
- patnetBackupService specifies a search base when looking for service providers for the backup; the default is to search from the root of the configuration tree.
- patnetBackupUploadTo specifies a backup service to upload the backup to when it is initially created; the default is to choose a service provider at random, or use the local host if it is one of them.
- patnetSSHPubKey indicates the ssh key that will be used to (initially) upload the backup, which should be authorized on our backup service providers.
- patnetBackupPriority
A hierarchical structure could be used, where leaves would be backups and internal nodes would specify configuration options for all their children.
The presence of such a service request in the configuration subtree of a given host both instructs the host in question to create and push (when applicable) such backups, and serves as a service request to the hosts listed as backup mirrors.
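To make the attribute semantics concrete, here is a minimal sketch of how a service request could be turned into commands. It assumes GNU tar with bzip2 compression (suggested by the .tar.bz2 extension) and, for the remote case, sends a plain cat command through ssh rather than the exact /bin/sh -c form quoted above. The function names and argument layout are hypothetical.

```python
import shlex


def tar_pipeline(path, excludes, destpath, desthost=None):
    """Return a shell command line that creates and stores the tarball.

    The archive is written to stdout so that creation and upload happen
    together and nothing is stored on the publishing host.
    """
    tar = ["tar", "-cjf", "-", "-C", path]
    # patnetBackupTarExclude values, in whatever order they arrive.
    tar += [f"--exclude={pattern}" for pattern in excludes]
    tar.append(".")
    create = " ".join(shlex.quote(arg) for arg in tar)
    if desthost is None:
        # "Upload" to the local system: a plain redirection.
        return f"{create} > {shlex.quote(destpath)}"
    remote = f"cat > {shlex.quote(destpath)}"
    return f"{create} | ssh {shlex.quote(desthost)} {shlex.quote(remote)}"


def rsync_mirror_command(path, includes, excludes, destpath, desthost=None):
    """Return the argument list for an rsync-kind backup."""
    args = ["rsync", "-a"]
    # --include options come first, then --exclude, as specified above.
    args += [f"--include={pattern}" for pattern in includes]
    args += [f"--exclude={pattern}" for pattern in excludes]
    source = path.rstrip("/") + "/"  # copy the directory's contents
    target = destpath if desthost is None else f"{desthost}:{destpath}"
    return args + [source, target]
```

Passing desthost=None corresponds to a host that serves its own backups, in which case the later "spread" step performs the real upload.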
Service declaration
For each host willing to host some backups:
- (groups of) backups we are willing to mirror
The service declaration entry could double as an HTTP service declaration, so that (encrypted) backups are made available for retrieval over HTTP.
Details
Groups of hosts and backups would probably be specified by an LDAP search (i.e. so you can indicate you're willing to mirror backups for dc=patnet,dc=eu,dc=org and its subtree). LDAP aliases can be used for maximum flexibility / automatic load distribution.
Clients would push / servers would pull backups which are included in both the service request and declaration.
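The matching between service requests and declarations could look like the sketch below, which treats a declaration as a list of search bases and keeps the backups whose DN falls under one of them. The DN handling is deliberately naive: it assumes no escaped commas and compares case-insensitively, which a real implementation would leave to the LDAP server. All names are hypothetical.

```python
def split_dn(dn: str) -> list[str]:
    """Split a DN into normalized RDNs (naive: no escaped commas)."""
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def in_subtree(dn: str, base: str) -> bool:
    """Return True if dn is the base entry itself or lies below it."""
    dn_parts, base_parts = split_dn(dn), split_dn(base)
    if len(base_parts) > len(dn_parts):
        return False
    # A DN is in the subtree when its rightmost RDNs equal the base.
    return dn_parts[len(dn_parts) - len(base_parts):] == base_parts


def backups_to_mirror(requested: list[str], declared_bases: list[str]) -> list[str]:
    """Keep the requested backups covered by at least one declared base."""
    return [
        dn for dn in requested
        if any(in_subtree(dn, base) for base in declared_bases)
    ]
```

An empty base matches every DN, which corresponds to a declaration covering the whole tree.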
Roadmap
Version 0.1
- the immediate goals are implemented as an ugly hack.
Version 0.2
- Secure transfer and access: public-key cryptography is used for authentication and privacy of the backup data; participants get some trusted path between each other for getting public keys, and a bunch of them is stored on the CDs handed to the lost souls (as well as their private key).
Version 0.3
Version 0.4
- Use priority of backed-up files. For example: the files of the "music" backup have a low priority.
Version 0.5
Version 0.6
Version 0.7
Version 0.8
- Monitoring and email reminder: some servers start sending emails when they receive no backup updates from configurable participants in a configurable amount of time. This is intended both as a "backup reminder" and so that backup infrastructure/network problems don't go unnoticed.
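The reminder check itself could be as simple as the following sketch, which only decides who is overdue and leaves the sending of email aside. The data layout (two dictionaries keyed by participant name) is an assumption made for the example.

```python
from datetime import datetime, timedelta


def overdue(last_update: dict, max_age: dict, now: datetime) -> list[str]:
    """Return the participants whose last backup update is too old.

    last_update maps a participant to the time of its latest update,
    max_age maps a participant to its configured maximum delay.
    """
    late = []
    for participant, limit in max_age.items():
        seen = last_update.get(participant)
        # A participant that has never sent anything is overdue too.
        if seen is None or now - seen > limit:
            late.append(participant)
    return sorted(late)
```

Each name returned would then get a reminder, and a sudden burst of names would hint at an infrastructure or network problem rather than forgetful users.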
Version 0.9
- Security considerations: the codebase is sufficiently stable for further versions to remain moderately backwards-compatible. We can disable the "code update" functions on the CDs we hand out so that PKC cannot be trivially circumvented by the trusted servers.