Subversion backup

26 March 2009 15 Comments

Maintaining a Subversion repository can be hard work if you don’t have the right tools. Even today, with technology advancing daily, we are still vulnerable to “fate”: crashed hard drives, corrupt RAM, network outages, power failures and other “evil” problems. Even a conscientious administrator will not escape Murphy’s Law, unless you are a lucky bastard (I’m a lucky bastard administrator from hell, but you are not). In this article I will try to show you how to create good backups of your Subversion repositories.

We can create two types of backups: incremental and full. It is up to you to decide which files need what kind of backup. Of course, there is no need to worry about emails, files, photos or anything else you already keep in online storage. Important documents, family photos, work stuff and the like might need more heavy-duty backups.

There are two ways to create incremental backups: Subversion hooks, or a job scheduled with crontab. If you are paranoid, you can use a hook to create an incremental backup at every commit (do you really want that?). Depending on your system and your repository size, a backup at every commit can be very expensive, and if your hardware resources are not so “advanced” I don’t recommend it.

Doing an incremental backup at every commit (for paranoid people)

Subversion has a beautiful feature named hooks. Hooks are scripts triggered by repository events such as:

start-commit, pre-commit and post-commit
pre-revprop-change and post-revprop-change
pre-lock, post-lock, pre-unlock and post-unlock

and possibly others, depending on your Subversion version.

To take an incremental backup after each commit, add a post-commit script to the hooks folder of every repository. For example, if your repositories are in /home/svn/, the hooks folders are /home/svn/[repository_name]/hooks/. The Subversion server calls post-commit every time a commit completes, passing it two parameters: the path to the repository and the number of the new revision.

post-commit script

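#!/bin/bash

BACKUP_DIR="/var/backups/svn"
REPO_PATH="$1"                   # full path to the repository, passed by Subversion
REV="$2"                         # revision created by this commit
REPO=$(basename "$REPO_PATH")    # repository name, used for the backup folder

# Create the backup folder for this repository if it does not exist yet.
mkdir -p "$BACKUP_DIR/$REPO"

# Dump only the revision that was just committed.
svn-backup-dumps -r "$REV" "$REPO_PATH" "$BACKUP_DIR/$REPO"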

Copy the post-commit script into /home/svn/*/hooks/, make it executable, and create /var/backups/svn/.

Now you are ready … just commit a file to the repository.
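
The copy step can also be scripted. This is a minimal sketch, assuming the repositories sit directly under /home/svn/ as in the example above and that the hook was saved as ./post-commit; the function name install_hook is made up for illustration. Subversion only runs hooks that are executable, so the sketch sets the executable bit as well.

```bash
#!/bin/bash

# install_hook HOOK_FILE REPOS_ROOT
# Copies HOOK_FILE into the hooks/ folder of every repository found
# directly under REPOS_ROOT and makes the copy executable.
install_hook() {
    local hook=$1 root=$2 hooks_dir
    for hooks_dir in "$root"/*/hooks/; do
        [ -d "$hooks_dir" ] || continue      # the glob matched nothing
        cp "$hook" "${hooks_dir}post-commit"
        chmod +x "${hooks_dir}post-commit"
    done
}

# Example (paths are the ones used in this article):
#   mkdir -p /var/backups/svn
#   install_hook ./post-commit /home/svn
```

Folders without a hooks/ subfolder are skipped, so stray directories next to the repositories are left untouched.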

Doing incremental backups every night.

This configuration is simple. We just need to add a script to /etc/cron.daily/ and a backup will be taken early every morning (shortly after 6 on my Debian system).
The script searches the repository folder for every repository we have configured, then backs each one up.
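
One detail worth checking on Debian: cron runs the files in /etc/cron.daily/ through run-parts, which by default skips any file whose name contains characters other than ASCII letters, digits, underscores and hyphens. A file installed with a .sh extension is therefore silently ignored. The sketch below only illustrates that naming rule; the function name is hypothetical.

```bash
#!/bin/bash

# is_run_parts_name NAME
# Succeeds if NAME contains only the characters run-parts accepts by default
# (ASCII letters, digits, underscores and hyphens).
is_run_parts_name() {
    [[ $1 =~ ^[A-Za-z0-9_-]+$ ]]
}

for name in inc-backup.sh inc-backup; do
    if is_run_parts_name "$name"; then
        echo "$name: will be run"
    else
        echo "$name: will be skipped"
    fi
done
```

In practice this means copying the script to /etc/cron.daily/inc-backup (no extension) and making it executable. Running run-parts --test /etc/cron.daily lists the scripts cron would actually execute.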

inc-backup.sh

#!/bin/bash

BACKUP_DIR="/backup/repository-inc"
REPOSITORY_DIR="/home/svn/repository"

# Search the repository folder to find all the repository names.
ls -Al --time-style=long-iso "$REPOSITORY_DIR" | grep '^d' | awk '{print $8}' | while read -r line
do
    # Create the backup folder for this repository if we don't have it.
    if [ ! -d "$BACKUP_DIR/$line" ]; then
        mkdir -p "$BACKUP_DIR/$line"
    fi

    # Take the backup (10 revisions in 1 dump).
    svn-backup-dumps --deltas -z -c 10 "$REPOSITORY_DIR/$line" "$BACKUP_DIR/$line"
done

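On each run the script only has to dump the revisions that are not in the backup folder yet. The --deltas option is passed on to svnadmin dump, so file changes are stored as deltas against the previous version, which keeps the dumps small. Each resulting file is compressed with gzip (-z) and contains at most 10 Subversion revisions (-c 10). I think it is important to keep the dumps fine-grained rather than have 1000 revisions in 1 file, but this is your decision.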
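
Parsing the output of ls depends on its column layout, which can differ between systems, and it breaks on folder names that contain spaces. As an alternative, a shell glob can list the repository folders directly. This is a minimal sketch rather than the script used above: list_repos is a made-up helper name, and it assumes every repository is a direct, non-hidden subfolder of the repository folder and contains the format file and db folder that svnadmin create writes.

```bash
#!/bin/bash

# list_repos REPOS_ROOT
# Prints the name of every subfolder of REPOS_ROOT that looks like a
# Subversion repository (it has a "format" file and a "db" folder).
list_repos() {
    local root=$1 path
    for path in "$root"/*/; do
        path=${path%/}
        if [ -f "$path/format" ] && [ -d "$path/db" ]; then
            printf '%s\n' "${path##*/}"
        fi
    done
}

# Example: feed the names to the same loop body as in inc-backup.sh.
#   list_repos /home/svn/repository | while IFS= read -r line; do
#       svn-backup-dumps --deltas -z -c 10 "/home/svn/repository/$line" "/backup/repository-inc/$line"
#   done
```

Checking for the format file also keeps ordinary folders that happen to sit next to the repositories out of the backup run.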

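To recover from an incremental backup, load the dumps into a new repository, oldest first. For a single uncompressed dump, just run

svnadmin load /path/to/reponame < /var/backups/svn/repo/repo1.dump

Dumps written with -z have to be decompressed first (for example with gunzip -c) and piped into svnadmin load.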
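
When a repository has been backed up into many dump files, the load has to be repeated for each file in revision order. Here is a minimal sketch of that loop. It assumes the files carry zero-padded revision numbers in their names (the pattern reported in the comments below, such as ACT.000000-000803.svndmp.gz), so that alphabetical order equals revision order, and that the revision ranges do not overlap. The function names and the example paths are placeholders.

```bash
#!/bin/bash

# dumps_in_order DUMP_DIR
# Prints the dump files in DUMP_DIR, one per line, oldest revisions first.
# Relies on zero-padded revision numbers: the sorted order of the glob
# is then the same as the revision order.
dumps_in_order() {
    local dir=$1 file
    for file in "$dir"/*.svndmp*; do
        if [ -f "$file" ]; then
            printf '%s\n' "$file"
        fi
    done
}

# restore_repo DUMP_DIR NEW_REPO
# Creates the empty repository NEW_REPO and loads every dump into it in order.
restore_repo() {
    local dir=$1 repo=$2 file
    svnadmin create "$repo" || return 1
    dumps_in_order "$dir" | while IFS= read -r file; do
        case $file in
            *.gz) gunzip -c "$file" ;;    # compressed with -z
            *)    cat "$file" ;;          # plain dump
        esac | svnadmin load --quiet "$repo" || exit 1
    done
}

# Example:
#   restore_repo /backup/repository-inc/myrepo /home/svn/repository/myrepo-restored
```

Loading into a fresh repository and only then swapping it into place avoids touching the damaged one until the restore has finished.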

Doing full backups

The story is the same as in the section above; only the script is different and the cron folder is /etc/cron.weekly/. For full backups we will use the svn-hot-backup script that ships with Subversion. A plain copy of a repository that is in use can produce a faulty backup unless you cut off all user access (which is not such a good thing to do); svn-hot-backup takes a consistent copy while the repository stays online. The resulting backup is a complete copy of the Subversion repository that can be deployed (just copy it) over a crashed one.

full-backup.sh

#!/bin/bash

BACKUP_DIR="/backup/repository"
REPOSITORY_DIR="/home/svn/repository"

# Search the repository folder to find all the repository names.
ls -Al --time-style=long-iso "$REPOSITORY_DIR" | grep '^d' | awk '{print $8}' | while read -r line
do
    if [ ! -d "$BACKUP_DIR/$line" ]; then
        mkdir -p "$BACKUP_DIR/$line"
    fi

    # Get the youngest revision number.
    REVISION=$(awk '{print $1}' "$REPOSITORY_DIR/$line/db/current")

    # Check whether a hot backup of this revision already exists.
    if [ -d "$BACKUP_DIR/$line/$line-$REVISION" ]; then
        echo "Skipping backup! Backup already exists: $line revision $REVISION"
    else
        # Archive the last backup before replacing it.
        tar -czf "$BACKUP_DIR/$line-last.tar.gz" "$BACKUP_DIR/$line/"

        # Dangerous :) Only delete when both variables are set.
        # The glob must stay outside the quotes, otherwise nothing is deleted.
        if [ -n "$BACKUP_DIR" ] && [ -n "$line" ]; then
            rm -rf "$BACKUP_DIR/$line"/*
        fi

        echo "Doing backup for $line"
        /usr/bin/svn-hot-backup "$REPOSITORY_DIR/$line" "$BACKUP_DIR/$line"
    fi
done

The script will also archive the latest full backup before taking a new one.
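
Before relying on a full backup it is worth checking that it can be read. The following is a minimal sketch and not part of the script above: it picks the newest backup folder for a repository, assuming the name-revision folder naming that the script checks for, and runs svnadmin verify on it. The helper names newest_backup and verify_backup are made up for illustration.

```bash
#!/bin/bash

# newest_backup BACKUP_DIR NAME
# Prints the folder BACKUP_DIR/NAME-<revision> with the highest revision.
# Returns non-zero if there is no such folder.
newest_backup() {
    local dir=$1 name=$2 best="" best_rev=-1 path rev
    for path in "$dir/$name"-*/; do
        path=${path%/}
        [ -d "$path" ] || continue
        rev=${path##*/"$name"-}
        case $rev in
            ''|*[!0-9]*) continue ;;    # not NAME-<number>, skip it
        esac
        if [ "$rev" -gt "$best_rev" ]; then
            best_rev=$rev
            best=$path
        fi
    done
    [ -n "$best" ] && printf '%s\n' "$best"
}

# verify_backup BACKUP_DIR NAME
# Runs "svnadmin verify" on the newest backup of repository NAME.
verify_backup() {
    local backup
    backup=$(newest_backup "$1" "$2") || { echo "No backup found for $2" >&2; return 1; }
    svnadmin verify --quiet "$backup"
}

# Example, using the layout of full-backup.sh:
#   verify_backup /backup/repository/myrepo myrepo
```

The revisions are compared as numbers, so a backup of revision 10 is correctly treated as newer than one of revision 9.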

My recommendation is to archive all the files over the network to an NFS server rather than keeping the backups on the same server. Also, don’t forget to take full backups even if you already have incremental ones, because the best solution is a diversified one.

Good luck.


15 Comments »

  • xxxx said:

    Just a little remark:
    ls -lA

    My ubuntu machine(more or less default setup)
    ls -lA gives different output if the script is invoked from crontab or shell.

    The print $8 refers to wrong column in the ls command.
    Therefore put in the parameter --time-style=XXXX for ls command.

  • admin (author) said:

    Thanks for your remark. It’s true. I think the default “ls” parameters are not the same on different distributions. Debian is using
    “--time-style=long-iso” by default.

    Regards

  • millibyte said:

    Thanks for providing this info – just what I needed!

  • millibyte said:

    And here’s my little contribution in the form of two fixes for the full-backup.sh script:

    1) Change the ls -Al line to use $REPOSITORY_DIR
    2) The line
    if [ $BACKUP_DIR !="" ]; then
    causes an error (at least on my Ubuntu system). A better line would be:
    if [ -n "$BACKUP_DIR" ]; then

    Cheers

  • admin (author) said:

    Yes, true! I discovered this problem later and I changed the code with
    [code]
    if [ "$BACKUP_DIR" !="" ]; then
    [/code]

    But your solution is better.

    Thanks

    Regards

  • How to get an SVN backup cron working? said:

    […] trying to get the script here working on my Ubuntu Server to backup the SVN repository via a […]

  • Harold said:

    Hey guys, thanks for the information you shared herein. I was actually looking for ways to manually do full backup.

    @millibyte: thank you, you have a brilliant mind.

    @admin: thanks for this post, it’s helping me a lot.

    Harold

  • chiliemae said:

    Brilliant! That’s the right way to go. Nice blog you know what? I love the information you had, it really help me a lot. You made it look easy and you are on the right track. I’m happy to see you working on that. Now, I can figure it out. You haven’t missed a thing.
    Good work. You’re doing a right job,

  • David said:

    Awesome tips! It’s really important for me to backup my files since I have numerous important research files that I can’t live without. I will surely try your recommendations and see how it goes.

  • Sean said:

    Second script doesn’t work as on this site.

    Changed it to :

    #!/bin/bash

    BACKUP_DIR="/Whatever/SVNBackupsWeekly"
    REPOSITORY_DIR="/var/www/svn"

    ls -Al --time-style=long-iso $REPOSITORY_DIR | grep '^d' | awk '{print $8}' | while read line
    do
        if [ ! -d $BACKUP_DIR"/"$line ]; then
            mkdir -p -v $BACKUP_DIR"/"$line
        fi
        REVISION=`cat $REPOSITORY_DIR"/"$line"/db/current" | awk '{print $1}'`
        if [ -d $BACKUP_DIR"/"$line"/"$line"-"$REVISION ]; then
            echo "Skipping Backup as Backup Already Exists : "$line" revision "$REVISION
        else
            rm -rf $BACKUP_DIR"/"$line/*
            /usr/lib/subversion/tools/backup/hot-backup.py $REPOSITORY_DIR"/"$line $BACKUP_DIR"/"$line
        fi

    done

    The ” rm ” would not have worked ( syntax ) and would have always deleted backups, and

    The ” ls -Al ” command had /home/svn/repository/ hardcoded as opposed to $REPOSITORY_DIR

    Thanks a lot for the scripts though – Am fairly new to scripting and this page helped me get up to speed !

  • admin (author) said:

    Thanks for the suggestions. On what distribution did you try it?

    Regards

  • Sean said:

    Hi – Tried on RHEL 5.5.

    Stuck at something else though ( not script related ).

    I recently migrated an existing Subversion installation to a new infrastructure and implemented automated ( daily ) backups as part of the migration.

    I ran "svn-backup-dumps --deltas -z -c 9999" on the day of the migration to create full backups, then scheduled a daily backup as "svn-backup-dumps --deltas -z -c 1" – The idea being that only new commits would be backed up with the first backup as base.

    What actually happened was that a backup file was created for every single revision ( new & existing ) so I guess the “-c” parameter is used to calculate whether a backup for a revision exists or not.

    Do you know of any way around this and do what I intended, i.e. a full backup with new commits only then incrementally added to that ?

    This is my first attempt at automated Subversion backup so excuse the bit of ignorance – and thanks again for your scripts !

    Sean

  • admin (author) said:

    Probably you don’t have 9999 revisions and the application is getting 1 instead. Look at how many revisions are in that repository and use a smaller value.

    Regards

  • Sean said:

    Know I don’t have 9999 revisions – Most in a single repository is just under 3000. I want something like this :

    Initial backup file :
    ACT.000000-000803.svndmp.gz

    Subsequent commits :
    ACT.000804-000804.svndmp.gz
    ACT.000805-000805.svndmp.gz
    ACT.000806-000806.svndmp.gz
    .
    .
    .
    So I did initial backups using -c 9999, then changed the backup to -c 1 – I was hoping that would leave me with files like above.

    First run with -c 1 caused :
    ACT.000000-000000.svndmp.gz
    ACT.000001-000001.svndmp.gz – And so on for all revisions – ie initial backup file was completely ignored – I’d say because -c parameter is used to determine if backup needs to be done or not.

    Issue I have with using, for instance, -c 20 is that ( I think ) going to end up with something like this :

    ACT.000000-000007.svndmp.gz
    ACT.000000-000011.svndmp.gz
    ACT.000000-000015.svndmp.gz as commits are done – i.e. revision backups are duplicated, which I don’t want.

    Reasons for -c 9999 followed by -c 1 is that a) No duplicated backups, & b) minimal incremental backup every morning.

    Only alternative I see at the moment is to start with -c 1 but that leaves loads of files for each repo – Might have no alternative though.

    Thanks again

    Sean

