Backing up a cPanel hosting account

Since 2005 I have hosted this web page at Bluehost, a cPanel-based hosting company, first with Joomla and more recently with WordPress.

Bluehost lets you download a daily, weekly, and monthly backup from your cPanel control panel, but manual intervention is needed:

  1. Log in to the control panel
  2. Navigate to the backup page
  3. Perform the backup
  4. Download it to your local computer.

This is a manual, time-consuming task, and of course you must not forget to do it!

In this post I am going to show my automatic method to back up files and databases using:

  1. Crontab for automatic backups.
  2. Public/private keys for passwordless ssh connections.
  3. The rsync command for synchronizing directories between the remote and local servers. This reduces bandwidth, as files that have already been copied to the local server are not transferred again.
  4. Mysqldump for dumping the MySQL databases to a local file.
  5. SpiderOak for data deduplication and remote backup.
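To give an idea of the crontab part, a minimal entry could look like this (the script path, schedule, and log file here are assumptions; adjust them to your setup):

```shell
# Hypothetical crontab entry: run the backup every night at 03:30
# and append all output to a log file. Install it with `crontab -e`.
CRON_LINE='30 3 * * * $HOME/bin/BackupCpanel.sh >> $HOME/backup.log 2>&1'
echo "$CRON_LINE"
```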

Some previous knowledge is needed to understand how it works; the links below should help. :)
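As a quick refresher on the public/private key part, the one-time setup looks roughly like this (the key path below is a throwaway demo location, and the user/host in the commented commands are placeholders, not real values):

```shell
# Sketch of the one-time setup for passwordless ssh logins.
# A throwaway directory is used here so nothing in ~/.ssh is touched.
KEYDIR=$(mktemp -d)

# 1. Generate a key pair (no passphrase here for simplicity; with a
#    passphrase, keychain/ssh-agent will cache it for you instead).
ssh-keygen -t rsa -b 4096 -f "$KEYDIR/id_rsa_demo" -N "" -q

# 2. Copy the public key to the server (asks for your password once):
#    ssh-copy-id -i "$KEYDIR/id_rsa_demo.pub" YourSSHUser@domain.tld

# 3. Load the key into the agent via keychain:
#    keychain "$KEYDIR/id_rsa_demo"
ls "$KEYDIR"
```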

Let’s have a look at the script:

#!/bin/bash

# BackupCpanel.sh  Author:juan@elsotanillo.net
# http://www.elsotanillo.net/2011/09/backing-up-a-cpanel-hosting-account/
# Disclaimer: It works OK for my configuration. Please check carefully before using it in production environments

# Uncomment the next line for debugging
#set -x

# Defining some variables
MAILTO="email@domain.tld"
USERDB="YourUserDB"
PASSDB="YourDBPassword"
SSHUSER="YourSSHUser"
DOMAIN="domain.tld" # This is your main domain
REMOTE_PATH_BACKUP="/homeX/XXXXXXX"
LOCAL_PATH_BACKUP="$HOME/$DOMAIN"
LOCAL_MYSQL_PATH_BACKUP="$LOCAL_PATH_BACKUP/BackupDDBB/" # DB backup folder
mkdir -p "$LOCAL_MYSQL_PATH_BACKUP" # create it if it does not exist yet
DB_NAME_BACKUP="BackupDDBB_$(date +%Y-%m-%d).sql"

## Check that ssh-agent is running and already has valid identities loaded
ssh-add -l
if [ $? = 1 ]; then
        echo "Please add private key identities to the authentication agent and run it again" | mail -s "error in backup script" "$MAILTO"
        exit 1
fi ## no identities were loaded, so the script finishes here, as private/public ssh keys are needed for remote logon

# Loading ssh-agent variables for private/public passwordless logon
/usr/bin/keychain
source "$HOME/.keychain/${HOSTNAME}-sh"

### Remote mysqldump to a local file
ssh $SSHUSER@$DOMAIN "mysqldump -u$USERDB -p$PASSDB --all-databases" > "$LOCAL_MYSQL_PATH_BACKUP$DB_NAME_BACKUP"

### Rsync between remote and local server
### We don't want to back up cache, session, mail and other directories, so I "--exclude" them from the rsync command
###

rsync -avr --exclude 'mail/' \
           --exclude '.cpanel' \
           --exclude 'tmp' \
           --exclude 'BackupDDBB/' \
$SSHUSER@$DOMAIN:$REMOTE_PATH_BACKUP $LOCAL_PATH_BACKUP

# Run SpiderOak for deduplication and folder synchronization
SpiderOak --batchmode

And now let’s explain how it works:

  1. Some variables must be defined according to your own configuration: MAILTO, USERDB, etc. These depend on your login name, ssh user, and so on.
  2. The script checks whether the ssh-agent has valid identities already loaded. If not, an email is sent to MAILTO reporting the error and the script exits with return code 1.
  3. It runs keychain and reads some variables from the $HOME/.keychain/${HOSTNAME}-sh file. Please read the Passwordless connections via OpenSSH using public key authentication, keychain and AgentForward web page for more information.
  4. A backup of all your MySQL databases is made with mysqldump into the LOCAL_MYSQL_PATH_BACKUP local folder. The database dump file is not gzipped, as compression makes the deduplication process useless (How do I get the best backup deduplication from compressed files?). Note: I don’t have any PostgreSQL databases. If you have any, you will have to deal with them yourself, but the same procedure can be applied with some modifications.
  5. At this point passwordless ssh connections can be made between your computer and the remote server, so we can proceed with the raw data in your remote /home/login directory. The rsync command is launched, excluding some directories that contain no interesting data. When rsync finishes, all the files have been copied to your local computer.
  6. Now it is time to run the SpiderOak command (SpiderOak needs to be installed and configured beforehand). This will copy (using data deduplication) your predefined directories to the SpiderOak cloud. This way you will have at least two extra copies of your data: one on your local PC and one in the SpiderOak cloud.
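A side benefit of the --all-databases dump from step 4: you can pull a single database back out of it with standard text tools, since mysqldump separates databases with "-- Current Database:" marker comments. A sketch (the database names and file paths here are made up for the demo):

```shell
# Build a tiny fake --all-databases dump to demonstrate the idea
# (real dumps use the same "-- Current Database:" section markers).
cat > /tmp/demo_dump.sql <<'EOF'
-- Current Database: `mydb`
CREATE DATABASE `mydb`;
USE `mydb`;
CREATE TABLE posts (id INT);

-- Current Database: `otherdb`
CREATE DATABASE `otherdb`;
USE `otherdb`;
CREATE TABLE comments (id INT);
EOF

# Print from the `mydb` marker up to (and including) the next marker;
# the trailing marker line is only an SQL comment, so it is harmless.
sed -n '/^-- Current Database: `mydb`/,/^-- Current Database: `/p' \
    /tmp/demo_dump.sql > /tmp/mydb_only.sql

# A restore would then be something like: mysql -u user -p < /tmp/mydb_only.sql
```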

Installing SpiderOak on your computer (the Debian way)

  1. Create your SpiderOak account. They provide a 2 GB lifetime free account.
  2. Install the client for your OS (they have clients for many operating systems: Windows, Linux, Mac OS). For Debian, add the following line to your sources.list: deb http://apt.spideroak.com/debian/ stable non-free
  3. apt-get update && apt-get install SpiderOak
  4. Run SpiderOak from a console to start the configuration GUI and adjust it to your needs: define the directories you want to back up, share, or sync with other computers.

FAQs:

I don’t see any deduplication benefits here. Can you give an example?

Imagine you have 3 Joomla, 4 Mediawiki and 6 WordPress installations in your Bluehost account.

The benefits show when the SpiderOak software copies the data between your server and its cloud network: the installation files will be identical (if you are using the same versions), and even different versions will look similar. Data is transferred once for identical files and only partially for similar files. This saves you a lot of bandwidth and space on your SpiderOak account.

SpiderOak applies deduplication across your whole account, which is very advantageous.

Today my SpiderOak account uses 21.515 GB.

Size of all stored files (without compression or deduplication): 107.638 GB.
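Those two figures from my own account work out to a roughly 5x saving, which you can check with a one-liner:

```shell
# Ratio of raw stored data to actual account usage (numbers from above).
awk 'BEGIN { printf "%.1fx\n", 107.638 / 21.515 }'  # → 5.0x
```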

SpiderOak also keeps multiple versions of files:

“If a file is ever damaged or deleted or accidentally overwritten, you will always have the option of downloading an earlier undamaged version.”

I am a little bit concerned/paranoid about my backups. How can I make them more secure?

If by more security you mean more copies of your data in distant places, then you can use SpiderOak’s Sync feature.

In a standard configuration you can have several directories synchronized on several servers across multiple locations using the standard SpiderOak client. Learn how.


2 thoughts on “Backing up a cPanel hosting account”

  1. The script must reside on your local computer. That’s the reason you need ssh-agent + keychain to connect to your Bluehost server remotely without a password.
