kvz.io

Synchronize Files With rsync

Author: Kevin van Zonneveld (Twitter: @kvz)

Synchronizing files from one server to another is quite awesome. You can use it for backups, for keeping web servers in sync, and much more. It's fast and it doesn't take up as much bandwidth as normal copying would. And the best thing is, it can be done with only 1 command. Welcome to the wonderful world of rsync.

Installing rsync

On most modern Linux distributions you will find rsync comes preinstalled. If that's not the case, just install it with your package manager. On Ubuntu this would look like:

$ aptitude -y install rsync

Done!

Simple - One Command

Let's copy our local /home/kevin/source to /home/kevin/destination which resides on the server: server.example.com:

$ rsync -az --progress --size-only /home/kevin/source/* server.example.com:/home/kevin/destination/

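Explained:

  • -a archive mode: recurses into directories and preserves attributes such as permissions, ownership, and timestamps
  • -z compress, saves bandwidth but is harder on your CPU, so use it for slow/expensive connections only
  • --progress shows you the progress of all the files that are being synced
  • --size-only skips any file whose size already matches at the destination, ignoring modification times. By default rsync compares size and modification time; it only compares checksums when you pass --checksum. Be aware that a file whose content changed but whose size did not will not be re-synced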
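The --size-only caveat is easy to check locally. The sketch below assumes rsync is installed and uses throwaway directories created with mktemp (the file and directory names are made up for illustration). It edits a file without changing its size, then syncs it once with --size-only and once with rsync's default comparison:

```shell
#!/usr/bin/env bash
set -u

if ! command -v rsync >/dev/null 2>&1; then
  echo "rsync not found; skipping demo"
  exit 0
fi

work=$(mktemp -d)
mkdir "$work/src" "$work/dest_a" "$work/dest_b"

# Both destinations start out identical to the source.
echo "version 1" > "$work/src/notes.txt"
rsync -a "$work/src/" "$work/dest_a"
rsync -a "$work/src/" "$work/dest_b"

# Change the content but keep the size the same.
sleep 1   # make sure the modification time moves forward
echo "version 2" > "$work/src/notes.txt"

# dest_a: compare by size only. dest_b: default size + mtime check.
rsync -a --size-only "$work/src/" "$work/dest_a"
rsync -a "$work/src/" "$work/dest_b"

after_size_only=$(cat "$work/dest_a/notes.txt")
after_default=$(cat "$work/dest_b/notes.txt")
echo "with --size-only: $after_size_only"
echo "with default:     $after_default"

rm -rf "$work"
```

The --size-only destination keeps the old content, while the default run picks up the edit. If same-size edits are possible in your data, dropping --size-only is probably the safer choice.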

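Note that this sync skips hidden files at the top level of the source directory, because bash expands * before rsync runs and * does not match names that start with a dot. If you want to include hidden files, write the source with a trailing slash instead, like this: /home/kevin/source/. The destination can then be written without its trailing slash: /home/kevin/destination.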
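The hidden-file behaviour comes from the shell, not from rsync. A small sketch with a throwaway directory (all names here are invented for the example) prints the arguments rsync would actually receive for source/*:

```shell
#!/usr/bin/env bash
set -u

src=$(mktemp -d)
touch "$src/index.html" "$src/.htaccess"
mkdir "$src/images"
touch "$src/images/.hidden-in-subdir"

# The shell expands source/* before rsync starts. Top-level dotfiles such as
# .htaccess are not in the list, so rsync never hears about them.
globbed=("$src"/*)
glob_count=${#globbed[@]}
printf 'passed to rsync: %s\n' "${globbed[@]##*/}"

# images/ is passed as a directory, so rsync -a still recurses into it and
# copies images/.hidden-in-subdir. With source/ (trailing slash, no *) rsync
# walks the whole directory itself, and .htaccess is included too.

rm -rf "$src"
```

Only images and index.html are listed; .htaccess is missing, which is why the trailing-slash form is the one to use when dotfiles matter.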

Well, that's it! But read on if you want to learn how to automate this.

Advanced - Automatic Syncing With SSH Keys

Alright, so syncing files on Linux is pretty easy. But what if we want to automate this? How can we keep rsync from asking for a password every time?

There are different ways to go about this, but the one I mostly use is installing SSH keys. By installing your SSH key on the destination server, it will recognize you in the future and permit instant access. So this way we can automate the synchronization with rsync.

Easy Script

I've written another article explaining how to set up SSH keys. It also includes a script that can do all the work for you.

Did It Work?

Open a terminal and type:

$ ssh server.example.com

It should not ask you for any password. Great! This means we can also run rsync without being prompted for a password. If you need more in-depth information on this, I wrote an article on logging in automatically with SSH keys.
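For use inside a script, the same check can be made non-interactive. This is a minimal sketch, assuming OpenSSH's client: BatchMode=yes tells ssh to fail instead of prompting, and ConnectTimeout keeps the check short. The function name is my own invention.

```shell
#!/usr/bin/env bash

# Return 0 only if key-based login to the given host works without a prompt.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
can_login_without_password() {
  local host=$1
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null 2>/dev/null
}

if can_login_without_password server.example.com; then
  echo "key login works"
else
  echo "key login failed; an unattended rsync would fail too" >&2
fi
```

A first-time connection to an unknown host also fails in batch mode, because ssh cannot ask you to confirm the host key. Logging in by hand once, as shown above, takes care of that.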

Let's create a sync script

So now just create a script /root/bin/syncdata.bash

$ $EDITOR /root/bin/syncdata.bash

that contains your rsync command:

#!/usr/bin/env bash
rsync -az --delete /home/kevin/source/ server.example.com:/home/kevin/destination

Save the file, exit the editor, and make it executable like this:

$ chmod +x /root/bin/syncdata.bash
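A script that runs unattended has nobody watching its output, so it can help to record whether each run succeeded. The following is a minimal sketch, not part of the original script: run_logged is a hypothetical helper name and the log path in the usage comment is an assumption.

```shell
#!/usr/bin/env bash

# Run a command, append its output to a log file, then append one summary
# line with a timestamp and the command's exit status.
run_logged() {
  local log=$1; shift
  local status=0
  "$@" >> "$log" 2>&1 || status=$?
  printf '%s exit=%d cmd=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$*" >> "$log"
  return "$status"
}

# In /root/bin/syncdata.bash the rsync line could then become
# (the log path is hypothetical):
#   run_logged /var/log/syncdata.log \
#     rsync -az --delete /home/kevin/source/ server.example.com:/home/kevin/destination
```

rsync exits with 0 on success and a non-zero code on errors, so searching the log for lines that do not contain exit=0 is a quick way to spot failed runs.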

Schedule It to Run Every Hour

And to have your data synchronized every hour, open up your crontab editor:

$ crontab -e

And add this line:

0 * * * * /root/bin/syncdata.bash

(If you need more in-depth information on crontab, I've written another article on scheduling tasks on Linux using crontab.)
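For reference, here are two variations on that crontab line. The log path is an assumption; pick any location root can write to.

```crontab
# m  h  dom mon dow  command

# Every hour on the hour, appending all output to a log file:
0 * * * * /root/bin/syncdata.bash >> /var/log/syncdata.log 2>&1

# Or once a day at 02:30 instead:
30 2 * * * /root/bin/syncdata.bash >> /var/log/syncdata.log 2>&1
```

There is no need to restart cron afterwards: saving and leaving crontab -e installs the new table, and cron picks it up on its own.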

That's it! New and changed files are now copied to server.example.com:/home/kevin/destination every hour. Files that are deleted from /home/kevin/source/ are also deleted at the destination, thanks to the --delete parameter.

Some Extra rsync Command Line Options

Some extra arguments that might come in handy customizing your synchronization job:

  • --delete delete files remotely that no longer exist locally
  • --dry-run show what would have been transferred, but do not transfer anything
  • --max-delete=10 don't delete more than 10 files in one run, safety precaution
  • --delay-updates put all updated files into place at transfer's end, very useful for live systems
  • --compress-level=9 explicitly set compression level 9. Level 0 disables compression
  • --exclude-from=/root/sync_exclude read exclude patterns from the file /root/sync_exclude (one per line). Filenames matching these patterns will not be transferred
  • --bwlimit=1024 limit the transfer rate to at most 1024 kilobytes per second
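Since --delete is the option most likely to hurt, it is worth previewing. The sketch below assumes rsync is installed and works on throwaway local directories (file names are made up). It first previews a deleting sync with --dry-run, then runs it for real with --max-delete as a safety net:

```shell
#!/usr/bin/env bash
set -u

if ! command -v rsync >/dev/null 2>&1; then
  echo "rsync not found; skipping demo"
  exit 0
fi

work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
touch "$work/src/keep.txt" "$work/dest/keep.txt" "$work/dest/stale.txt"

# 1. Preview: -i itemizes each planned change, --dry-run touches nothing.
rsync -ai --delete --dry-run "$work/src/" "$work/dest"
[ -e "$work/dest/stale.txt" ] && stale_after_dry_run=yes || stale_after_dry_run=no

# 2. Real run, refusing to delete more than 10 files in one go.
rsync -a --delete --max-delete=10 "$work/src/" "$work/dest"
[ -e "$work/dest/stale.txt" ] && stale_after_real_run=yes || stale_after_real_run=no

echo "stale.txt present after dry run:  $stale_after_dry_run"
echo "stale.txt present after real run: $stale_after_real_run"

rm -rf "$work"
```

The dry run lists stale.txt as a deletion candidate but leaves it in place; only the second command removes it. The same two-step habit works unchanged against a remote destination.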

Pitfalls

  • Of course you should really be careful where and when to install SSH keys, because if one machine is compromised, it's very easy for a cracker to hop to the next system without logging in. So choose wisely when to use this technology. You might consider 'pulling' the files in from the backup machine if that one is less exposed. This way if your main machine gets hacked, they can't hop to your backup machine.
  • Keys are user-specific. So if you're going to run programs as root that need to log in to systems automatically, you must also install the key as root.

Legacy Comments (35)

These comments were imported from the previous blog system (Disqus).

Andrew

What if I have network drives mapped in my home dir (/home/andrew/remote/movies) and don't want them being copied?

Kevin

@ Andrew: Then you could use the exclude argument:
--exclude='/home/andrew/remote/movies'

Or create a text file like /root/sync_exclude with all the things you want to exclude separated by newlines and then use:
--exclude-from=/root/sync_exclude

Jerry

Hi,

Can you provide some comment on this instruction

rsync -raz --progress --size-only --exclude=/home/[user]/test-rsync/exclude /home/[user]/test-rsync/* [servername]:/home/[user]/network/administrator/test-rsync/

How can I do a correct exclusion?

Thank you.

Kevin

@ Jerry: In this case you're excluding every file/dir that starts with /home/[user]/test-rsync/exclude

So if that's what you want it should work.

Jerry

Hi Kelvin,

I am trying to synchronize

/home/jerry/test-rsync/*

except this folder in

/home/jerry/test-rsync/exclude

which will include all its subfolders and files.

I have tried my command but still the folder "exclude" gets synchronized.

I am thinking could it be my badly written command or wrong parameter placement.

Thanks.

Kevin

maybe try adding an asterisk to the path you want to exclude like this: /home/jerry/test-rsync/exclude/*

Logan

I'm using this command on the OSX Terminal program and everything works except the --delete parameter. I've tried with files and directories and it just doesn't want to delete them.

Kevin

@ Logan: Maybe a rights issue? Otherwise type rsync --help to see if the mac version even supports the --delete option. Maybe it's called differently, I don't really know mac.

bishop

rsnapshot is a great tool for simplifying typical backup scenarios using rsync. With rsnapshot, you can create incremental backups, run remote programs before/after backups, etc.
http://www.rsnapshot.org/

Also, you can drop your key on the remote server using one command. As mentioned on http://www.bytejar.com/

([ -f .ssh/id_rsa.pub ]||ssh-keygen -t rsa) && ssh user@host "([ -d .ssh ]||mkdir -m 700 .ssh) && cat>>.ssh/authorized_keys && chmod 600 .ssh/authorized_keys" < .ssh/id_rsa.pub

# NOTES:
# replace user@host with the remote user and host you want
# the first time run, this will require you authenticate as user@host.
# subsequent times will be password free.
# this requires that you use a Bourne-derivative shell and GNU mkdir

komikers

Great article.. especially the automation.. I want to give it a try later for my backup.
I love rsync too.. great tool..

thanks

http://komikz.blogsite.org
Online Comics..

Steve

The ssh installation section should explain how to send keys to a user@servername.example.com other than the root user.

Your script can do this because it allows input of a second argument (the user name) after the server name. I didn't realize this and used the user@servername.example.com single argument. It doesn't work because the script prepends "root@" thus trying to connect to root@user@servername.example.com

You don't mention the use of two arguments however.

Properly a user should enter:

servername.example.com username

I wrote about this also on the ssh page.

Kev van Zonneveld

@ Steve: Thanks again, I've replaced the outdated text with a link to the other article to avoid duplication.

Jeff

Nice article-

One thing to note: the archive (-a) option for rsync includes the recursive (-r) option, so the added -r on the command line isn't necessary.

Kev van Zonneveld

@ Jeff: You are absolutely right and I have just forgotten to update the article. I will soon come with a new article on some unknown rsync options to make up ;) stay tuned!

RAJENDRAN

i want to sync data from my fileserver to my backup server through network Both also using Ubuntu

RAJENDRAN

i want to sync data from my fileserver (192.168.89.55) to my backup server (192.168.89.100) through network. Fileserver running ubuntu 6.06 server and backup server running ubuntu 8.04 LTS. So, kevin can u give the step to run the sync data for this.....coz i'm new to LINUX and still learning on this.......

Kev van Zonneveld

@ Rajendran: Well it's all in the article, really. On .55 you do something like:

rsync -az /home/kevin/source/ 192.168.89.100:/home/kevin/destination

But if you really don't know what you're doing, maybe it's a bit too soon to be rsyncing.

Mario

When using the -a option the -r option is unnecessary.
So -raz is the same as -az,

Kev van Zonneveld

@ Mario: You are right. Old article this one. Fixed tho, thanks.

Arie

Hello Kevin,

I'm new to all this rsync/linux/cron stuff.
I've been asked to keep two web servers (one master and the other replica) synchronized on a daily basis using rsync.

Can you please explain to me how can I automate rsync to execute once a day, every day of the week (or month) at the same time with an example please?

Thank you,

Kev van Zonneveld

@ Arie: That's the kind of advice you should pay for. You could have a look at my crontab article though.

nelson

Thanks... Its great

r3z0

Thanks a lot ! Very good article...

Poul Anker

Excellent job!
Very good article. Well organized and well written.
Thank you very much for sharing.
Regards
Poul Anker

Mijo

Thanks a lot, easy to use and implement... --delete seems not to work though

mb

A better automation option to crontab would be inotifywait for linux, and building the script around it.

Nessaja

very good tutorial!!!!

Just 1 question,

Lets say the remote server doesn't allow ssh on port 22 but rather on port 2255, how do you specify the correct port for rsync to use?

Damir

--delete does not work because you have to use it like:
rsync [OPTIONS] $SOURCEDIR/ $DESTDIR

so without the * for the source dir and without the '/' for the destination dir

ledong

That's so helpful because I'm looking for a solution to auto-synchronize the file system between web servers. I'll try it. Thanks so much

Muddles

Really useful piece. The problem of trying to sync without hashes for huge directories has flummoxed me for ages. Glad you've helped me find a way around it by comparing on size only.

Thanks.

hozza

How do you know if your rsync fails or gives some kind of output?

Jin Kwon

Great article!!
Please put some 'restarting crontab' for who want some immediate effect.

Mkv

From the man page, actually:

--delete delete extraneous files from dest dirs

So if you are doing a "pull" instead of "push" --delete should work at the local end.

Caifo

Thanks man! you saved my day! I'm migrating a mail server and this made moving the inboxes a lot easier.

Rody Curtis

do you have a page for windows to windows synchronization through Cygwin and Rsync?