Making Rsync Faster

If you Google something along the lines of “make rsync faster”, the most common thing you’ll see is people saying “hey, I have a gigabit network connection and my rsync is crawling along at a hundred kilobytes per second.” Well, the issue here is usually not the network. Rsync needs time to scan source and destination, compare file sizes and timestamps (or checksums, if you ask for them), build a list of stuff to transfer and then, finally, start the copy process, one item at a time. You see the problem, I am sure.

The logical question that comes to mind: can I run multiple rsyncs in parallel? The quick answer is “no”: you give rsync a source, a destination and transfer parameters, and you get what you get. But you can be more creative about how you feed this data to rsync. Here’s an example (and, for the sake of simplicity, both source and destination are NFS-mounted filesystems):

The source directory is /tmp/source NFS-mounted from some remote file server with the following contents:


As you can see, the first level contains a hundred folders. And each folder contains more subfolders and files. In all, more than 11,000 folders and 100,000 files. If you launch rsync with the most common options to sync source to destination (i.e. rsync -a /tmp/source/ /tmp/target/), you are unlikely to get very good throughput. Let’s time the process:
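A minimal sketch of how such a baseline could be measured, assuming rsync is installed. The function name `baseline_sync` is hypothetical, and the paths default to the example above:

```shell
#!/bin/sh
# Baseline: a single rsync process scans the whole tree, then copies serially.
# baseline_sync is a hypothetical helper; paths default to the article's example.
baseline_sync() {
    src="${1:-/tmp/source}"
    dst="${2:-/tmp/target}"
    start=$(date +%s)
    # Trailing slashes: copy the contents of src into dst.
    rsync -a "${src}/" "${dst}/"
    status=$?
    end=$(date +%s)
    echo "single rsync finished in $((end - start)) seconds"
    return "$status"
}
```

Call `baseline_sync` with no arguments to use the example paths. In an interactive shell, `time rsync -a /tmp/source/ /tmp/target/` gives the same measurement with finer resolution.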


So, the whole thing took about six minutes. We can try launching a separate rsync for each of the one hundred first-level folders like so:
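Something along these lines would do it. Treat it as a sketch rather than a polished script: `sync_per_folder` is a hypothetical name, it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find do), and it makes no attempt to limit how many rsyncs start at once:

```shell
#!/bin/sh
# One background rsync per first-level directory, with no concurrency limit.
# sync_per_folder is a hypothetical helper; paths default to the article's example.
sync_per_folder() {
    src="${1:-/tmp/source}"
    dst="${2:-/tmp/target}"
    mkdir -p "$dst"
    # -mindepth 1 skips the source root itself; -maxdepth 1 stops at the first level.
    # Directory names containing newlines are not handled by this simple loop.
    find "$src" -mindepth 1 -maxdepth 1 -type d | {
        while IFS= read -r dir; do
            # No trailing slash on "$dir", so the folder itself is recreated under dst.
            rsync -a "$dir" "$dst/" </dev/null &
        done
        # Wait inside the same subshell that started the background jobs.
        wait
    }
}
```

Calling `sync_per_folder` with no arguments uses /tmp/source and /tmp/target.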


This will run multiple rsyncs in parallel, one for each folder, as defined by the “maxdepth” option for the “find” command. There are a couple of potential issues here. First, if you happen to have some files sitting directly in the source root, above the “maxdepth” level, your rsyncs will miss them. Second, having too many rsyncs running at the same time may simply kill your server. So we need to a) pick up any files located above the “maxdepth” level; and b) introduce some sort of flow control to keep the number of rsync processes in check.
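One way to cover both points, sketched under a few assumptions. `throttled_sync` and the default limit of 10 are made up for illustration, and the flow control here uses `xargs -P` (available in GNU and BSD xargs) rather than polling the process list with ps, so it is a swapped-in technique and not necessarily the approach the original script took:

```shell
#!/bin/sh
# Throttled variant: at most max_jobs rsync processes at a time, plus one pass
# for files that sit directly in the source root.
# throttled_sync and the default of 10 jobs are hypothetical choices.
throttled_sync() {
    src="${1:-/tmp/source}"
    dst="${2:-/tmp/target}"
    max_jobs="${3:-10}"    # assumed limit; tune it for your server
    mkdir -p "$dst"

    # a) Top-level files only: the anchored pattern '/*/' excludes every
    #    first-level directory, so only the root's own files are copied here.
    rsync -a --exclude='/*/' "$src/" "$dst/"

    # b) One rsync per first-level directory; xargs -P caps how many run at once.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$max_jobs" -I{} rsync -a {} "$dst/"
}
```

For example, `throttled_sync /tmp/source /tmp/target 8` would keep no more than eight rsyncs busy at any moment. The right limit depends on how much load the file server and the network can take, so start low and work up.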


And this is how you squeeze the most performance out of rsync and maximize your available bandwidth. Your network admins will love you for this.

  • Mihai Cristian Satmarean says:

    I was looking for this for a million years!

  • Andre ten Bohmer says:

    Thanks! Boosted a cache partition copy from 40 minutes to just under 4 minutes.

  • Darío Fernández says:

    Thanks man! It works great and fast! You saved me a lot of downtime :)

  • Roberto Bauco says:

    slight mod for compatibility for long ps

    while [ `ps -efww | grep -ci rsync` -gt ${maxthreads} ]

  • Ashok Kumar says:

    Does anyone have a similar script for an AIX server?

  • Rare_ONE says:

    Igor, this script is batshit crazy. Way faster than regular rsync running in multiple sessions.

  • clydevargas says:

    Great script! One thing to consider for anyone using this: it might miss some things if you’re trying to keep the source and target identical, as you would with “rsync -a --delete”. For example, if a folder is deleted from /tmp/source, it will remain on /tmp/target. The same goes for a folder added to /tmp/target: it will not get deleted on subsequent runs of the script.

  • magic wed says:

    Too bad if your destination server is an NFS4 server. Trond Myklebust’s rdirplus code patch to NFS4 will make your rsync remote listener take forever to produce a basic list of files on the destination server, and it gets even worse with high-latency networks.

    To avoid Trond’s buggy code, you have to avoid listing files on your destination NFS4 server.

    You could be better off just using a simple cp -rp command. Funny how you can transfer a file a few hundred megabytes in size to and from an NFS4 server in just seconds, but to list a folder containing 50,000 files? Forget it.

