Tuesday, November 20, 2012

How to throttle rsync

I needed to rsync a large set of small files (about half a million) from backend servers to the web front ends. That seemed simple, but after a few test runs I noticed just how much load rsync put on the front ends: the load average climbed to 40 on quad-core VMs, which was obviously bad. The Apache instances suffered as a result, connections piled up, and things quickly got out of hand.

Solution? `ionice`!

Where `nice` adjusts CPU scheduling priority, `ionice` does the same for I/O scheduling, and together they can tame a massive rsync.
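To see what the two wrappers do before pointing them at rsync, here is a small sketch you can run locally. It assumes a Linux box with `ionice` from util-linux installed; the variable name `niceness` is just for illustration. Called with no arguments, `nice` prints the current niceness and `ionice` prints the current I/O scheduling class, so wrapping each tool around itself shows what a wrapped command would inherit.

```shell
#!/bin/sh
# Run `nice` (no arguments = print current niceness) under nice -n19.
# Starting from the default niceness of 0 this reports 19.
niceness=$(nice -n19 nice)
echo "niceness under nice -n19: $niceness"

# Run `ionice` (no arguments = print current I/O class) under the idle class.
# Guarded, because ionice may be missing or not permitted in some environments.
if command -v ionice >/dev/null 2>&1; then
    ionice -c3 ionice || echo "could not set the idle I/O class here"
fi
```

Chaining them as `nice -n19 ionice -c3 some_command` applies both settings to `some_command`.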

Here's what the final command line looked like (initiated from a backend server):
rsync --timeout=480 -z --compress-level=9 --rsync-path="nice -n19 ionice -c3 rsync" --recursive --delete-during --delete-excluded /local/path $REMOTE_SERVER:/remote/path

The magic here is in the `--rsync-path` parameter, which defines the rsync command to run on the remote server. Instead of plain `rsync`, we run a nice'd and ionice'd rsync: `-n19` gives it the lowest CPU priority. Finally, the `-c3` parameter puts rsync in ionice's idle scheduling class, so it only gets disk time when no other process is asking for I/O. This avoids blocking, which is especially important for the Apache processes serving from disk!
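If you run this sync against several front ends, it may help to wrap the command in a small shell function. The following is only a sketch: the function name `throttled_rsync` and the `DRY_RUN` switch are my own inventions for illustration, while the rsync options are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: throttled_rsync SRC DEST
# Set DRY_RUN=1 to print the command instead of running it.
throttled_rsync() {
    src=$1
    dest=$2
    # Command executed on the remote side: lowest CPU priority (nice 19)
    # and idle I/O scheduling class (ionice -c3).
    remote_cmd="nice -n19 ionice -c3 rsync"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "rsync --timeout=480 -z --compress-level=9 --rsync-path=\"$remote_cmd\" --recursive --delete-during --delete-excluded $src $dest"
        return 0
    fi

    rsync --timeout=480 -z --compress-level=9 \
        --rsync-path="$remote_cmd" \
        --recursive --delete-during --delete-excluded \
        "$src" "$dest"
}
```

For example, `DRY_RUN=1 throttled_rsync /local/path "$REMOTE_SERVER:/remote/path"` prints the full command so you can check it before letting it loose on a production front end.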

See more about ionice and nice.