Transferring large files

Sometimes I have to move large files between servers. Most often they’re database backups, and since we have single database tables that can extend to many hundreds of gigabytes, whole database dumps can easily generate files of one or more terabytes.

Even more challenging was a recent need to re-construct historical data from backup drives on a development workstation and then upload the results.

But here’s the wrinkle. Once file sizes move beyond 3-4GB, transfers seem to become a smidgen unreliable, whatever protocol I use.

The answer … a very useful *nix utility called split.

It works like this. First, compress the file using the compression technique of your choice, the only constraint being the ability to expand it again at the other end.

Then break the resulting binary into pieces. For example, suppose I have a large file that I’ve compressed into “sql_dump.sql.zip” in my home folder. Then I would run:


split --bytes=1G /home/peter/sql_dump.sql.zip /home/peter/out/sql

This will create files in the “out” folder (which must already exist, since split won’t create it), named with the prefix “sql” followed by an ordered alpha string, for instance sqlaa, sqlab, sqlac … sqlfg.
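To see the naming scheme without waiting on a multi-gigabyte file, here is a small sketch that runs the same two steps on a throwaway file. The scratch directory, the file name demo.sql, the choice of gzip and the 1M piece size are all my assumptions for illustration, and it assumes GNU split, which is what the --bytes option above implies.

```shell
set -eu

# Scratch directory so nothing real is touched (hypothetical demo paths).
work=$(mktemp -d)
mkdir "$work/out"    # split does not create the output folder itself

# Stand-in for a database dump: about 3.3 MiB of random data.
head -c 3500000 /dev/urandom > "$work/demo.sql"

# Step 1: compress. gzip is only an example; any tool you can reverse will do.
gzip -c "$work/demo.sql" > "$work/demo.sql.gz"

# Step 2: break the compressed file into 1 MiB pieces with the prefix "sql".
split --bytes=1M "$work/demo.sql.gz" "$work/out/sql"

# List the pieces: sqlaa, sqlab, ... with the last one smaller than the rest.
ls -l "$work/out"
```

Random data barely compresses, so this should leave four pieces, three full ones and a shorter final one.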

Each file will be 1GiB in size (split reads “1G” as 1024³ bytes, and the last piece will usually be smaller), a file size that seems to work very reliably with both SFTP and rsync transfers.

Once they have all been successfully transferred, the compressed file can be reconstituted by concatenating the parts like this:


cat /home/peter/out/sql* > /home/peter/sql_dump.sql.zip

Though this sounds convoluted, flaky even, I’ve found it to be 100% reliable.
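If you would rather check than trust, checksums are one way to confirm that the rebuilt file matches the original. This is a sketch rather than part of my routine above: it assumes GNU sha256sum is available on both machines, uses a plain cp as a stand-in for the real SFTP or rsync transfer, and the scratch paths and the manifest name SHA256SUMS are hypothetical.

```shell
set -eu

# Two scratch folders standing in for the source and destination servers.
work=$(mktemp -d)
mkdir -p "$work/src/out" "$work/dst/out"

# Stand-in for the compressed dump: about 2.4 MiB of random data.
head -c 2500000 /dev/urandom > "$work/src/sql_dump.sql.zip"

# Source side: split, then record checksums for the whole file and every piece.
cd "$work/src"
split --bytes=1M sql_dump.sql.zip out/sql
sha256sum sql_dump.sql.zip out/sql* > SHA256SUMS

# Stand-in for the transfer: copy the pieces and the manifest across.
cp out/sql* "$work/dst/out/"
cp SHA256SUMS "$work/dst/"

# Destination side: rebuild the file, then verify it and each piece.
cd "$work/dst"
cat out/sql* > sql_dump.sql.zip
sha256sum --check SHA256SUMS
```

sha256sum --check prints one line per file and exits with a non-zero status if anything fails to match, so a damaged piece can be identified and sent again on its own.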

As is usual for *nix utilities, split can do so much more. To learn more, I recommend a visit to this reference.