Has anyone thought about storing the raw files in git, perhaps on GitHub?
The git protocol is well optimized, so pulling down just the diff each day would take very little bandwidth. Offloading the files to GitHub would also reduce your bandwidth bills.
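To give a rough feel for the bandwidth argument, here is a minimal Python sketch that compares the size of a full dump with the size of a line-based diff between two consecutive days. The sample data and the helper name `diff_size_bytes` are made up for illustration, and this is not how git itself computes deltas (git uses its own pack and delta compression), so treat the numbers as indicative only.

```python
import difflib


def diff_size_bytes(old_text: str, new_text: str) -> int:
    """Return the size in bytes of a unified diff between two dump versions.

    Hypothetical helper for illustration; not part of git or any dump tooling.
    """
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        n=0,  # no context lines, only the changed lines themselves
    )
    return len("".join(diff).encode("utf-8"))


# Assumed sample data: a "dump" of 10,000 records, with one record added the next day.
day1 = "".join(f"record {i}\n" for i in range(10000))
day2 = day1 + "record 10000\n"

full_size = len(day2.encode("utf-8"))   # bytes needed to download the whole dump again
delta_size = diff_size_bytes(day1, day2)  # bytes needed for only the day's changes

print(f"full dump: {full_size} bytes, daily diff: {delta_size} bytes")
```

How much is actually saved depends on how much of the dump changes from day to day and on whether the file format diffs well; a dump that is reordered or recompressed every day would benefit far less.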
I fear it would make life more complicated for us to upload the dump daily. It would also increase our bills, since we do not use up the bandwidth included in our hosting plans but would have to pay for GitHub instead.