georges' blog

September 9, 2012

Keeping All Your Files Synced Across All Your Cloud Storage Providers

Filed under: Technology — kendall @ 6:03 pm
I’ve been checking out a number of cloud storage providers, around a dozen altogether. Not all cloud storage services are created equal. Not all have sync utilities. Not all have clients for both Apple OS X and Windows, or for both iOS and Android. Not all clients let you define precisely which folders are synced, or even allow more than a single folder to be synced.



A level playing field. I’m evaluating several of these services and I want, as much as possible, to give them all an equal footing. My main interest is making sure that they all sync the same folders and files, more or less automatically. The best way to do that would be for all of the clients to use the same folders as their sync folders. That way, if I put a new file in a given folder, it would be synchronized with all of the services simultaneously. The trick is that directing all of the clients to sync the same folder is not easy, at least not for most of them.

Of the services I’m checking out, only Bitcasa and SugarSync allow you to define the exact folders to be synced. These two are probably the underdogs in this competition, though both have great products. The other four services in question are Dropbox, Box, Google Drive, and Microsoft’s SkyDrive, all much bigger players in the cloud storage space. But each of these wants to create its own folder, in a default or user-defined location, for the cloud files. Left to their own devices, these four services would create four folders in, for instance, your user or ~/Documents directory. While it would be easy enough to place all four of these folders in the same location, this is still an unacceptable situation. To sync a single file to all four services, I would have to copy it to all four directories. And to delete the file from all four services, I would have to delete it from all four directories. It would be difficult to guarantee consistency across all four folders by hand, and also difficult to judge the responsiveness of each service, as not all folders would get the file at precisely the same moment.
One could use rsync, a well-known Unix utility, to keep these four folders in sync, but that does not address the problem that you now have four copies of the same file on your system, consuming your disk space that much faster. Uncool.
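
To put a number on that waste, here is a rough Python sketch that adds up the bytes taken by redundant copies across a set of folders. The function name wasted_bytes and the idea of comparing files by SHA-256 digest are my own illustration, not something any of these services or rsync provides.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def wasted_bytes(folders):
    """Return the bytes used by redundant copies of identical files."""
    sizes_by_digest = defaultdict(list)
    for folder in folders:
        for path in Path(folder).rglob("*"):
            # Skip symlinks: they point at a file rather than duplicating it.
            if path.is_file() and not path.is_symlink():
                digest = hashlib.sha256(path.read_bytes()).hexdigest()
                sizes_by_digest[digest].append(path.stat().st_size)
    # For each distinct content, every copy after the first is redundant.
    return sum(sum(sizes[1:]) for sizes in sizes_by_digest.values())
```

Called with the four sync folders, it reports how much space you would get back if each file existed only once. It reads every file in full, so it is only meant for a small test set.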

Enter symbolic links. Unix has a useful feature called symbolic links, or symlinks, that is designed to address this issue, and this was my first strategy for solving the problem. I have a 25 GB partition I created on my MacBook Pro’s hard drive for my cloud storage. I created this partition specifically for Microsoft’s SkyDrive. SkyDrive requires that your partition not be case-sensitive, for compatibility with Windows. I also made it 25 GB, the same as my SkyDrive quota, so that whatever fits on the partition should fit in the cloud. So, starting in my SkyDrive partition, which I have renamed CloudStorage, I created a hidden folder named .sites. You will need to create this folder in Terminal, as Finder does not allow you to create folders whose names start with a period. Within this folder I created a separate folder for each of my cloud storage services. In the root of my CloudStorage folder I created folders for Documents, Photos, Music, etc. that would be shared across all six services. I then created symlinks to each of these folders in each of the cloud services’ folders. I recommend using SymbolicLinker to do this, as it takes the brain work out of getting the syntax correct and can also create all of the symlinks simultaneously. For instructions on downloading, installing and using SymbolicLinker go to http://seiryu.home.comcast.net/~seiryu/symboliclinker.html.
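
If you would rather script this first layout than click through it, the following is a minimal Python sketch of the same idea: real shared folders in the root, and a symlink to each one inside every service’s folder under .sites. The function name build_symlink_layout and the service and folder names you pass in are placeholders of mine, and the sketch stands in for SymbolicLinker rather than describing how it works.

```python
import os
from pathlib import Path


def build_symlink_layout(root, services, shared):
    """Create shared folders in root and link them under .sites/<service>/."""
    root = Path(root)
    created = []
    for name in shared:
        # The real folders live in the root of the volume.
        (root / name).mkdir(parents=True, exist_ok=True)
    for service in services:
        service_dir = root / ".sites" / service
        service_dir.mkdir(parents=True, exist_ok=True)
        for name in shared:
            link = service_dir / name
            if not link.is_symlink() and not link.exists():
                # Relative target: from .sites/<service>/ go up two levels.
                os.symlink(os.path.join("..", "..", name), link)
                created.append(link)
    return created
```

For example, build_symlink_layout("/Volumes/CloudStorage", ["Dropbox", "Box"], ["Documents", "Photos"]) would create four symlinks, assuming the volume is mounted at that (hypothetical) path. Running it a second time creates nothing new.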

At this point, a couple of lessons learned:

  1. Bitcasa and SugarSync can map to any folders, so mapping them to this separate folder and symlink structure was a waste of time, and it introduces complexity and a potential point of failure. Both Bitcasa and SugarSync seemed to follow the symlinks, but again, why take the chance? It was unnecessary. So I map these services directly to the actual folders and not to symlinks.
  2. Of the other four services, only Dropbox will follow symbolic links.

Halfway there. So, I still need a solution for the three remaining services, which will not let you define the exact folders to sync and will not follow symlinks. At this point I spent too much time trying to get hard links to work. My advice to you: don’t bother with hard links. There is a lot of material on the web about hard links, but I could not get them to work, at least not for folders, which is what I needed. This is the expected behavior. Hard links are not supposed to work for directories, so while they may (and should) work for files, they do not solve our problem. I also found a lot of blogs saying that symlinks would work for Google Drive and SkyDrive. While this may work in Windows, there are fundamental differences in the way Windows and OS X handle symlinks, and I could not get it to work for these services or for Box.
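
You can see the hard link behavior for yourself with a short Python sketch like the one below. It uses the standard os.link call; the function name try_hard_links and the file names are made up for the demonstration, and I am assuming a filesystem that supports hard links for files at all.

```python
import os
import tempfile
from pathlib import Path


def try_hard_links(workdir):
    """Hard-link a file and a directory; report whether each attempt worked."""
    workdir = Path(workdir)
    real_file = workdir / "note.txt"
    real_file.write_text("hello")
    real_dir = workdir / "Documents"
    real_dir.mkdir()

    # A hard link to a file is a second name for the same underlying file.
    os.link(real_file, workdir / "note-link.txt")
    file_ok = os.path.samefile(real_file, workdir / "note-link.txt")

    # A hard link to a directory is normally refused by the operating system.
    try:
        os.link(real_dir, workdir / "Documents-link")
        dir_ok = True
    except OSError:
        dir_ok = False
    return file_ok, dir_ok


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(try_hard_links(tmp))
```

The first value says whether the file link points at the same file as the original, the second whether the directory link was allowed.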

Flip it. I had thought of reversing the folder structure. What I mean is, if SkyDrive, for instance, cannot navigate symlinks, then I will put the real folders in the SkyDrive folder and symlinks in the root folder. I will point Dropbox’s symlinks at the real folders in the SkyDrive folder, and direct Bitcasa and SugarSync to sync the real folders in the SkyDrive directory. Several other bloggers had suggested the same thing. The problem with this is that it only solves the problem for one of the three remaining services.

I then slept on the problem. I really wanted to stick with SkyDrive, as it offers the highest free quota of any of the six services in question here. At the same time, there is a lot of interest in Box at my work because of a great deal they offer to higher ed institutions through a partnership with Internet2, so I didn’t want to cut them out of my evaluation. And Google Drive is important for its integration with Google Docs. So I could not just write off any of the three remaining services. Besides, I hate giving up. Then it came to me, literally in a dream: I would nest the folders of the three remaining services. By putting the real folders inside the SkyDrive folder, which would be inside the Google Drive folder, which would be inside the Box folder, all of the files in the sync folders should get synced across all six services simultaneously. It is not neat or elegant. What I mean is that at Google Drive, all of my folders sit in a folder labeled SkyDrive, and at Box my folders are two folders deep, inside Google Drive and SkyDrive folders. This is not ideal, but the important thing is that my files should have an equal opportunity to get synced at all six service providers.
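
To make the nesting concrete, here is a small Python sketch that works out where one local file would show up at each of the three nested services. The folder names Box Documents, Google Drive and SkyDrive come from the summary steps in this post; the /Volumes/CloudStorage mount point and the function name remote_paths are assumptions of mine for illustration.

```python
from pathlib import PurePosixPath

# Assumed mount point; the nested folder names follow the summary steps.
SITES = PurePosixPath("/Volumes/CloudStorage/.sites")
SYNC_ROOTS = {
    "Box": SITES / "Box Documents",
    "Google Drive": SITES / "Box Documents" / "Google Drive",
    "SkyDrive": SITES / "Box Documents" / "Google Drive" / "SkyDrive",
}


def remote_paths(local_file):
    """Map one local file to the path each nested service would show for it."""
    local_file = PurePosixPath(local_file)
    return {
        service: str(local_file.relative_to(sync_root))
        for service, sync_root in SYNC_ROOTS.items()
    }
```

Given a file in the real Documents folder inside SkyDrive, SkyDrive sees it as Documents/..., Google Drive sees it one level deeper under SkyDrive/, and Box sees it two levels deeper under Google Drive/SkyDrive/. That is the "not neat or elegant" part.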

In summary. This is what I did and you can too:

  1. Create a partition for your cloud files, using Disk Utility. I called mine [CloudStorage]. Make sure it is not case-sensitive for compatibility with SkyDrive.
  2. Create a hidden folder in the root of [CloudStorage] for your folders. I named mine “.sites”.
  3. Run Box and define the sync folder as [CloudStorage]/.sites/
  4. Run Google Drive and define the sync folder as [CloudStorage]/.sites/Box Documents/
  5. Run SkyDrive and define the sync folder as [CloudStorage]/.sites/Box Documents/Google Drive/. SkyDrive should create your default folders. If not, create whatever default folders you’d like.
  6. Run Dropbox and define the sync folder as [CloudStorage]/.sites/
  7. Create symlinks for the SkyDrive folders in the Dropbox folder and in the root of [CloudStorage].
  8. Define the sync folders for Bitcasa and SugarSync as the SkyDrive folders.
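
The folder side of the steps above can be sketched in Python as follows. This is only a stand-in: in real life the Box, Google Drive, SkyDrive and Dropbox clients create their own folders when you run them, whereas the sketch creates them itself so that it can be tried in an empty test directory. The function name build_nested_layout, the default list of shared folders and the Dropbox folder name are assumptions of mine.

```python
import os
from pathlib import Path


def build_nested_layout(cloud_root, shared=("Documents", "Photos", "Music")):
    """Create the nested sync folders and the symlinks from steps 2 to 7."""
    cloud_root = Path(cloud_root)
    sites = cloud_root / ".sites"
    # Stand-ins for the folders the sync clients would create themselves.
    skydrive = sites / "Box Documents" / "Google Drive" / "SkyDrive"
    dropbox = sites / "Dropbox"
    dropbox.mkdir(parents=True, exist_ok=True)
    for name in shared:
        real = skydrive / name
        real.mkdir(parents=True, exist_ok=True)
        # Step 7: symlinks in the Dropbox folder and in the volume root.
        for link_dir in (dropbox, cloud_root):
            link = link_dir / name
            if not link.is_symlink() and not link.exists():
                os.symlink(os.path.relpath(real, link_dir), link)
    return skydrive
```

The returned path is the SkyDrive folder holding the real folders, which is what you would point Bitcasa and SugarSync at in step 8. The symlinks use relative targets so that they keep working if the volume is mounted somewhere else.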

That’s it. If you place a file in any of the folders (symlinks) in the root of [CloudStorage], it should sync automatically to all six services. Likewise, if you delete a file from any of these folders, it will be removed from all six services. Similarly, if you upload or delete a file using a mobile client or a web interface for any of the six services, the change will be synced to your local machine and to the five other services, provided your computer is on and you are logged in. So, by the simple act of placing a file in a folder, you can now back that file up six times over, and retrieve it immediately or later from six different places.

An interesting note. I was surprised to find that adding and deleting files using any method did not cause any problems. I thought that having so many services all checking the same folders for changes might create opportunities for one service to step on another, to delete or overwrite a file, but I have not seen that happen.

Update, October 24, 2012: I’ve done this with a couple more services, Pogoplug and CX, using these techniques. So I’ve gotten up to eight different services syncing in this manner. Eight services working like this may be absurd, and six may have already crossed that line. Practically, there may be a point where the services start stepping on each other, though I haven’t seen that yet. The point is that you should be able to do this with the two, three or four services you prefer to use.

This is what the folder structure looks like.

The symlinks look like folders and the rest of the directory structure is hidden.

SymbolicLinker in action.
