Unpatent this: endochronic backup
It seems like backup systems are back in the news - what with all the big players rushing to cash in on the glut of cheap storage, which is causing a glut of multimedia files to be hoarded on hard drives at home by a glut of consumers who are now (or ought to be) wondering what they'll do if they lose the several hundred megabytes of whatever it is on their hard drive. The thing is that few of the really good backup software solutions are targeted at consumers. I'm using Dantz Retrospect, which cost me $80 for up to three computers, and it works - actually, it works well. But it is for people who know what they are doing - I wouldn't hesitate to put it on someone else's system, set it up and let it tick over, sending me (not them) an email if it fails.
The sad fact is that Windows XP has a pretty decent backup system included, but it does two things badly, to the point where I would no longer install it on someone else's computer. Firstly, it doesn't notify you when it fails, except by storing something in the system event log. I don't know about you, but I only ever look in the system event log when stuff goes wrong - which is usually when your machine may be in need of some backup recovery. Not a good time to also discover your backup has been failing for the past three months. Secondly, it's a pain to set up a decent backup regime that can get your files back quickly without doing multiple restores or eating up tons of disk space, yet also allows you to recover deleted stuff from several days, if not weeks, back. Retrospect solved both of these problems for me - it optionally sends notifications by email, and it can do a rolling fast incremental backup as often as you like yet create complete "snapshots" of your disk each time that can be restored without having to restore a base complete backup plus multiple incrementals. Basically you can fill your backup storage to the brim with incrementals and let Retrospect manage when to remove old data. That works for me just fine and really optimizes backup storage with the minimum of fuss.
Anyway, this entry wasn't really going to be all about my home backup scheme. Actually I wanted to tell you about my brilliant new super fast method for multimedia backup. Let's face it, many people have large chunks of disk filled with MP3 files, ripped DVDs and the like, maybe even some adult themed two dimensional byte arrays (otherwise known as pr0n). So how about I create a backup service that works like this: doing a backup simply checksums all your files, and if the service has never seen a checksum before, it makes a copy to a remote location (or you burn a DVD, or write it to a spare disk and mail it to our backup service center). If it has seen that checksum before then no copying is necessary - there's already a remote copy. When your disk crashes you just tell me the checksums of all the files lost or damaged and my backup service will send you the replacement bit patterns.
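To make the scheme concrete, here is a minimal sketch of the checksum-then-copy idea in Python. Everything in it is an assumption for illustration: the BackupService class, its in-memory dictionary standing in for the remote store, and the choice of SHA-256 as the checksum (the scheme above doesn't commit to any particular algorithm).

```python
import hashlib


class BackupService:
    """Hypothetical service: stores one copy of each distinct bit pattern."""

    def __init__(self):
        # checksum (hex string) -> file contents; stands in for remote storage
        self.store = {}

    def backup(self, files):
        """Back up a dict of {filename: bytes}.

        Returns (manifest, uploaded): the manifest maps each filename to its
        checksum, and uploaded lists the files that actually had to be copied.
        """
        manifest = {}
        uploaded = []
        for name, data in files.items():
            digest = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumed choice
            manifest[name] = digest
            if digest not in self.store:
                # Never seen this checksum: a real copy is needed.
                self.store[digest] = data
                uploaded.append(name)
        return manifest, uploaded

    def restore(self, manifest):
        """Given {filename: checksum}, send back the matching bit patterns."""
        return {name: self.store[digest] for name, digest in manifest.items()}


service = BackupService()
tom_manifest, tom_uploaded = service.backup({"stairway.mp3": b"same bits"})
joe_manifest, joe_uploaded = service.backup(
    {"stairwaytoheaven.mp3": b"same bits", "holiday.jpg": b"unique bits"}
)
joe_restored = service.restore(joe_manifest)
```

In this toy run only Joe's holiday.jpg gets copied on his backup; his MP3 has the same checksum as the data Tom already sent, so it is "backed up" without any transfer, and a restore still returns both of his files.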
Now there is a small but nonzero chance the checksum thing may not work, because two different files can produce the same checksum - the probability is really small if the checksum method is good (recently one was found to be broken - I forget if it was MD5 or SHA-1). But if I send you the wrong bit pattern then I'll refund your backup service fees and pay a penalty - if you can prove the data didn't match your original. I'll be interested to know how you'll do that short of producing the original :-)
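For a rough feel of how small "really small" is, the standard birthday-bound approximation says that among n files with b-bit checksums, the chance that any two collide by accident is about 1 - exp(-n(n-1)/2^(b+1)). The sketch below just evaluates that formula; the function name and the file counts are made up for illustration, and it assumes the checksum behaves like a uniformly random function, so it says nothing about deliberately constructed collisions of the kind that break a hash.

```python
import math


def collision_probability(num_files, digest_bits):
    """Approximate chance that at least two of num_files share a checksum.

    Birthday-bound approximation, assuming checksums are uniformly random
    digest_bits-bit values (accidental collisions only).
    """
    pairs = num_files * (num_files - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x
    return -math.expm1(-pairs / 2.0 ** digest_bits)


# Hypothetical scale: a billion stored files with a 128-bit checksum
p_128 = collision_probability(10**9, 128)
# The same formula with a short 32-bit checksum and only 100,000 files
p_32 = collision_probability(100_000, 32)
print(p_128, p_32)
```

With a 128-bit checksum the accidental-collision odds stay negligible even at a billion files, while a 32-bit checksum would more likely than not collide somewhere among a mere hundred thousand files.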
Also there's a chance that people with common files will find they accidentally lose their information quite often. Indeed if those files were, say, MP3s it would be convenient to checksum them, make a backup and then "lose and recover" on demand. I.e. this could be a loophole via which an offline music storage system could be built (possibly) legally. For instance if Joe has an MP3 called stairwaytoheaven.mp3 and checksums it, and I see that that checksum matches the checksum of data already sent to me by Tom, well then I really don't need to be sent ANOTHER copy of the same bit pattern that produced that checksum - do I? Or if you do send it to me, maybe I don't actually need to save a second copy. Even my Retrospect software has that option to not save multiple copies of identical files - and checksums are a widely used method to do that identity comparison. Indeed if a user has a high speed Internet connection that can deliver a backup copy faster than it is played, then why bother even keeping the file stored locally anyway? Just delete it and ask for the backup to be streamed and played on demand.
Now immediately you can imagine RIAA jumping down people's throats for this. They didn't like MP3.com allowing people to insert a CD into their machine and magically "save" it remotely for later streaming to them via the internet. All that was doing was a crude checksum based on track lengths. But where does backup end and copying begin? Surely no one would argue against backup being a fair use (except RIAA and MPAA), otherwise that would prevent me from ever backing up my drive that has all my legally downloaded or ripped songs (or other content). And surely no one is going to try to regulate how a remote, internet based backup service would work or insist it keep multiple copies of identical files. That just doesn't make sense.
So here goes - I launch my music file backup system. I get a few thousand users online and within weeks have made backups of several terabytes of music data containing hundreds of thousands of songs. Pretty soon most users can connect and find that a significant percentage of their music has already been "backed up" with no data transfer necessary. Let's call it endochronic backup - that's reverse time backup, because the file backup is actually complete before you even started the backup. We'll also recommend what bit rates you should rip your music in to guarantee maximum endochronic backup "compatibility", and we'll recommend you avoid ripping music to include DRM signatures, which would minimize (well, eliminate) your chance of an endochronic file transfer. We would probably even charge you extra for any file that actually needed a regular remote file transfer, unless later on we found that file was also common with another user - in which case we'd incrementally refund your initial fee based on how many other users possess the same file.
Other non-music uses are obvious. Say for instance you had a digital family photo album and had shared many of the photos with your family - you don't all need remote backups of the same files. If a large enough set of people are sharing the same files then the average cost per user would be tiny. Say there was a particularly striking hi-res image of someone in a revealing pose that had been widely downloaded by young males, let's say millions of them, and stashed onto hard drives the world over. Total file space used by that image storage, worldwide - terabytes. Total space required to back it up - just one or two times the image size. Total cost to owners of said file to get access to that backup if they lose their copy - effectively zero.
Indeed such a system effectively already exists - most of the peer to peer sharing systems could easily implement it. Except those systems are based on the premise of sharing bit patterns before you actually own an original version. The details of how that original was obtained should be irrelevant to the service on offer. If you rent a space from a storage company and then put copies of copyrighted material in it, is that storage company to blame for what you did? If you put illegally downloaded MP3 files on your hard drive, make a backup copy and put that copy in your bank's safe deposit box, are they to blame for providing that storage?
Really, if you carry the DMCA to its logical conclusion you might say they were, since they are providing a mechanism that aids copyright infringement - under DMCA any storage system, physical or logical, could be faulted, even plain old paper or one's own memory cells. Don't ever hum a tune you hear on the radio, because RIAA might come after you for infringing DMCA... Indeed they might even go after god for creating humans that were clearly one big copyright infringement device waiting to happen.
I'm going to bet that it's only a matter of time before someone tries to implement my endochronic backup system for at least some subset of file types. If the system was operated offshore in some country that thinks DMCA is a crock then there's a good chance it could go live and stay live for some considerable period of time. With sufficient restrictions on how file checksums are obtained and verified, and on how backup copies are restored (e.g. not streamed on demand), there's even a chance it could be kept live legally indefinitely. It might not turn into the ideal music streaming service for consumers, but it could at least perform some useful function.

