For years now, Internet users have accepted the risk that files and content they share through various online services may be subject to takedown requests based on the Digital Millennium Copyright Act (DMCA) and/or content-matching algorithms. But users have also gotten used to treating services like Dropbox as their own private, cloud-based file storage and sharing systems, facilitating direct person-to-person file transfer without having to worry about that kind of interference.

This weekend, though, a small corner of the Internet exploded with concern that Dropbox was going too far, actually scanning users' private and directly peer-shared files for potential copyright issues. What's actually going on is a little more complicated than that, but it shows that sharing a file on Dropbox isn't always the same as sharing that file directly from your hard drive over something like e-mail or instant messenger.

The whole kerfuffle started yesterday evening, when one Darrell Whitelaw tweeted a picture of an error he received when trying to share a link to a Dropbox file via IM. The Dropbox webpage warned him and his friend that "certain files in this folder can't be shared due to a takedown request in accordance with the DMCA."

Whitelaw freely admits that the content he was sharing was a copyrighted video, but he still expressed surprise that Dropbox was apparently watching what he shared for copyright issues. "I treat [Dropbox] like my hard drive," he tweeted. "This shows it's not private, nor mine, even though I pay for it."

In response to follow-up questions from Ars, Whitelaw said the link he sent to his friend via IM was technically a public link and theoretically could have been shared more widely than the simple IM between friends. That said, he noted that the DMCA notice appeared on the Dropbox webpage "immediately" after the link was generated, suggesting that Dropbox was automatically checking shared files somehow to see if they were copyrighted material rather than waiting for a specific DMCA takedown request.

Dropbox did confirm to Ars that it checks publicly shared file links against hashes of other files that have been previously subject to successful DMCA requests. "We sometimes receive DMCA notices to remove links on copyright grounds," the company said in a statement provided to Ars. "When we receive these, we process them according to the law and disable the identified link. We have an automated system that then prevents other users from sharing the identical material using another Dropbox link. This is done by comparing file hashes."

Dropbox added that this comparison happens when a public link to your file is created and that "we don't look at the files in your private folders and are committed to keeping your stuff safe." The company wouldn't comment publicly on whether the same content-matching algorithm was run on files shared directly with other Dropbox users via the service's account-to-account sharing functions, but the wording of the statement suggests that this system only applies to publicly shared links.
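To make the mechanism in Dropbox's statement concrete, here is a minimal Python sketch of how a hash blocklist checked at link-creation time could work. This is not Dropbox's code. The class name ShareLinkService, its methods, and the choice of SHA-256 are all assumptions made for illustration; the company's statement says only that it compares "file hashes."

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Hash the file contents. SHA-256 is an assumption; the article names no algorithm."""
    return hashlib.sha256(data).hexdigest()


class ShareLinkService:
    """Hypothetical model of public links plus a DMCA hash blocklist."""

    def __init__(self) -> None:
        self.blocked_hashes: set[str] = set()  # digests of files hit by takedowns
        self.links: dict[str, str] = {}        # link id -> digest of the shared file

    def create_public_link(self, link_id: str, data: bytes) -> bool:
        """Create a link unless the file's hash matches a previously blocked file."""
        digest = file_digest(data)
        if digest in self.blocked_hashes:
            return False  # sharing refused; the stored file itself is untouched
        self.links[link_id] = digest
        return True

    def process_takedown(self, link_id: str) -> None:
        """Disable the identified link and remember the file's hash."""
        digest = self.links.pop(link_id)
        self.blocked_hashes.add(digest)


if __name__ == "__main__":
    service = ShareLinkService()
    print(service.create_public_link("link-1", b"video bytes"))   # first share is allowed
    service.process_takedown("link-1")                            # a DMCA notice arrives
    print(service.create_public_link("link-2", b"video bytes"))   # identical file is refused
    print(service.create_public_link("link-3", b"video bytes!"))  # any changed byte yields a new hash
```

Note that a check like this only catches byte-for-byte identical files, and it never needs to inspect what the file actually contains, only its hash.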

We should be clear here that Dropbox hasn't removed the file from Whitelaw's account; it just closed off the option for him to share that file with others. In a tweeted response to Whitelaw, Dropbox Support said that "content removed under DMCA only affects share-links." Dropbox explains its copyright policy on a Help Center page that lays out the boilerplate: "you do not have the right to share files unless you own the copyright in them or have been given permission by the copyright owner to share them." The Help Center then directs users to its DMCA policy page.

Dropbox has also been making use of file hashing algorithms for a while now as a means of de-duplicating identical files stored across different users' accounts. That means that if I try to upload an identical copy of a 20GB movie file that has already been stored in someone else's Dropbox account, the service will simply give my account access to the copy it already has rather than making me upload the same data again. This saves not only bandwidth on the user's end but also significant storage space on Dropbox's end.
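The de-duplication idea can be sketched in a few lines of Python. Again, this is an illustrative toy rather than Dropbox's implementation: the DedupStore class, its method names, and the use of whole-file SHA-256 hashes are assumptions, and a real service would check the hash before the data ever leaves the user's machine.

```python
import hashlib


class DedupStore:
    """Hypothetical content-addressed store shared by every account."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}               # digest -> the single stored copy
        self.accounts: dict[str, dict[str, str]] = {}   # account -> {filename: digest}

    def upload(self, account: str, filename: str, data: bytes) -> bool:
        """Add a file to an account. Returns True only if new data had to be stored."""
        digest = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumption
        transferred = digest not in self.blobs
        if transferred:
            self.blobs[digest] = data  # first time this content has been seen
        # Either way, the account just records a reference to the stored copy.
        self.accounts.setdefault(account, {})[filename] = digest
        return transferred

    def read(self, account: str, filename: str) -> bytes:
        """Fetch a file by following the account's reference to the shared copy."""
        return self.blobs[self.accounts[account][filename]]


if __name__ == "__main__":
    store = DedupStore()
    movie = b"x" * 1000  # stand-in for a large movie file
    print(store.upload("alice", "movie.mkv", movie))  # new content, stored once
    print(store.upload("bob", "copy.mkv", movie))     # identical content, nothing new stored
    print(len(store.blobs))                           # number of physical copies kept
```

Because the second account gets access purely on the strength of a matching hash, the design also hints at why researchers raised concerns: knowing a file's hash starts to look a lot like having the file.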

Some researchers have previously warned of security and privacy concerns stemming from these de-duplication efforts, but the open source Dropship project attempted to bend the feature to users' advantage. By making use of the file hashing system, Dropship effectively tried to trick Dropbox into granting access to files on Dropbox's servers that the user didn't actually have access to. Dropbox has taken pains to stop this kind of "fake" file sharing through its service.

In any case, it seems a similar hashing effort is in place to make it easier for Dropbox to proactively check files shared through its servers for matches against content previously blocked by a DMCA request. In this it's not too different from services like YouTube, which uses a robust Content ID system to automatically identify copyrighted material as soon as it's uploaded.

Both Dropbox and YouTube are simply responding to the legal environment they find themselves in. The DMCA requires companies that run sharing services to take reasonable measures to make sure that re-posting of copyrighted content doesn't occur after a legitimate DMCA notice has been issued. In fact, Whitelaw himself doesn't blame the service for taking these proactive steps. "This isn't a Dropbox problem," he told Ars via tweet. "They're just following the laws laid out for them. Was just surprised to see it."

Still, we feel this is important information for Dropbox users to know. There are certain limitations on how accounts can be used. Any Dropbox file shared via a "public link," even if it's a link that you only intend to share with a single person, is being compared against a database of material previously subject to DMCA takedown requests, and it could be blocked on those grounds.