Brightnets

The Owner-Free Filing system has often been described as the first brightnet [9, 10]: a distributed system where no one breaks the law, so no one need hide in the dark.

OFF is a highly connected peer-to-peer distributed file system. Its unique feature is that it stores all of its internal data as meaningless multi-use data blocks. In other words, there is no one-to-one mapping between a stored block and its use in an accessed file. Each stored block is simultaneously used to access many different files. Individually, however, each block is nothing but arbitrary digital white noise.

Owner-Free refers both to the fact that nobody owns the system as a whole and to the fact that nobody can own any of the data blocks stored in it. The latter claim is explained below and supported in the appendix.

Multi-Use Data

It seems highly unlikely to most people that exactly the same data can be used to represent several things at once. But indeed, the same digital representation can be “both a floor wax and a dessert topping.” The reason is purely mathematical.

There are an infinite number of ways to digitally represent any given work.

Every digital representation can be used to perceive an infinite number of works.

The OFF System simply chooses one of the infinite ways that is non-copyrightable. (There must be at least one, or else every possible digital representation would already be copyrighted.) It just so happens that, in most cases, the same representation is already being used for other things. The process is detailed in the next section.

Copyright

No creative works, copyrighted or not, are ever communicated between OFF peers; only meaningless blocks of arbitrary data are. No tangible copies of creative works are ever stored on OFF peers. Doing so is completely unnecessary.

All access to creative works is done exclusively on the user’s local machine. More importantly, no copies of a creative work ever need to be made. All works can be accessed in-place and on-demand by locally resident software in the same way that traditional file servers are used.

Storage of creative works is also done exclusively on the user’s local machine. Uniquely, only a single virtual copy of the creative work is made, directly into OFF’s virtual file server. That is all that is ever needed. As with a traditional file server, that single copy is completely private. It is accessible only through the URL possessed by the person who stored the work. The OFF storage process is logically equivalent to making a single copy to your personal disk or other network attached storage.

Should the person who stored the work want to allow access to others, he simply gives them the URL. The receiver need not make his own copy. He can access the work in place. iTunes uses an identical process to enable friends to play each other’s music without copying, via streaming.

Hence our initial definition of Brightnet: “Nothing shady is going on.”

Ownerless Data

Divorcing data from meaning is usually the hardest leap for people to make. This section explains in common terms why it makes sense to do so with OFF. A more extensive treatment is given in the paper, “On copyrightable numbers with an application to the Gesetzklageproblem,” which is attached as an appendix.

Numbers

A computer file is simply a number. Normally it is a really big number, but it is otherwise just like any other number. It is one more than the previous number and one less than the next.

We often think about it as a sequence of small numbers (bytes) or sometimes a sequence of bits (ones and zeros). However, when you line these up in a sequence they form one big number.

Imagine lining up decimal digits. The sequence five followed by three followed by two is interpreted as 532 (five hundred thirty-two). The same thing happens with binary digits, but the numbers are usually much longer.
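As a minimal sketch of this idea, Python's built-in int.from_bytes reads a sequence of bytes as one big number, and int.to_bytes turns it back. The sample bytes and the choice of big-endian order (first byte most significant) are assumptions made only for illustration.

```python
# Treat the bytes of a tiny, made-up file as one big number.
data = b"OFF"  # illustrative content, not taken from the text

# Big-endian: the first byte is the most significant "digit" in base 256.
number = int.from_bytes(data, "big")

# The same value, spelled out digit by digit in base 256 ('O' = 79, 'F' = 70).
assert number == 79 * 256**2 + 70 * 256 + 70

# The conversion is reversible, so file and number carry the same information.
restored = number.to_bytes(len(data), "big")

# Decimal digits line up the same way: 5, 3, 2 becomes 532.
decimal_value = 0
for digit in [5, 3, 2]:
    decimal_value = decimal_value * 10 + digit
```

A real file works the same way; only the number of digits changes.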

Small Numbers

Why is this important? Well, for every number there are an infinite number of possible representations of that number.

Think of the number twelve (12). It can be represented as five plus seven (5+7), or twenty-five minus thirteen (25-13). In this case the meaning is not in the numbers but in the relationship between the numbers. Taken individually, the numbers 5, 7, 13 and 25 are never 12. And they don’t in any way contain 12.

If for some reason we were to allow 12 to be copyrighted by Britney, she would still have no claim on the numbers 5, 7, 13 and 25. I could still copy these numbers and pass them around as I saw fit. As long as I didn’t copy the number 12, I should have no problems with the law.

So what happens if I transmit the “formula” (5+7)? Am I allowed to do that? What about the formula (25-13)? What if I only transmit (5,7) or (25,13)? What is the “meaning” of these transmissions?

In the abstract, there is no way to know the meaning of any of these transmissions. The interpretation is purely up to the receiver. (5+7) might simply be arithmetic. On the other hand, the ‘+’ sign may not mean plus at all. It may only be a separator as in many web queries. (5,7) may mean a point in space, 57 or 5.7 or any number of possible other interpretations depending on who is looking at it.

There are many legitimate reasons to store or transmit the numbers 5 and 7. As such, the only one who could possibly cause a law to be broken is the receiver. If the receiver reconstructs 12 from any transmitted numbers then perhaps the receiver has broken the law. But then again, perhaps not. If no copy of 12 is made then no copyright law can have been broken. To play a song is not to copy a song, any more than to play a VHS tape is to copy a VHS tape.

Big Numbers

So now let’s translate these principles to big numbers. When we translate something into a computer file we create a sequence of digits that represents the original.

Let’s take a song for example. Let’s say “Lawyers, Guns and Money” is 3MB long. That means the song is three million bytes long, or twenty-four million bits long. This makes a very big number, but it is still a number. As every binary number can be translated into a decimal number, I’ll use decimal numbers to simplify these examples.

Picture the song as this, but much longer.

23332984303829732498…398724

Now there are two other numbers that may be of interest, depending upon how you interpret them. Consider the following big numbers:

11230243302314110327…264211

and

12102741001515622171…134513

Then consider adding them together. Again, there is meaning between the numbers but not in any part of the numbers.
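The addition described here can be sketched in a few lines. This is only an illustration: the helper name split_number is hypothetical, the random choice uses Python's secrets module, and a short byte string stands in for a real song.

```python
import secrets

def split_number(n: int) -> tuple[int, int]:
    """Return two numbers that add up to n; the first is chosen at random."""
    part_a = secrets.randbelow(n + 1)  # any value from 0 to n
    part_b = n - part_a                # whatever is left over
    return part_a, part_b

# A stand-in for the song: some bytes read as one big number.
song = int.from_bytes(b"stand-in for a three megabyte song", "big")

a, b = split_number(song)
a2, b2 = split_number(song)  # a second, unrelated pair for the same number

# Only the relationship between the numbers (their sum) recovers the song.
recovered = a + b
recovered2 = a2 + b2
```

Calling split_number again gives yet another pair, which is the sense in which there is no end to the possible representations.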

Are these numbers copyrighted? Could they be stored on two separate computers? Would that break the law? What if they were never added together? Would their existence still break the law? What if I give you two other numbers? Again, and again…

It turns out these are neither philosophical nor legal questions, but purely mathematical ones. There are two consistent ways to answer the above questions. One leads to the conclusion that “All numbers are already copyrighted.” The other leads to the conclusion that “There exist encodings of copyrighted numbers that are NOT copyrighted.”

If the first conclusion is true, digital copyright is pointless. If the second is true, digital copyright is meaningless.

Multi-Use Numbers

This is the idea at the core of the OFF System. The OFF System then takes it further to show that each of these numbers can be used to access many different things simultaneously. Let’s name these numbers now, and add a couple more.

11230243302314110327…264211 = A

12102741001515622171…134513 = B

47379872610938161983…471179 = C

02810398720484003497…102380 = D

We showed above that (A+B) could represent, “Lawyers, Guns and Money”. Interestingly, at the same time (A+C) could represent, “Oops, I did it again!” Who then owns (A), Warren or Britney? Also (B+D) could represent, “Piano Man”. So who owns (B), Warren or Billy? Both (A & B) participate equally in multiple representations simultaneously.
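A small sketch can make the sharing concrete. The names A, B, C and D mirror the text, but the three byte strings are hypothetical stand-ins for the songs, and choosing A at random is an assumption made for illustration.

```python
import secrets

def as_number(text: bytes) -> int:
    """Read a byte string as one big number (big-endian)."""
    return int.from_bytes(text, "big")

# Stand-ins for three different works (illustrative, not the real songs).
work_1 = as_number(b"first work ")
work_2 = as_number(b"second work")
work_3 = as_number(b"third work!")

# A is picked at random, with no reference to the content of any work.
A = secrets.randbelow(min(work_1, work_2))
B = work_1 - A  # A + B represents the first work
C = work_2 - A  # A + C represents the second work
D = work_3 - B  # B + D represents the third work
```

Here A takes part in two representations and B in two others, so neither number belongs to any one work.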

Since no single person can lay claim to these numbers we call them “ownerless.” An appropriate claim, we think, for numbers which are intrinsically meaningless as well.

Arbitrary Data

The above process is exactly what the OFF System does, but instead of adding, it uses another logical operation called XOR, which simplifies the programming. Otherwise the logic is exactly the same.

Instead of working on whole files, however, OFF works exclusively with fixed length “blocks” of data. Each block is exactly 128KB in size. If a file being stored is longer, it is broken into multiple 128KB blocks. If it is shorter, the block is padded to 128KB with random data.
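To illustrate why XOR is convenient, here is a minimal sketch, not OFF's actual code. The helper name xor_bytes and the sample data are hypothetical.

```python
import secrets

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings, byte by byte."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return bytes(p ^ q for p, q in zip(x, y))

source = b"some source data"  # illustrative content
random_block = secrets.token_bytes(len(source))

# With XOR, "subtracting" and "adding" are the same operation.
other_block = xor_bytes(source, random_block)
rebuilt = xor_bytes(random_block, other_block)
```

Unlike addition, XOR never carries between digits and never changes the length of the data, so every result has exactly the same size as its inputs.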
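The splitting and padding step might look like the following sketch. It assumes that 128KB means 128 × 1024 bytes and that the original file length is recorded elsewhere so the padding can be dropped on reading; neither detail is stated in the text, and the function name split_into_blocks is hypothetical.

```python
import secrets

BLOCK_SIZE = 128 * 1024  # assumption: 128KB taken as 131072 bytes

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into fixed-size blocks, padding the last one with random bytes."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        if len(chunk) < block_size:
            # Fill the remainder of a short block with random data.
            chunk += secrets.token_bytes(block_size - len(chunk))
        blocks.append(chunk)
    return blocks

file_data = bytes(300 * 1024)  # 300KB of zeros as placeholder file content
blocks = split_into_blocks(file_data)

# Dropping the padding requires knowing the original length.
rejoined = b"".join(blocks)[:len(file_data)]
```

In this example the 300KB input yields three blocks: two full ones and a third that is mostly padding.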

These initial source blocks are never stored in the OFF System. Instead, OFF arbitrarily chooses relationships among new or existing blocks that happen to XOR back to the source block.

It stores any new blocks and reuses existing ones. We call each combination of blocks a “tuple.” There are always more possible tuple representations of a given source block than there are possible OFF blocks. This allows us to arbitrarily choose tuples, or to optimize their choice as necessary.
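The tuple idea can be sketched as follows. This is an illustration under stated assumptions rather than OFF's implementation: the block store is a plain Python list, blocks are referred to by list index, two existing blocks are reused per tuple, and the demo uses 16-byte blocks instead of 128KB. The names make_tuple and read_tuple are hypothetical.

```python
import random
import secrets
from functools import reduce

BLOCK_SIZE = 16  # tiny blocks for the demo; the text uses 128KB

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together."""
    return reduce(lambda x, y: bytes(p ^ q for p, q in zip(x, y)), blocks)

def make_tuple(source: bytes, store: list[bytes], reuse: int = 2) -> tuple[int, ...]:
    """Represent source as a tuple of block indices without storing source."""
    # Reuse distinct blocks already in the store, chosen without regard to content.
    picked = random.sample(range(len(store)), reuse)
    # One new block makes the whole tuple XOR back to the source block.
    new_block = xor_blocks(source, *(store[i] for i in picked))
    store.append(new_block)
    return (*picked, len(store) - 1)

def read_tuple(indices: tuple[int, ...], store: list[bytes]) -> bytes:
    """Recover a source block by XORing every block named in the tuple."""
    return xor_blocks(*(store[i] for i in indices))

# Start with a few random blocks, then add two source blocks.
store = [secrets.token_bytes(BLOCK_SIZE) for _ in range(4)]
source_1 = b"first src block!"
source_2 = b"second src block"
tuple_1 = make_tuple(source_1, store)
tuple_2 = make_tuple(source_2, store)
```

The reused indices must be distinct: XORing a block with itself cancels out, which would leave the new block equal to the source block.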

To speed access, OFF spreads each arbitrary block to different servers around the internet. No fancy encryption is needed as each block has no intrinsic meaning. No anonymity is needed as only the accessor knows how the block is being interpreted.

This extends our definition of Brightnet: “No secrecy is needed. Nothing shady is going on.”

This work builds on previous technology pioneered by David Madore in his paper, “Method of free speech on the Internet: random pads” http://www.eleves.ens.fr:8080/home/madore/misc/freespeech.html

[10] In addition, we have since learned of two earlier academic projects that use similar technology: Tangler, http://www.scs.cs.nyu.edu/tangler/; and Dagster, http://historical.ncstrl.org/tr/ps/rice_cs/TR01-380.ps.