Philip Monk ~wicdev-wisryt

Urbit is a principled approach to system design and programming, but it's often not obvious what those principles are. We have specific terms for the different parts of the project: Arvo is the OS, Hoon is the language, Tlon is the company, Azimuth is the identity system, Ames is the network. But the word most associated with the project is "Urbit", and it's not clear what, technically, it refers to. It's only a small stretch to say that "Urbit" is this set of principles, and that anybody who follows these principles strictly will create a system that is isomorphic to Urbit.

Some of these principles are commonly held across software projects, and some are not. Some are only debatably better than the alternatives, but Urbit chooses them exclusively.

We’ll start with a brief description of many principles, then we’ll go into a long-form justification.

Data is better than code. Store data in your state, send data over the wire, dispatch based on data.
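As a rough illustration of "dispatch based on data", here is a minimal Python sketch. The message shape (a `tag` plus a `payload`), the handler names, and the JSON encoding are all assumptions made for the example; nothing here is Urbit's actual wire format.

```python
import json

# Hypothetical handlers. Each takes the old state and a payload (both plain
# data) and returns a new state, without mutating the old one.
def handle_poke(state, payload):
    return {**state, "count": state["count"] + payload["by"]}

def handle_reset(state, payload):
    return {**state, "count": 0}

# The dispatch table is itself data: a mapping from tag to handler.
HANDLERS = {"poke": handle_poke, "reset": handle_reset}

def dispatch(state, wire_text):
    # Only data crosses the wire; the receiver decides what code to run.
    message = json.loads(wire_text)
    handler = HANDLERS[message["tag"]]
    return handler(state, message["payload"])

state = {"count": 0}
state = dispatch(state, json.dumps({"tag": "poke", "payload": {"by": 2}}))
state = dispatch(state, json.dumps({"tag": "reset", "payload": {}}))
```

The sender never ships a closure or an object with behavior attached; it ships a description of what it wants, and the state that results is again plain data.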

Everything should be CQRS.
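CQRS (command query responsibility segregation) separates operations that change state from operations that read it. A minimal sketch, using a hypothetical `Counter` class that is not taken from any Urbit code:

```python
class Counter:
    def __init__(self):
        self._value = 0

    # Command: changes state and returns nothing.
    def increment(self, by):
        self._value += by

    # Query: returns data and changes nothing.
    def value(self):
        return self._value

counter = Counter()
counter.increment(3)
current = counter.value()
```

Because a query never changes anything, it can be repeated, cached, or dropped without consequence; only commands need careful ordering.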

(Almost) Everything should be pubsub.

A subscriber shouldn't affect a publisher.
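One possible reading of this precept, sketched in Python: the publisher never calls into subscriber code, so a slow or broken subscriber cannot stall or corrupt it. The `Publisher` class and its per-subscriber inbox lists are assumptions made for illustration only.

```python
class Publisher:
    def __init__(self):
        self.published = []    # the publisher's own record of updates
        self.inboxes = {}      # subscriber name -> that subscriber's inbox

    def subscribe(self, name):
        # Each subscriber gets its own inbox; nothing about the publisher's
        # behavior changes when a subscriber arrives.
        inbox = []
        self.inboxes[name] = inbox
        return inbox

    def publish(self, update):
        # Publishing only appends data. No subscriber code runs here, so no
        # subscriber can raise into, block, or reorder the publisher.
        self.published.append(update)
        for inbox in self.inboxes.values():
            inbox.append(update)

publisher = Publisher()
alice = publisher.subscribe("alice")
publisher.publish("first")
alice.clear()                  # alice consumes (or discards) her inbox
bob = publisher.subscribe("bob")
publisher.publish("second")
```

Whatever a subscriber does with its own inbox, the publisher's record and the other subscribers' inboxes are unaffected.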

Communication between nodes should be communication between independent actors. Each message should do one complete thing, and there shouldn't need to be a sequence of coupled messages.

Represent your data as closely as possible to the essential structure of the problem.

A client's representation of data should be as close as possible to that of the server. This blurs the distinction between client and server. It allows offline-mode, reduces communication to syncing, and decentralizes.

When mating different paradigms, build one cleanly on top of the other. Never try to make them work on some of the same primitives. Never abuse one to make the other work. For example, ducts in themselves are very general. If you want to do pubsub, that can easily be built on top of ducts, but don't pretend that pubsub is a part of the duct system.

Never misuse an abstraction. An abstraction provides a certain set of tools; use them and only them.

Correctness is more important than performance.

Be simple and uncompromising in defining what's correct; go crazy with optimizations. Nock is a great example of this. It captures the character of the virtual machine, but its asymptotics are bad. Add jets to fix the asymptotics. Another example is the ACID nature of Arvo. Formally, Arvo is a pure function f(logs) of its event log. A naive implementation has very bad asymptotics: processing each new event is O(n) in the number of historical events. Choose the function g(state,log) such that f(logs ++ log) = g(f(logs),log). Then, as long as you keep the state in memory, processing each new event is constant in the number of previous events. This still requires an O(n) restart from disk, but you can also periodically (and non-blockingly) write a checkpoint of the state to disk, so that restart from disk is only linear in the number of events since the last checkpoint.
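The relationship between f, g, and checkpoints can be sketched in a few lines of Python. The transition function here (a running sum) and the `Node` class are stand-ins chosen for the example, and the "disk" is just an in-memory tuple; a real system would persist both the log and the checkpoint.

```python
from functools import reduce

INITIAL = 0

def g(state, event):
    # Hypothetical transition function: the state is a running total.
    return state + event

def f(log):
    # The formal definition: replay the whole log from the initial state.
    # Correct, but O(n) in the length of the log.
    return reduce(g, log, INITIAL)

class Node:
    def __init__(self):
        self.log = []
        self.state = INITIAL
        self.checkpoint = (0, INITIAL)  # (events covered, state at that point)

    def process(self, event):
        # Incremental step: independent of how many events came before.
        self.log.append(event)
        self.state = g(self.state, event)

    def save_checkpoint(self):
        self.checkpoint = (len(self.log), self.state)

    def restart(self):
        # Replay only the events recorded after the last checkpoint.
        covered, state = self.checkpoint
        self.state = reduce(g, self.log[covered:], state)
        return len(self.log) - covered  # number of events replayed

node = Node()
for event in [1, 2, 3]:
    node.process(event)
node.save_checkpoint()
for event in [4, 5]:
    node.process(event)

node.state = None            # simulate losing the in-memory state
replayed = node.restart()    # replays only the two events after the checkpoint
```

The naive f remains the specification: at every point, the optimized state must equal f applied to the full log, and the checkpointed restart must reproduce it.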

Correctness is more important than optimality.

If you don't completely understand your code and the semantics of all the code it depends on, your code is wrong.

Deterministic beats heuristic. Heuristics are evil and should only be used where determinism is infeasible, such as in cache reclamation.

Stateless is better than stateful.

Explicit state is better than implicit state.
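A small Python sketch of the difference, with hypothetical function names: in the first version the state hides in a module-level iterator, and in the second it is passed in and handed back.

```python
import itertools

# Implicit state: the counter lives in a module-level iterator, so two
# identical calls return different results and nothing in the signature says why.
_hidden = itertools.count(1)

def next_id_implicit():
    return f"id-{next(_hidden)}"

# Explicit state: the counter is an argument and part of the return value.
# Given the same input, the function always produces the same output.
def next_id(counter):
    new_counter = counter + 1
    return new_counter, f"id-{new_counter}"

counter = 0
counter, first = next_id(counter)
counter, second = next_id(counter)
```

With the explicit version, the entire state of the system is whatever the caller is holding, which makes it straightforward to store, send, or replay.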

Referential transparency is honesty and stability. Lack of referential transparency and other forms of disingenuousness are some of the world's big problems. Only deviate from referential transparency if absolutely necessary.

Responsibilities should be clearly separated. This applies from kernel modules through network citizens.

Dualities must be faced head-on and analyzed differently at different layers. Statically typed vs. dynamically typed, imperative vs. functional, code vs. data, and effectful vs. pure can all be a matter of perspective, and all relevant perspectives must have coherent answers.

One hundred lines of simplicity is better than twenty lines of complexity. It's not enough for an abstraction to reduce code duplication; it must actually make the code simpler.

Prefer mechanical simplicity to mathematical simplicity. Often mechanical simplicity and mathematical simplicity go together.

The Law of Leaky Abstractions is a lie; abstract airtightly. If your abstractions are leaking, it's not due to some law of the universe; you just suck at abstracting. Usually, you didn't specify the abstraction narrowly enough.