22nd November 2009, 11:41 pm

The central question for me in designing software is always

What does it mean?

With functional programming, this question is especially crisp. For each data type I define, I want to have a precise and simple mathematical model. (For instance, my model for behavior is function-of-time, and my model of images is function-of-2D-space.) Every operation on the type is also given a meaning in terms of that semantic model.

This specification process, which is denotational semantics applied to data types, provides a basis for

correctness of the implementation,

user documentation free of implementation detail,

generating and proving properties, which can then be used in automated testing, and

evaluating and comparing the elegance and expressive power of design decisions.

For an example (2D images), some motivation of this process, and discussion, see Luke Palmer’s post Semantic Design. See also my posts on the idea and use of type class morphisms, which provide additional structure to denotational design.

In spring of 2008, I started working on a functional 3D library, FieldTrip. I’ve designed functional 3D libraries before as part of TBAG, ActiveVRML, and Fran. This time I wanted a semantics-based design, for all of the reasons given above. As always, I want a model that is

simple,

elegant, and

general.

For 3D, I also want the model to be GPU-friendly, i.e., to execute well on (modern) GPUs and to give access to their abilities.

I hadn’t thought of or heard a model that I was happy with, and so I didn’t have the sort of firm ground I like to stand on in working on FieldTrip. Last February, such a model occurred to me. I’ve had this blog post mostly written since then. Recently, I’ve been focused on functional 3D again for GPU-based rendering, and then Sean McDirmid posed a similar question, which got me thinking again.

Geometry

3D graphics involves a variety of concepts. Let’s start with 3D geometry, using a surface (rather than a solid) model.

Examples of 3D (surface) geometry include

the boundary (surface) of a solid box, sphere, or torus,

a filled triangle, rectangle, or circle,

a collection of geometry, and

a spatial transformation of geometry.

First model: set of geometric primitives

One model of geometry is a set of geometric primitives. In this model, union means set union, and spatial transformation means transforming all of the 3D points in all of the primitives in the set. Primitives contain infinitely (even uncountably) many points, so that’s a lot of transforming. Fortunately, we’re talking about what (semantics), and not how (implementation).

What is a geometric primitive?

We could say it’s a triangle, specified by three coordinates. After all, computer graphics reduces everything to sets of triangles. Oops — we’re confusing semantics and implementation. Tessellation approximates curved surfaces by sets of triangles but loses information in the process. I want a story that includes this approximation process but keeps it clearly distinct from semantically ideal curved surfaces. Then users can work with the ideal, simple semantics and rely on the implementation to perform intelligent, dynamic, view-dependent tessellation that adapts to available hardware resources.

Another model of geometric primitive is a function from 2D space to 3D space, i.e., the “parametric” representation of surfaces. Along with the function, we’ll probably want some means of describing the subset of 2D over which the surface is defined, so as to trim our surfaces. A simple formalization would be

type Surf = R2 -> Maybe R3

where

type R -- real numbers
type R2 = (R,R)
type R3 = (R,R,R)
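As a minimal sketch of this trimmed parametric model, here is a unit disk in the z = 0 plane. `Double` stands in for the reals only so that the code runs, and the names `disk` and `defined` are illustrative, not part of any library.

```haskell
-- Double is an approximation of the reals, used only to make the sketch runnable.
type R  = Double
type R2 = (R, R)
type R3 = (R, R, R)

type Surf = R2 -> Maybe R3

-- A unit disk in the z = 0 plane. Parameters outside the unit circle
-- are trimmed away by returning Nothing.
disk :: Surf
disk (u, v)
  | u*u + v*v <= 1 = Just (u, v, 0)
  | otherwise      = Nothing

-- Hypothetical helper: is the surface defined at this parameter?
defined :: Surf -> R2 -> Bool
defined s p = case s p of
  Just _  -> True
  Nothing -> False
```

The `Maybe` result plays the role of the trimming region: the surface is defined exactly where the function returns `Just`.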

For shading, we’ll also need normals, and possibly tangents & bitangents. We can get these features and more by including derivatives, either just first derivatives or all of them. See my posts on derivatives and the paper Beautiful differentiation.
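To illustrate how first derivatives yield normals, here is a small sketch in which the partial derivatives of a paraboloid are written out by hand. That is an assumption made for brevity; it is not the automatic differentiation described in the posts cited above. The names `dU`, `dV`, and `normalAt` are hypothetical.

```haskell
type R  = Double
type R3 = (R, R, R)

-- Cross product of two 3D vectors.
cross :: R3 -> R3 -> R3
cross (a1, a2, a3) (b1, b2, b3) =
  (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)

-- Scale a nonzero vector to unit length.
normalize :: R3 -> R3
normalize (x, y, z) = (x / m, y / m, z / m)
  where m = sqrt (x*x + y*y + z*z)

-- The paraboloid z = u^2 + v^2 and its first partial derivatives,
-- written by hand for this example.
paraboloid, dU, dV :: (R, R) -> R3
paraboloid (u, v) = (u, v, u*u + v*v)
dU (u, _) = (1, 0, 2*u)
dV (_, v) = (0, 1, 2*v)

-- The surface normal is the normalized cross product of the two tangents.
normalAt :: (R, R) -> R3
normalAt p = normalize (dU p `cross` dV p)
```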

In addition to position and derivatives, each point on a primitive also has material properties, which determine how light is reflected by and transmitted through the surface at the point.

type Surf = R2 -> Maybe (R2 :> R3, Material)

where a :> b contains all derivatives (including zeroth) at a point of a function of type a->b. See Higher-dimensional, higher-order derivatives, functionally. We could perhaps also include derivatives of material properties:

type Surf = R2 :~> Maybe (R3, Material)

where a :~> b is the type of infinitely differentiable functions.

Combining geometry values

The union function gives one way to combine two geometry values. Another is morphing (interpolation) of positions and of material properties. What can the semantics of morphing be?

Morphing between two surfaces is easier to define. A surface is a function, so we can interpolate point-wise: given surfaces r and s, for each point p in parameter space, interpolate between (a) r at p and (b) s at p, which is what liftA2 (on functions) would suggest.

This definition works if we have a way to interpolate between Maybe values. If we use liftA2 again, now on Maybe values, then the Just/Nothing (and Nothing/Just) cases will yield Nothing. Is this semantics desirable? As an example, consider a flat square surface with a hole in the middle. One square has a small hole, and the other has a big hole. If the size of the hole corresponds to the size of the portion of parameter space mapped to Nothing, then point-wise interpolation will always yield the larger hole, rather than interpolating between hole sizes. On the other hand, the two surfaces with holes might be Just over exactly the same set of parameters, with the function determining how much the Just space gets stretched.
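The behavior with holes can be seen in a small sketch. It assumes `Double` for the reals and uses hypothetical names `lerp3`, `morph`, and `holed`; the hole is modeled as the part of parameter space mapped to `Nothing`.

```haskell
import Control.Applicative (liftA2)

type R  = Double
type R2 = (R, R)
type R3 = (R, R, R)
type Surf = R2 -> Maybe R3

-- Linear interpolation of two points in R3 (illustrative helper).
lerp3 :: R -> R3 -> R3 -> R3
lerp3 t (x, y, z) (x', y', z') =
  (x + t*(x' - x), y + t*(y' - y), z + t*(z' - z))

-- Point-wise morph. The outer liftA2 works on functions of the parameter,
-- the inner one on Maybe, so a Nothing on either side gives Nothing.
morph :: R -> Surf -> Surf -> Surf
morph t = liftA2 (liftA2 (lerp3 t))

-- A flat surface with a circular hole of radius r cut out of parameter space.
holed :: R -> Surf
holed r (u, v)
  | u*u + v*v < r*r = Nothing
  | otherwise       = Just (u, v, 0)
```

With these definitions, `morph 0.5 (holed 0.1) (holed 0.5)` is undefined everywhere inside the larger hole, whatever the interpolation parameter is.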

One way to characterize this awkwardness of morphing is that the two functions (surfaces) might have different domains. This interpretation comes from seeing a -> Maybe b as encoding a function from a subset of a (i.e., a partial function on a ).

Even if we had a satisfactory way to combine surfaces (point-wise), how could we extend it to combining full geometry values, which can contain any number of surfaces? One idea is to model geometry as a structured collection of surfaces, e.g., a list. Then we could combine the collections element-wise. Again, we’d have to deal with the possibility that the collections do not match up.

Surface tuples

Let’s briefly return to a simpler model of surfaces:

type Surf = R2 -> R3

We could represent a collection of such surfaces as a structured collection, e.g., a list:

type Geometry = [Surf]

But then the type doesn’t capture the number of surfaces, leading to mismatches when combining geometry values point-wise.

Alternatively, we could make the number of surfaces explicit in the type, via tuples, possibly nested. For instance, two surfaces would have type (Surf,Surf).

Interpolation in this model becomes very simple. A general interpolator works on vector spaces:

lerp :: VectorSpace v => v -> v -> Scalar v -> v
lerp a b t = a ^+^ t*^(b ^-^ a)

or on affine spaces:

alerp :: (AffineSpace p, VectorSpace (Diff p)) => p -> p -> Scalar (Diff p) -> p
alerp p p' s = p .+^ s*^(p' .-. p)

Both definitions are in the vector-space package. That package also includes VectorSpace and AffineSpace instances for both functions and tuples. These instances, together with instances for real values suffice to make (possibly nested) tuples of surfaces be vector spaces and affine spaces.
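The following self-contained sketch shows why no extra code is needed for tuples of surfaces. It does not use the vector-space package itself: the class `VS` is a cut-down stand-in written for this example, with the scalar type fixed to `Double`, and the surfaces `plane` and `bump` are hypothetical.

```haskell
-- A cut-down stand-in for a vector space class, scalars fixed to Double.
class VS v where
  (^+^) :: v -> v -> v
  (^-^) :: v -> v -> v
  (*^)  :: Double -> v -> v

infixl 6 ^+^, ^-^
infixr 7 *^

instance VS Double where
  (^+^) = (+)
  (^-^) = (-)
  (*^)  = (*)

instance (VS a, VS b) => VS (a, b) where
  (a, b) ^+^ (a', b') = (a ^+^ a', b ^+^ b')
  (a, b) ^-^ (a', b') = (a ^-^ a', b ^-^ b')
  s *^ (a, b)         = (s *^ a, s *^ b)

instance (VS a, VS b, VS c) => VS (a, b, c) where
  (a, b, c) ^+^ (a', b', c') = (a ^+^ a', b ^+^ b', c ^+^ c')
  (a, b, c) ^-^ (a', b', c') = (a ^-^ a', b ^-^ b', c ^-^ c')
  s *^ (a, b, c)             = (s *^ a, s *^ b, s *^ c)

-- Functions into a vector space form a vector space, point-wise.
instance VS v => VS (u -> v) where
  f ^+^ g = \x -> f x ^+^ g x
  f ^-^ g = \x -> f x ^-^ g x
  s *^ f  = \x -> s *^ f x

lerp :: VS v => v -> v -> Double -> v
lerp a b t = a ^+^ t *^ (b ^-^ a)

type R2 = (Double, Double)
type R3 = (Double, Double, Double)
type Surf = R2 -> R3

plane, bump :: Surf
plane (u, v) = (u, v, 0)
bump  (u, v) = (u, v, 1)

-- A pair of surfaces is interpolated component-wise by the same lerp.
halfway :: (Surf, Surf)
halfway = lerp (plane, plane) (bump, plane) 0.5
```

The instances for functions and tuples compose, so `lerp` applies unchanged to `(Surf, Surf)` or to any nesting of such tuples.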

From products to sums

Function pairing admits some useful isomorphisms. One replaces a product with a product:

(a → b) × (a → c) ≅ a → (b × c)

Using this product/product isomorphism, we could replace tuples of surfaces with a single function from R2 to tuples of R3.
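Both directions of this isomorphism can be written as ordinary Haskell functions. The names `fork` and `unfork` are hypothetical, chosen for this sketch.

```haskell
import Control.Arrow ((&&&))

-- One direction: a pair of functions becomes one function returning a pair.
-- On the function arrow this is uncurry (&&&).
fork :: (a -> b, a -> c) -> (a -> (b, c))
fork = uncurry (&&&)

-- The other direction recovers the two component functions.
unfork :: (a -> (b, c)) -> (a -> b, a -> c)
unfork h = (fst . h, snd . h)
```

Applied to surfaces, `fork` turns a value of type `(Surf, Surf)` into a single function from R2 to a pair of R3 points, and `unfork` turns it back.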

There is also a handy isomorphism that relates products to sums, in the context of functions:

(b → a) × (c → a) ≅ (b + c) → a

This second isomorphism lets us replace tuples of surfaces with a single “surface”, if we generalize the notion of surface to include domains more complex than R2.

In fact, these two isomorphisms are uncurried forms of the general and useful Haskell functions (&&&) and (|||), defined on arrows:

(&&&) :: Arrow (~>) => (a ~> b) -> (a ~> c) -> (a ~> (b,c))
(|||) :: ArrowChoice (~>) => (a ~> c) -> (b ~> c) -> (Either a b ~> c)

Restricted to the function arrow, (|||) == either.
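Here is a sketch of the product/sum isomorphism on the function arrow, used to join two surfaces into one function over a sum domain. The names `join2`, `split2`, `floorS`, `wallS`, and `room` are hypothetical, and `Double` stands in for the reals.

```haskell
type R  = Double
type R2 = (R, R)
type R3 = (R, R, R)

-- Join two functions into one function over a sum domain.
-- On the function arrow, uncurry (|||) is uncurry either.
join2 :: (b -> a, c -> a) -> (Either b c -> a)
join2 = uncurry either

-- The inverse splits such a function back into its two parts.
split2 :: (Either b c -> a) -> (b -> a, c -> a)
split2 h = (h . Left, h . Right)

floorS, wallS :: R2 -> R3
floorS (u, v) = (u, v, 0)
wallS  (u, v) = (0, u, v)

-- Two surfaces as a single "surface" whose domain is Either R2 R2.
room :: Either R2 R2 -> R3
room = join2 (floorS, wallS)
```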

The second isomorphism, uncurry (|||), has another benefit. Relaxing the domain type to allow sums opens the way to other domain variations as well. For instance, we can have types for triangular domains, shapes with holes, and other flavors of bounded and unbounded parameter spaces. All of these domains are two-dimensional, although they may result from several patches.

Our Geometry type now becomes parameterized:

type Geometry a = a -> (R3,Material)

The first isomorphism, uncurry (&&&), is also useful in a geometric setting. Think of each component of the range type (here R3 and Material) as a surface “attribute”. Then (&&&) merges two compatible geometries, including attributes from each. Attributes could include position (and derivatives) and shading-related material, as well as non-visual properties like temperature, elasticity, stickiness, etc.

With this flexibility in mind, Geometry gets a second type parameter, which is the range type. Now there’s nothing left of the Geometry type but general functions:

type Geometry = (->)

Recall that we’re looking for a semantics for 3D geometry. The type for Geometry might be abstract, with (->) being its semantic model. In that case, the model suggests that Geometry have all of the same type class instances that (->) (and its full or partial applications) has, including Monoid, Functor, Applicative, Monad, and Arrow. The semantics of these instances would be given by the corresponding instances for (->). (See posts on type class morphisms and the paper Denotational design with type class morphisms.)
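A sketch of what such an abstract type might look like, for a few of the classes mentioned. The newtype and the semantic function `meaning` are hypothetical names for this example; each instance is defined by delegating to the corresponding instance for functions, so that `meaning` preserves the class structure.

```haskell
-- An abstract Geometry type whose meaning is a plain function.
newtype Geometry a b = Geometry (a -> b)

-- The semantic function: what a Geometry value means.
meaning :: Geometry a b -> (a -> b)
meaning (Geometry f) = f

-- Intended morphism property: meaning (fmap h g) == fmap h (meaning g)
instance Functor (Geometry a) where
  fmap h (Geometry f) = Geometry (fmap h f)

-- Intended: meaning (pure x) == pure x
--           meaning (f <*> g) == meaning f <*> meaning g
instance Applicative (Geometry a) where
  pure x = Geometry (pure x)
  Geometry f <*> Geometry g = Geometry (f <*> g)

-- Intended: meaning (f <> g) == meaning f <> meaning g
instance Semigroup b => Semigroup (Geometry a b) where
  Geometry f <> Geometry g = Geometry (f <> g)

-- Intended: meaning mempty == mempty
instance Monoid b => Monoid (Geometry a b) where
  mempty = Geometry mempty
```

Because every method just wraps the function instance, the morphism properties hold by construction in this sketch; a real implementation would use a different representation and would have to prove them.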

Or drop the notion of Geometry altogether and use functions directly.

Domains

I’m happy with the simplicity of geometry as functions. Functions fit the flexibility of programmable GPUs, and they provide simple, powerful & familiar notions of attribute merging ((&&&)) and union ((|||) / either).

The main question I’m left with: what are the domains?

One simple domain is a one-dimensional interval, say [-1,1].

Two useful domain building blocks are sum and product. I mentioned sum above, in connection with geometric union ((|||) / either). Product combines domains into higher-dimensional domains. For instance, the product of two 1D intervals is a 2D interval (axis-aligned filled rectangle), which is handy for some parametric surfaces.

What about other domains, e.g., triangular, or having one or more holes? Or multi-way branching surfaces? Or unbounded?

One idea is to stitch together simple domains using sum. We don’t have to build any particular spatial shapes or sizes, since the “geometry” functions themselves yield the shape and size. For instance, a square region can be mapped to a triangular or even circular region. An infinite domain can be stitched together from infinitely many finite domains. Or it can be mapped to from a single finite domain. For instance, the function \x -> x / (1 - abs x) maps [-1,1] to [-∞,∞].
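Both routes to an infinite domain can be sketched briefly. The map below uses the form x / (1 - abs x), whose denominator vanishes at both endpoints, so that -1 and 1 are sent to -∞ and +∞. The names `stretch`, `Stitched`, and `position` are hypothetical, and `Double` with IEEE infinities stands in for the extended reals.

```haskell
-- Map the finite interval [-1,1] onto the extended real line.
-- With IEEE doubles, the endpoints go to -Infinity and +Infinity.
stretch :: Double -> Double
stretch x = x / (1 - abs x)

-- Hypothetical stitched domain: one copy of the interval [0,1) per integer.
-- The Integer picks the copy, the Double is the parameter within it.
type Stitched = (Integer, Double)

-- Lay the copies end to end along the real line.
position :: Stitched -> Double
position (n, t) = fromInteger n + t
```

A pair with an `Integer` tag is used here in place of an infinitely nested `Either`, which is one way to represent a sum of infinitely many identical domains.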

Alternatively, we could represent domains as typed predicates (characteristic functions). For instance, the closed interval [-1,1] would be \x -> abs x <= 1. Replacing abs with magnitude (for inner product spaces) generalizes this formulation to encompass [-1,1] (1D), a unit disk (2D), and a unit ball (3D).
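A sketch of the predicate formulation, with a single definition covering the 1D, 2D, and 3D cases. The class `Normed` is a minimal stand-in written for this example, not the inner product space class of any particular package, and `unitBall` is a hypothetical name.

```haskell
-- A domain as a characteristic function on its carrier type.
type Domain a = a -> Bool

-- Minimal stand-in for a type with a Euclidean magnitude.
class Normed v where
  magnitude :: v -> Double

sq :: Double -> Double
sq x = x * x

instance Normed Double where
  magnitude = abs

instance (Normed a, Normed b) => Normed (a, b) where
  magnitude (a, b) = sqrt (sq (magnitude a) + sq (magnitude b))

instance (Normed a, Normed b, Normed c) => Normed (a, b, c) where
  magnitude (a, b, c) =
    sqrt (sq (magnitude a) + sq (magnitude b) + sq (magnitude c))

-- The interval [-1,1], the unit disk, or the unit ball, depending on the type.
unitBall :: Normed v => Domain v
unitBall x = magnitude x <= 1
```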

I like the simple generality of the predicate approach, while I like how the pure type approach supports interpolation and other pointwise operations (via liftA2 etc).

Tessellation

I’ve intentionally formulated the graphics semantics over continuous space, which makes it resolution-independent and easy to compose. (This formulation is typical for 3D geometry and 2D vector graphics. The benefits of continuity apply generally to imagery and to animation/behavior.)

Graphics hardware specializes in finite collections of triangles. For rendering, curved surfaces have to be tessellated, i.e., approximated as collections of triangles. Desirable choice of tessellation depends on characteristics of the surface and of the view, as well as scene complexity and available CPU and GPU resources. Formulating geometry in its ideal curved form allows for automated analysis and choice of tessellation. For instance, since triangles are linear, the error of a triangle relative to the surface it approximates depends on how non-linear the surface is over the subset of its domain corresponding to the triangle. Using interval analysis and derivatives, non-linearity can be measured as a size bound on the second derivative or a range of first derivative. Error could also be analyzed in terms of the resulting image rather than the surface.
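To make the idea of error-driven tessellation concrete, here is a sketch for the simpler case of a 2D curve approximated by line segments. It swaps in a midpoint-sampling heuristic for the interval-analysis bound described above: a segment is accepted when the midpoint of its chord is within a tolerance of the curve at the parameter midpoint. That test can be fooled by curves that happen to cross the chord at the midpoint, so it is an illustration only. All names are hypothetical.

```haskell
type R  = Double
type R2 = (R, R)

-- Euclidean distance between two points.
dist :: R2 -> R2 -> R
dist (x, y) (x', y') = sqrt ((x - x')*(x - x') + (y - y')*(y - y'))

-- Parameters at which to sample curve c over [a,b], so that each chord's
-- midpoint lies within tol of the curve at the parameter midpoint.
-- Recursion depth is capped at 20 as a safety limit.
tessellate :: R -> (R -> R2) -> R -> R -> [R]
tessellate tol c a b = a : go (20 :: Int) a b
  where
    -- go returns the accepted parameters in (lo, hi], in increasing order.
    go depth lo hi
      | depth == 0 || flatEnough = [hi]
      | otherwise = go (depth - 1) lo mid ++ go (depth - 1) mid hi
      where
        mid        = (lo + hi) / 2
        (x0, y0)   = c lo
        (x1, y1)   = c hi
        chordMid   = ((x0 + x1) / 2, (y0 + y1) / 2)
        flatEnough = dist chordMid (c mid) <= tol
```

A straight line needs no subdivision, while a parabola is split more finely as the tolerance shrinks. The same recursive scheme extends to surface patches by splitting a 2D parameter rectangle.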

For a GPU-based implementation, one could tessellate dynamically, in a “geometry shader” or (I presume) in a more general framework like CUDA or OpenCL.

Abstractness

A denotational model is “fully abstract” when it equates observationally equivalent terms. The parametric model of surfaces is not fully abstract in that reparameterizing a surface yields a different function that looks the same as a surface. (Surface reparametrization alters the relationship between domain and range, while covering exactly the same surface, geometrically.) Properties that are independent of particular parametrization are called “geometric”, which I think corresponds to full abstraction (considering those properties as semantic functions).
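A tiny sketch of the reparameterization issue, with hypothetical names and `Double` standing in for the reals. Over the parameter square [0,1] × [0,1], the two functions below cover exactly the same unit square in space, yet they are different functions.

```haskell
type R2 = (Double, Double)
type R3 = (Double, Double, Double)

-- A flat unit square patch over the parameter square [0,1] x [0,1].
patch :: R2 -> R3
patch (u, v) = (u, v, 0)

-- A reparameterization of the domain: a bijection of [0,1] x [0,1] onto itself.
reparam :: R2 -> R2
reparam (u, v) = (u * u, v)

-- Geometrically the same square, but a different function.
patch' :: R2 -> R3
patch' = patch . reparam
```

At the parameter (0.5, 0.5), `patch` gives (0.5, 0.5, 0) while `patch'` gives (0.25, 0.5, 0), so the parametric model distinguishes two values that no rendering of the bare surface would.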

What might a fully abstract (geometric) model for geometry be?
