Looking for a common point

Greetings!

These days I am spending some of my time working on the foundations of, to reveal a possible surprise, an LDAP (Lightweight Directory Access Protocol) implementation for Perl 6.

However, it is still too early to talk about that, so I will keep it under a blanket of mystery for now, as we have another topic: spacecraft!

And the common point between spacecraft and LDAP is this: the LDAP specification uses a notation called ASN.1, which lets you define an abstract type in a specific textual syntax. With the help of an ASN.1 compiler, you can then generate a type definition for a particular programming language and, what’s more, an encoder and a decoder for values of this type. These serialize your value into data that can, for example, be sent over the network and parsed nicely on another computer.

This makes cross-platform types in an application easy. Encoders and decoders can be generated automagically not just for one encoding format, but for a whole range of binary (e.g. BER, PER and others) and textual (e.g. XER, the XML-based one) encoding formats.

So, to get things done, I had to implement at least a subset of ASN.1 in Perl 6: not the full specification, which is big, but only the features used in the LDAP specification.

‘This sounds interesting, but where are our spacecraft!?’, you may ask. It turns out that a Rocket type is the first thing you see on the ASN.1 Playground website, which gives you free access to an ASN.1 compiler that can be used as a reference!

ASN.1 and restrictions

Here is the fancy code:

    World-Schema DEFINITIONS AUTOMATIC TAGS ::=
    BEGIN
      Rocket ::= SEQUENCE
      {
         name      UTF8String (SIZE(1..16)),
         message   UTF8String DEFAULT "Hello World",
         fuel      ENUMERATED {solid, liquid, gas},
         speed     CHOICE
         {
            mph    INTEGER,
            kmph   INTEGER
         }  OPTIONAL,
         payload   SEQUENCE OF UTF8String
      }
    END

Let’s quickly look over this definition:

Rocket is a SEQUENCE: a group of ordered values of some types, which can be seen as a heterogeneous list/array or a class.

Fields name and message have the UTF8String type, which is, yes, one kind of string representation in ASN.1. Field name has a length restriction applied with (SIZE(1..16)), and message has a default value specified with DEFAULT "Hello World".

Field fuel has the ENUMERATED type: it is merely an enumeration of labels to choose from.

Field speed is a CHOICE, a special type that describes a field whose value can be of one of the types specified. Unlike ENUMERATED, its alternatives are not just labels. The OPTIONAL keyword means, as you can guess, that this field may be omitted.

Field payload is a SEQUENCE again, but with an element type specified (SEQUENCE OF UTF8String). It means that we can have as many UTF8String values here as needed.

Here we will apply two important restrictions:

We will use the Basic Encoding Rules (BER): rules that specify how ASN.1 types are encoded into a specific sequence of bytes. As said above, there are different formats, but we will use this one.

The Basic Encoding Rules standard is based on a thing called “TLV encoding”: a value of a type is encoded as a sequence of bytes that represents the “Tag”, “Length” and “Value” of that value. Let’s look at it more closely… in reverse order!

“Value” is the part that contains a byte representation of the value. Every type has its own encoding schema (INTEGER is encoded differently from UTF8String, for example).

“Length” is a number which represents the number of bytes in the “Value” part. This allows us to handle incremental parsing (and the usual kind too!) nicely. It can also take an “indefinite” form, which allows us to stream data whose length is not yet known, but we will leave this aside.

“Tag” is, simply put, a byte or several bytes by which we can determine what type we have at hand. Its exact value is determined by a set of tagging rules (a “tagging schema”) and, for better or worse, different schemas exist.
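To make the three parts concrete, here is a minimal sketch of definite-length TLV encoding. It is written in Python only to keep it independent of the Perl 6 code developed later in this post, and it is not how any particular ASN.1 library does it. The helper names (encode_length, encode_integer_value, encode_tlv) are made up for illustration, and only single-byte tags are handled. The tag values 0x02 (INTEGER) and 0x0C (UTF8String) are the universal tags assigned by the ASN.1 standard.

```python
INTEGER = 0x02      # universal tag for INTEGER
UTF8STRING = 0x0C   # universal tag for UTF8String


def encode_length(n: int) -> bytes:
    """Definite-form length: one byte below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Long form: first byte has the high bit set and counts the length bytes.
    return bytes([0x80 | len(body)]) + body


def encode_integer_value(n: int) -> bytes:
    """INTEGER contents: big-endian two's complement, as short as possible."""
    magnitude = n if n >= 0 else ~n
    return n.to_bytes(magnitude.bit_length() // 8 + 1, "big", signed=True)


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Concatenate Tag, Length and Value (single-byte tags only)."""
    return bytes([tag]) + encode_length(len(value)) + value


if __name__ == "__main__":
    print(encode_tlv(UTF8STRING, "Falcon".encode("utf-8")).hex())
    print(encode_tlv(INTEGER, encode_integer_value(18000)).hex())
```

Note how the same TLV wrapper serves both types: only the “Value” part is type-specific.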

And, if you have been waiting for the second restriction for a few paragraphs already, here it is:

We will use the IMPLICIT tagging schema here. As you can guess, an EXPLICIT tagging schema exists too, along with AUTOMATIC (which is used in the Rocket example above).

Considering this, we need to change the ASN.1 type above into this:

    World-Schema DEFINITIONS IMPLICIT TAGS ::=
    BEGIN
      Rocket ::= SEQUENCE
      {
         name      UTF8String (SIZE(1..16)),
         message   UTF8String DEFAULT "Hello World",
         fuel      ENUMERATED {solid, liquid, gas},
         speed     CHOICE
         {
            mph    [0] INTEGER,
            kmph   [1] INTEGER
         }  OPTIONAL,
         payload   SEQUENCE OF UTF8String
      }
    END

Note that IMPLICIT TAGS is used instead of AUTOMATIC TAGS, and that [$n]-like markers now appear in the speed field.

Without those markers the schema is actually ambiguous, because mph and kmph both have the INTEGER type. So if we have read an INTEGER from a byte stream, was it an mph value or a kmph value? It makes a huge difference if we are talking about spacecraft!

To avoid this confusion, special tags are used, and here we specify which ones we want, because, unlike the AUTOMATIC schema, IMPLICIT does not do it for us.
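As a rough illustration of how the [0] and [1] tags remove the ambiguity, here is a small Python sketch (Python again only for neutrality; the function names are hypothetical). It assumes the usual BER layout of a tag byte, where the top two bits “10” mark a context-specific tag and the low five bits carry the tag number, and it relies on IMPLICIT tagging replacing the universal INTEGER tag rather than wrapping it. Lengths are assumed to fit in one byte.

```python
CONTEXT_SPECIFIC = 0x80  # class bits "10" in the top two bits of the tag byte

# Tag numbers taken from the schema: mph [0], kmph [1]
SPEED_ALTERNATIVES = {0: "mph", 1: "kmph"}


def int_to_bytes(n: int) -> bytes:
    """INTEGER contents: big-endian two's complement, as short as possible."""
    magnitude = n if n >= 0 else ~n
    return n.to_bytes(magnitude.bit_length() // 8 + 1, "big", signed=True)


def encode_speed(unit: str, value: int) -> bytes:
    """Encode one CHOICE alternative; the tag byte says which one it is."""
    number = {name: num for num, name in SPEED_ALTERNATIVES.items()}[unit]
    body = int_to_bytes(value)
    # Short-form length only: integer bodies here are far below 128 bytes.
    return bytes([CONTEXT_SPECIFIC | number, len(body)]) + body


def decode_speed(data: bytes):
    """Return (unit, value), picking the unit from the context-specific tag."""
    tag, length = data[0], data[1]
    if tag & 0xC0 != CONTEXT_SPECIFIC:
        raise ValueError("expected a context-specific tag")
    unit = SPEED_ALTERNATIVES[tag & 0x1F]
    value = int.from_bytes(data[2:2 + length], "big", signed=True)
    return unit, value


if __name__ == "__main__":
    print(encode_speed("mph", 18000).hex())
    print(decode_speed(encode_speed("kmph", 18000)))
```

The integer bytes are identical in both cases; only the first byte differs, and that is all a decoder needs to tell miles from kilometres.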

Gradual building. Question answering.

So, what can we do with all that in Perl 6? Compilers may be fun, but compiling into Perl 6, in an extensible manner, with fancy features included? There has to be something simpler to play with.

Let’s say we have a script that works with spacecraft. Of course, we will need a type to represent them, specifically a class; let’s call it Rocket:

class Rocket {}

Of course, we want to know some data about it:

    class Rocket {
        has $.name;
        has $.message is default("Hello World");
        has $.fuel;
        has $.speed;
        has @.payload;
    }

To make our Rocket definition clearer about what is what, let’s specify some types:

    enum Fuel <Solid Liquid Gas>;

    class Rocket {
        has Str $.name;
        has Str $.message is default("Hello World");
        has Fuel $.fuel;
        has $.speed;
        has Str @.payload;
    }

Now it starts to remind us of something…

Str is similar to UTF8String, except we cannot leave it like that, because in ASN.1 we have not only UTF8String, but also BIT STRING, OCTET STRING and other string types.

The Fuel enum is similar to the ENUMERATED type.

The sigil @ of @.payload tells us it is going to be a sequence, and Str specifies the type of its elements.

But while there are some similarities, there is not enough data for us from the ASN.1 point of view. Let’s resolve the gaps step by step!

How do we know that Rocket is an ASN.1 sequence type at all?

By applying a role: class Rocket does ASNSequence.

How do we know the exact order of the fields?

By implementing a stubbed method from this role: method ASN-order { <$!name $!message $!fuel $!speed @!payload> }.

How do we know that $.speed is optional?

Let’s just apply a trait to it! Traits allow us to execute custom code on parts of a program and, in particular, on Attributes. For example, an imaginary API could look like this: has $.speed is optional.

How do we know what $.speed is?

As the CHOICE type is “special”, but still a first-class one (e.g. you can make it recursive), we need a role here: ASNChoice comes to the rescue.

How do we know which ASN.1 string type our Str is?

Let’s just write has Str $.name is UTF8String;.

How do we specify the default value of a field?

While Perl 6 already has a built-in is default trait, the bad thing for us is that we cannot “nicely” detect it. So we have to introduce yet another custom trait that serves our purposes and applies the built-in trait too: has Str $.message is default-value("Hello World");

Let’s answer all those questions in one go:

    role ASNSequence { #`[ Elves Special Magic Truly Happens Here ] }
    role ASNChoice { #`[ And even here ] }

    class SpeedChoice does ASNChoice {
        method ASN-choice() {
            # Description of: names, tags, types specified by this CHOICE
            { mph => (0 => Int), kmph => (1 => Int) }
        }
    }

    class Rocket does ASNSequence {
        has Str $.name is UTF8String;
        has Str $.message is default-value("Hello World") is UTF8String;
        has Fuel $.fuel;
        has SpeedChoice $.speed is optional;
        has Str @.payload is UTF8String;

        method ASN-order { <$!name $!message $!fuel $!speed @!payload> }
    }

And a value might look something like:

    my $rocket = Rocket.new(
        name    => 'Falcon',
        fuel    => Solid,
        speed   => SpeedChoice.new((mph => 18000)),
        payload => ["Car", "GPS"]);
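For orientation, here is a hand-rolled Python sketch of the bytes such a value could turn into under the two restrictions chosen earlier (BER, IMPLICIT tags). It is not the output of the Perl 6 roles above, whose internals are not shown in this post. The encode_rocket function and its helpers are hypothetical, only short-form lengths are supported, and the sketch assumes a message equal to the default is simply left out of the encoding (BER permits this, but does not require it).

```python
UTF8STRING, ENUMERATED, SEQUENCE = 0x0C, 0x0A, 0x30  # universal tags
FUEL = {"solid": 0, "liquid": 1, "gas": 2}           # ENUMERATED labels, in order
SPEED_TAGS = {"mph": 0x80, "kmph": 0x81}             # context tags [0] and [1]
DEFAULT_MESSAGE = "Hello World"


def tlv(tag: int, value: bytes) -> bytes:
    """Tag + short-form length + value; enough for this small example."""
    if len(value) >= 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([tag, len(value)]) + value


def int_bytes(n: int) -> bytes:
    """Big-endian two's complement, as short as possible."""
    magnitude = n if n >= 0 else ~n
    return n.to_bytes(magnitude.bit_length() // 8 + 1, "big", signed=True)


def encode_rocket(name, fuel, payload, speed=None, message=DEFAULT_MESSAGE):
    """Encode the fields in schema order: name, message, fuel, speed, payload."""
    parts = [tlv(UTF8STRING, name.encode("utf-8"))]
    if message != DEFAULT_MESSAGE:       # assumption: defaults are omitted
        parts.append(tlv(UTF8STRING, message.encode("utf-8")))
    parts.append(tlv(ENUMERATED, int_bytes(FUEL[fuel])))
    if speed is not None:                # OPTIONAL: absent means no bytes at all
        unit, value = speed
        parts.append(tlv(SPEED_TAGS[unit], int_bytes(value)))
    items = b"".join(tlv(UTF8STRING, p.encode("utf-8")) for p in payload)
    parts.append(tlv(SEQUENCE, items))   # SEQUENCE OF UTF8String
    return tlv(SEQUENCE, b"".join(parts))


if __name__ == "__main__":
    encoded = encode_rocket("Falcon", "solid", ["Car", "GPS"], speed=("mph", 18000))
    print(encoded.hex(" "))
```

Every field becomes its own small TLV, and the outer SEQUENCE is just one more TLV whose value is the concatenation of the inner ones.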

The more answers, the more questions

For this tiny example (which nevertheless demonstrates a number of ASN.1 features), this is practically all we need in order to use instances of this class in our application, encoding and decoding them all we want.

So what do the elves secretly do with our data? Let’s find out in the next post!