Introduction

Working with Rebol is a process of discovery. One area I've enjoyed tinkering with is parsing with Rebol's PARSE.

Parse-kit.r

PARSE-KIT.R defines some new functions I was inspired to write during my study of Rewrite and Match, when I wondered whether I could write a self-contained recursive parse rule.

The script was developed with Rebol 2, which I continue to use, but it will probably work with Rebol 3.

Functions to create parse rules

Each of these functions creates a new parse rule. The rules they produce can be used in combination, but note that some are not reentrant.

parsing-at

Creates a rule whereby a block is evaluated to determine the next input position.

The block should return the next input position if the rule should succeed, or NONE or FALSE if the rule should fail.

It's like adding your own input matching keyword:

    odd: parsing-at x [if attempt [odd? x/1] [next x]]
    >> parse [1] [odd]
    == true
    >> parse [2] [odd]
    == false
    >> parse [x] [odd]
    == false
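For comparison, here is a rough hand-written Rebol 2 rule with the same observable behaviour. It is only a sketch of the general idea, not necessarily how parsing-at builds its rule; the words odd-rule, pos, next-pos and guard are made up for illustration and are global here.

```rebol
; Hand-written sketch (assumed names; not parse-kit's implementation).
odd-rule: [
    pos: (
        ; Compute the next position, or NONE when the test fails or errors.
        next-pos: if attempt [odd? pos/1] [next pos]
        ; Pick a rule that either jumps to next-pos or always fails.
        guard: either next-pos [[:next-pos]] [[end skip]]
    )
    guard
]

parse [1] [odd-rule]    ; == true
parse [2] [odd-rule]    ; == false
parse [x] [odd-rule]    ; == false
```

The [end skip] rule can never match, which is the usual Rebol 2 way to force a failure from inside a paren.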

parsing-deep

Creates a rule to perform a recursive search for a pattern. Note that set-words are created as local variables to the rule by default.

This returns true:

    parse [a [[x]]] parsing-deep ['x]
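To show what a recursive search looks like when written by hand, here is a minimal Rebol 2 sketch for the fixed pattern 'x. The name find-x is hypothetical, and the rule that parsing-deep generates may well be structured differently.

```rebol
; Hand-written recursive search for the word X (assumed name FIND-X).
; Try the pattern here, else descend into a nested block, else skip one item.
find-x: [
    'x to end
    | into find-x to end
    | skip find-x
]

parse [a [[x]]] find-x    ; == true
parse [a [[y]]] find-x    ; == false
```

A rule like this has to be written out again for every pattern, which is the repetition a rule-building function removes.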

parsing-expression

Replaces an expression with its evaluation. Returns a parse rule.

Has a simpler replacement algorithm than parsing-rewrite, more suited to template style replacement.

Examples:

    parse block: [now] parsing-expression 'now
    block
    == [17-Jun-2015/14:54:01+10:00]

parsing-rewrite

This is a parse-only replacement for Rewrite.

Creates a rule which rewrites the input according to Patterns and Productions. Patterns are parse rules. Productions are compose blocks.

Example:

    date-rule: parsing-rewrite [
        ['time] [(now/time)]
        ['date] [(now/date)]
    ]
    block: [{Date is} date {time is } time]
    parse block date-rule
    block
    == ["Date is" 15-Jun-2015 "time is " 18:34:24]

parsing-to, parsing-thru

These take an arbitrary parse rule pattern and create a rule to implement a TO or THRU on it. Rebol 3 has a more powerful TO and THRU than Rebol 2 but still cannot take arbitrary rules (or perhaps that's a bug).

This returns true:

    parse [a x 1] parsing-thru ['x integer!]
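Without these functions, the usual Rebol 2 workaround is a small recursive rule per pattern. The sketch below shows that idiom for the pattern ['x integer!]; the names thru-x-int, to-x-int and pos are illustrative only and say nothing about how parse-kit implements its rules.

```rebol
; THRU on a compound pattern: match it here, or skip one item and retry.
thru-x-int: [['x integer!] | skip thru-x-int]

; TO on a compound pattern: as above, but rewind to where the match began.
to-x-int: [pos: ['x integer!] :pos | skip to-x-int]

parse [a x 1] [thru-x-int]                ; == true
parse [a x 1] [to-x-int 'x integer!]      ; == true
parse [a b c] [thru-x-int]                ; == false
```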

parsing-unless

Creates a simple NOT guard rule. Does not consume input.

This returns true:

    not-x: parsing-unless ['x]
    not-y: parsing-unless ['y]
    parse [1] [not-x not-y skip]
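Rebol 2 has no NOT keyword, so a guard like this is normally written by hand by choosing a follow-up rule based on whether the pattern matched. The following is a sketch of that idiom with made-up names (hand-not-x, guard); parsing-unless may produce something different.

```rebol
; Negative guard for the word X (assumed names; GUARD is a global word).
hand-not-x: [
    [
        'x (guard: [end skip])    ; pattern matched: force a failure
        | (guard: [])             ; pattern did not match: succeed, consume nothing
    ]
    guard
]

parse [1] [hand-not-x skip]    ; == true
parse [x] [hand-not-x skip]    ; == false
```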

parsing-when

Creates a simple guard rule. Use it when you want to test a condition and not advance input prior to testing your next rule.

Equivalent to the AND parse keyword in Rebol 3.

    two-ints: parsing-when [2 integer!]
    parse [1 2] [two-ints 2 skip]
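The hand-written Rebol 2 equivalent of such a look-ahead guard saves the input position, tests the pattern, then restores the position. This is a sketch with assumed names (hand-two-ints, pos), not the rule parsing-when actually returns.

```rebol
; Look-ahead guard: match two integers, then rewind to where we started.
hand-two-ints: [pos: 2 integer! :pos]

parse [1 2] [hand-two-ints 2 skip]    ; == true
parse [1 a] [hand-two-ints 2 skip]    ; == false
```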

Parse trees

get-parse

I had previously written load-parse-tree but wanted something that wasn't quite so heavy on stack usage for high-frequency terms. I'm also always playing with the structure of the resulting output, so it's experimental.

GET-PARSE improves on load-parse-tree by:

allowing literals (constant) and terminals (variable length input) to be defined - this should be a bit faster and provide richer output.

outputting a tree with the same structure used by Gabriele Santilli's tree.r (from powermezz).

The structure of a node in the output is:

[node-type parent properties child1 child2 ... childn]

where:

node-type refers to the matched rule, terminal or literal

parent is a reference to the location in the parent node where this child is referenced

properties is a block of key-value pairs
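To make the layout concrete, here is a small two-node tree built by hand in this shape. The node types expr and number and the value property are invented for the illustration; real GET-PARSE output will carry whatever rule names and properties your grammar defines.

```rebol
; Root node: no parent, no properties, children start at index 4.
root: reduce ['expr none copy []]

; Child node with one hypothetical property.
child: reduce ['number none reduce ['value 1]]
append/only root child

; The parent slot holds the position in ROOT where CHILD is referenced.
poke child 2 at root 4

same? child first second child    ; == true
```

Because each child points back into its parent, the structure is self-referential.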

Use this function to make this self-referential structure more understandable:

    prettify-tree: funct [
        {Prettifies a Powermezz tree [node-type parent properties child1 child2 ... childn].}
        tree [block!]
    ] [
        if 4 <= length? tree [
            children: at tree 4
            new-line/all children true
            foreach node children [
                if block? node [prettify-tree node] ; Allows some variation on tree nodes.
            ]
        ]
        tree
    ]

Sometimes you don't need the parent references and the structure is much simplified without them. Use this function to remove them:

    remove-parents: funct [
        {Remove parents from a Powermezz tree [node-type parent properties child1 child2 ... childn].}
        tree [block!]
    ] [
        foreach node at tree 4 [remove-parents node]
        remove at tree 2
        tree
    ]

It's easy enough to add the parent references back into the structure:

    add-parents: funct [
        {Modify structure [node-type properties child1 child2 ... childn] to restore parents to a Powermezz tree.}
        block [block!]
        /parent node [none! block!] {Specify parent node.}
    ] [
        insert/only at block 2 node
        reference: at block 4
        forall reference [
            add-parents/parent reference/1 reference
        ]
        block
    ]

Simple block templating (Impose)

Impose allows some simple templating based on the idea of a custom reduce function where only targeted expressions identified by specific words are evaluated. It could be considered a simple tree rewrite function.

Example:

    impose 'now [time is now]
    == [time is 17-Jun-2015/16:48:36+10:00]

Note:

All expressions identified by the symbol are replaced by impose.

The default evaluation function is do/next, but you can supply a custom evaluation function if your symbols cannot be evaluated by DO.

Multiple words can be handled and paths from the words are recognised:

    a: 1 b: 2 c: 7
    o: make object! [t: now]
    impose [a c o] [a + b c o/t/time]
    == [3 7 18:07:52]

Use a custom template function when you need to control the evaluation:

    dialect: make object! [
        points: funct [block [block!]] [
            collect [foreach x block [keep compose/deep [li (:x)]]]
        ]
        process: funct [
            input
        ] [
            block: none
            parse input [
                '+| 'points set block block! (block: points block) input:
                | 'blah...
            ]
            reduce [block input]
        ]
    ]

    block: [
        p {List:}
        ul +| points [{Eeeny} {Meeny} {Miney} {Mo}]
    ]

    impose/func '+| block (get in dialect 'process)
    block
    == [
        p "List:"
        ul li "Eeeny" li "Meeny" li "Miney" li "Mo"
    ]

Note: If your custom function returns one of the expression symbols as part of its output, make sure the process will terminate, or use the /only refinement, which prevents repeated replacement.

Impose is experimental.