A conversation between the owners of ClubCompy about language design, syntax errors, and testing led to an interesting exchange (lightly edited for coherence):

How do you go about testing order of operations in a language? You need a minimal test driver that takes an expression or series of expressions and verifies that it either parses or produces a syntax error. The test matches part or all of that error. Any given input either parses correctly or produces an error.

Our current test framework cannot "see" when there is a syntax error. We set a flag right before the end of our test programs and test that the flag has the right value.

The most robust strategy I've seen is to add a parse-only stage to the compiler, such that you feed it code and either catch an exception (or a series of errors) or get back a note that everything's okay. You can inspect a tree structure of some kind to verify that it has all of the leaves and branches you expect, but that's fragile: it couples the internals of the parser and compiler and optimizer to the internals of your tests.

Is having a huge battery of little code snippets that run or fail with errors the goal? Ideally there's as little distance between "Here's some language code" and "Here's my expected results" as possible. The less test scaffolding, the better.
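
That parse-only stage needs little more than an eval around the parse. A minimal sketch, assuming a hypothetical ClubCompy::Parser whose parse() method dies with a syntax error message (stubbed here so the example runs standalone):

    #!/usr/bin/perl
    # Sketch of a parse-only test driver. ClubCompy::Parser is a
    # hypothetical stand-in for the compiler's real parse-only stage.
    use strict;
    use warnings;

    package ClubCompy::Parser;    # stub so the sketch runs standalone

    sub parse {
        my ($class, $source) = @_;
        die "syntax error: missing operand\n" if $source =~ /\+\s*\z/;
        return 1;
    }

    package main;

    sub try_parse {
        my ($source) = @_;

        my $parsed = eval { ClubCompy::Parser->parse( $source ); 1 };
        return $parsed ? { ok => 1 } : { ok => 0, error => $@ };
    }

    for my $source ('TOCODE i + 65', 'TOCODE i +') {
        my $result = try_parse( $source );
        print $result->{ok}
            ? "'$source' parses\n"
            : "'$source' fails: $result->{error}";
    }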

I've never been a fan of Behavior Driven Development. I think Ruby's Cucumber is a tower of silly faddishness in software development. (Any time your example walks you through writing regular expressions to parse a subset of English to test the addition of two numbers, close the browser window and ask yourself if slinging coffee for a living is really such a bad idea after all.)

I neither want to maintain nor debug a big wad of cutesy code that exists to force my test suite into "reading like English"—as if the important feature of my test assertions were that they looked like index cards transcribed into code.

Nor do I want to spend my time tweaking a lot of hairy procedural scaffolding to juggle launching a compiler and poking around in its guts for a magic flag so that, a couple of dozen lines of code later, I can finally say whether the 30 characters of line noise I sent to the compiler produced the error message I expected.

I want to write simple test code with minimal scaffolding to highlight the two important attributes of every test assertion:

- Here's what I did
- Here's what I expected to happen

That means I want to write something like:

    parses_ok 'TOCODE i + 65', 'precedence of + should be lower than that of TOCODE';

Instead of:

    Feature: Precedence of keywords and arithmetic operators
      In order to avoid parse errors between keywords and arithmetic operators
      As an expert on parse errors
      I want to demonstrate that keywords bind more tightly to their operands than do operators

      Scenario: TOCODE versus +
        Given code of "TOCODE i + 65"
        When I parse it
        Then the result should parse correctly without error

Which would you rather read, run, and debug?

All of these "DSLs for $foo" jump too far over the line and try to produce the end goal their users need to make for themselves. I don't want a project that attempts to allow me to write my tests in a pidgin form of English (and I get to parse that mess myself, oh joy, because I'm already testing a parser and the best way to test a parser is to write a custom fragile parser for natural language, because debugging that is clearly contributing to real business value).

Ideally, I want to use a library someone else has written that can launch my little compiler and check its results. I want to use this library in my own test suite and have it integrate with everything else in the test suite flawlessly. It should express no opinion about how I manage and arrange and design the entire test suite. It should neither own the world, nor interfere with other tests.

In short, if it has an opinion, it limits that opinion to just a couple of test assertions I can choose to use or not.

In other words, I still want Test::Builder, because T::B lets me decide which abstractions I want or don't want and reuse them as I see fit. After all, good software development means building up the vocabulary and metaphors and abstractions appropriate to the problem you're solving, not adopting a hastily generalized and overextended pidgin and forcing your code into the shapes it demands.
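
As a sketch of what that looks like (ClubCompy::Parser and its parse() method are again hypothetical stand-ins for the real parser entry point), a pair of such assertions built on Test::Builder takes only a few lines:

    package Test::ClubCompy;
    # Sketch: two parser assertions built on Test::Builder.
    # ClubCompy::Parser->parse() is a hypothetical stand-in that
    # dies with a syntax error message on bad input.
    use strict;
    use warnings;

    use Test::Builder;
    use Exporter 'import';
    our @EXPORT = qw( parses_ok parse_error_like );

    my $Test = Test::Builder->new;

    # passes when the code parses without error
    sub parses_ok {
        my ($code, $description) = @_;

        my $parsed = eval { ClubCompy::Parser->parse( $code ); 1 };
        my $ok     = $Test->ok( $parsed, $description );
        $Test->diag( "Unexpected parse error: $@" ) unless $parsed;
        return $ok;
    }

    # passes when parsing fails with an error matching the regex
    sub parse_error_like {
        my ($code, $expected, $description) = @_;

        eval { ClubCompy::Parser->parse( $code ) };
        return $Test->like( $@, $expected, $description );
    }

    1;

Because Test::Builder hands every module the same singleton, parses_ok and parse_error_like count toward the same plan and interleave freely with Test::More's ok() and is(). The library owns two assertions, not the world.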

If I'm going to have to write code to manage my tests anyway, I'll make the input and expected output prominent, not bury them in a boilerplate pattern of repetition I have to parse away anyhow.
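
With assertions like those sketched above, a precedence test file reduces to exactly that, input and expectation and nothing else (the error text matched here is hypothetical):

    #!/usr/bin/perl
    # Sketch of a test file using the assertions sketched earlier;
    # the error message matched by the regex is hypothetical.
    use strict;
    use warnings;

    use Test::More;
    use Test::ClubCompy;    # the sketched assertion module

    parses_ok 'TOCODE i + 65',
        'precedence of + should be lower than that of TOCODE';

    parse_error_like 'TOCODE i +', qr/missing operand/,
        'a trailing + with no operand should fail to parse';

    done_testing();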