Everybody loves ANTLR, but sometimes it may be overkill. On the other hand, a regular expression sometimes just doesn’t cut it, or it becomes too complicated to maintain. What can a developer do in such cases? Use Sprache. As its creators say:

Sprache is a simple, lightweight library for constructing parsers directly in C# code. It doesn’t compete with “industrial strength” language workbenches – it fits somewhere in between regular expressions and a full-featured toolset like ANTLR.

It is a simple but effective tool, whose main limitation is being character-based. In other words, it works on characters and not on tokens. The advantage is that you can work directly with code and you don’t have to use external tools to generate the parser.

The guessing game

You can look at the project website for specific real-world uses; let’s just say that it’s even credited by ReSharper and that it was created more than six years ago, so it’s stable and quite good. It’s ideal for dealing with error messages produced by other tools, reading logs, parsing queries like the ones you would use for a simple search library, or reading simple formats like JSON. In this article we will create a parser for a simple guessing game. We will use .NET Core and xUnit for the unit tests, so it will also work on Linux and Mac.

The objective of the game is to guess a number, and to do that you can ask if the number is greater than a number, less than a number or between two numbers. When you are ready to guess you simply ask if it’s equal to a certain number.

Set up the project

We will use VSCode instead of Visual Studio, but in the GitHub repository you will find two projects, one for each. This is because there are still some compatibility quirks related to project.json and the different versions of the .NET Core tools used by Visual Studio and by the standalone command line. To clarify, the project.json generated by the standalone .NET Core command line also works with Visual Studio, but not vice versa (this might have changed by the time you read this). Also, with two projects you can easily see how Visual Studio integrates xUnit tests. The C# code itself is the same.

{ "projects": [ "src", "test" ] }

Create the file global.json, with the content shown above, in the directory of your project, in our case SpracheGame. Then create another SpracheGame folder inside src and a SpracheGame.Tests folder inside test. Inside the nested SpracheGame folder you can create a new .NET Core program with the usual:

dotnet new

While you are inside the SpracheGame.Tests folder you can create an xUnit test project with:

dotnet new -t xunittest

You can see the final structure here.

Change both project.json files, adding Sprache as a dependency of the main project:

"dependencies": { "Sprache": "2.1.0" },

…and add the main project as a dependency for the xUnit test project.

"dependencies": {
    "System.Runtime.Serialization.Primitives": "4.3.0",
    "xunit": "2.1.0",
    "dotnet-test-xunit": "1.0.0-rc2-192208-24",
    "SpracheGame": {
        "target": "project"
    }
},

If you are using Visual Studio you may need to add a runtimes section to both of your project.json files:

"runtimes": {
    "win7-x64": {},
    "win8-x64": {},
    "win81-x64": {},
    "win10-x64": {}
}

See the .NET documentation for .NET Core Runtime IDentifier (RID) catalog if you need to know other platform IDs.

Create GameParser

Let’s start by creating a class called GameParser and by recognizing numbers and commands.

using Sprache;

public class GameParser
{
    // One or more digits, collected into a string; Token() skips surrounding whitespace.
    public static Parser<string> Number =
        Parse.Digit.AtLeastOnce().Text().Token();

    // The more specific "<>" has to be tried before the single '<'.
    public static Parser<Command> Command =
        Parse.Char('<').Then(_ => Parse.Char('>'))
            .Return(SpracheGameCore.Command.Between)
        .Or(Parse.Char('<')
            .Return(SpracheGameCore.Command.Less))
        .Or(Parse.Char('>')
            .Return(SpracheGameCore.Command.Greater))
        .Or(Parse.Char('=')
            .Return(SpracheGameCore.Command.Equal));

The Number parser comes first: we start with Sprache.Parse and pick Digit, of which there must be at least one, then we convert from IEnumerable<char> to string with Text(), and finally we discard the surrounding whitespace with Token(). So first we choose the type of character we need, in this case Digit, then we set a quantity modifier and transform the result into something more manageable. Notice that we return a Parser<string> and not an int.
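Since Number yields a string, the conversion to int has to happen somewhere else (in this article it happens later, in the Play constructor). As a minimal sketch, assuming you would rather convert inside the parser, Sprache's Select() can map the matched text to an int. The names NumberSketch and IntNumber are hypothetical and not part of the article's project.

```csharp
using System;
using Sprache;

public static class NumberSketch
{
    // Same steps as GameParser.Number: one or more digits, joined into a string,
    // with surrounding whitespace skipped by Token().
    public static readonly Parser<string> Number =
        Parse.Digit.AtLeastOnce().Text().Token();

    // Hypothetical variant: map the matched text to an int inside the parser.
    // int.Parse can still throw OverflowException for very long digit runs.
    public static readonly Parser<int> IntNumber =
        Number.Select(text => int.Parse(text));

    public static void Main()
    {
        Console.WriteLine(Number.Parse("  42 "));      // the string "42", whitespace dropped
        Console.WriteLine(IntNumber.Parse("100") + 1); // arithmetic works on the int result
    }
}
```

Keeping the parser's result as a string, as the article does, leaves the decision about invalid or oversized numbers to the code that consumes the result.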

In the Command parser we first tell Sprache to look for a ‘<’ character followed by a ‘>’, using Then(). We return an enum value instead of a simple string. We can easily check for different options with Or(), but it’s important to remember that, just as in ANTLR, the order matters. We have to put the more specific case first, otherwise the generic one would match and the correct case would never be reached.
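To see why the order matters, here is a small sketch that builds the same alternatives in both orders. It is only an illustration: the Op enum and the OrderSketch class are hypothetical stand-ins for the article's types, and Parse.String("<>") is used in place of the Then() chain for brevity.

```csharp
using System;
using Sprache;

public enum Op { Less, Greater, Between }

public static class OrderSketch
{
    // Specific case first: "<>" is tried before the single '<'.
    public static readonly Parser<Op> SpecificFirst =
        Parse.String("<>").Return(Op.Between)
            .Or(Parse.Char('<').Return(Op.Less));

    // Generic case first: '<' already succeeds on "<>",
    // so the second branch is never tried.
    public static readonly Parser<Op> GenericFirst =
        Parse.Char('<').Return(Op.Less)
            .Or(Parse.String("<>").Return(Op.Between));

    public static void Main()
    {
        Console.WriteLine(SpecificFirst.Parse("<>")); // Between
        Console.WriteLine(GenericFirst.Parse("<>"));  // Less: the '>' is left unconsumed
    }
}
```

Or() only falls through to the second parser when the first one fails, which is why the wrong order silently returns Less instead of raising an error.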

    // Most plays are "<command> <number>"; a between play is "<number> <> <number>".
    public static Parser<Play> Play =
        (from action in Command
         from value in Number
         select new Play(action, value, null))
        .Or(from firstValue in Number
            from action in Command
            from secondValue in Number
            select new Play(action, firstValue, secondValue));
}

Now we have to combine these two simple parsers into one, Play, and thanks to the LINQ-like syntax the task is very simple. Most commands require only one number, but there is one that requires two, because we have to check whether the number to guess is between two given numbers. It also has a different structure: first there is a number, then the symbol, and finally the second number. This is a more natural syntax for the user than a ‘<>’ symbol followed by two numbers. As you can see, the code is quite simple: we gather the elements with from .. in .. and then we create a new object with select.
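The query syntax is just another way of chaining parsers one after the other. The sketch below, which is an illustration and not code from the project, writes the same sequence once with from .. in .. select and once with Then() and Select(). QuerySketch, Symbol, WithQuery and WithThen are hypothetical names, and the result is a plain string rather than a Play to keep the example self-contained.

```csharp
using System;
using Sprache;

public static class QuerySketch
{
    static readonly Parser<string> Number =
        Parse.Digit.AtLeastOnce().Text().Token();

    static readonly Parser<char> Symbol =
        Parse.Char('<').Or(Parse.Char('>')).Or(Parse.Char('=')).Token();

    // Query syntax: each "from" runs the next parser on the remaining input.
    public static readonly Parser<string> WithQuery =
        from symbol in Symbol
        from value in Number
        select symbol + ":" + value;

    // The same sequence written with Then() and Select().
    public static readonly Parser<string> WithThen =
        Symbol.Then(symbol => Number.Select(value => symbol + ":" + value));

    public static void Main()
    {
        Console.WriteLine(WithQuery.Parse("> 50")); // >:50
        Console.WriteLine(WithThen.Parse("> 50"));  // >:50
    }
}
```

With more than two elements the nested lambdas of the Then() version grow quickly, which is where the query syntax pays off.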

It’s time for Play

using System;

public class Play
{
    readonly Command _command;
    readonly int _firstNumber;
    readonly int _secondNumber;

    public Play(Command command, string firstNumber, string secondNumber)
    {
        _command = command;

        // The parser only lets digits through, but the value can still overflow an int.
        if (!int.TryParse(firstNumber, out _firstNumber))
            throw new ArgumentException("Not a valid integer.", nameof(firstNumber));

        // The second number is only present for a between play.
        if (secondNumber != null)
        {
            if (!int.TryParse(secondNumber, out _secondNumber))
                throw new ArgumentException("Not a valid integer.", nameof(secondNumber));
        }
    }

    public Command Command { get { return _command; } }
    public int FirstNumber { get { return _firstNumber; } }
    public int SecondNumber { get { return _secondNumber; } }

    // Checks the play against the number to guess.
    public bool Evaluate(int number)
    {
        bool result = false;

        switch (Command)
        {
            case Command.Greater:
                result = number > FirstNumber;
                break;
            case Command.Less:
                result = number < FirstNumber;
                break;
            case Command.Between:
                // Both bounds are excluded.
                result = (number > FirstNumber) && (number < SecondNumber);
                break;
            case Command.Equal:
                result = number == FirstNumber;
                break;
        }

        return result;
    }
}

The only interesting part of the Play class is the Evaluate function, where the “magic” happens, and I use the term magic extremely loosely. The number to guess is passed to the function, which checks it against the command and the numbers of the specific play that we are evaluating. Note that Between excludes both bounds.

Unit Tests are easy

There are basically no disadvantages in using xUnit for our unit tests: it’s compatible with many platforms, it’s still integrated with the Visual Studio Test Explorer and it also has a special feature: theories. A theory is a special kind of test that allows you to supply multiple inputs to a single test. The [Theory] and [InlineData] attributes on CanParseNumbers show exactly how you can do it. In our case we are testing that our parser can parse numbers with many digits.

using Sprache;
using Xunit;

public class GameParserTests
{
    // Runs once for each InlineData value.
    [Theory]
    [InlineData("1")]
    [InlineData("10")]
    [InlineData("100")]
    public void CanParseNumbers(string value)
    {
        var number = GameParser.Number.Parse(value);

        // Assert.Equal takes the expected value first, then the actual one.
        Assert.Equal(value, number);
    }

    [Fact]
    public void CanParseGreaterCommand()
    {
        Command command = GameParser.Command.Parse(">");

        Assert.Equal(Command.Greater, command);
    }

    [...]

    [Fact]
    public void FailParseWrongPlay()
    {
        Assert.Throws<ParseException>(() => GameParser.Play.Parse("> Number"));
    }
}

The following test is a typical one: we are checking that the symbol ‘>’ is correctly parsed as a Command.Greater. In FailParseWrongPlay we make sure that a ParseException is thrown if we encounter an incorrect Play. Sprache also lets you use TryParse instead of Parse if you don’t want an exception to be thrown. As you can see, the simplicity of the tool makes it very easy to test.
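As a rough sketch of the TryParse alternative: instead of a value it returns a result object that can be inspected for success, the parsed value or an error message. TryParseSketch and Describe are hypothetical names used only for this illustration.

```csharp
using System;
using Sprache;

public static class TryParseSketch
{
    static readonly Parser<string> Number =
        Parse.Digit.AtLeastOnce().Text().Token();

    // Returns a description of the outcome instead of throwing on bad input.
    public static string Describe(string input)
    {
        IResult<string> result = Number.TryParse(input);

        // Value is only read when the parse succeeded.
        return result.WasSuccessful
            ? "parsed " + result.Value
            : "failed: " + result.Message;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("123")); // parsed 123
        Console.WriteLine(Describe("abc")); // starts with "failed: ", followed by Sprache's message
    }
}
```

This shape can be handy in an interactive loop, where bad input is expected and not really exceptional.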

Let’s put everything together

static void Main(string[] args)
{
    Random rand = new Random();
    // Next(1, 100) excludes the upper bound, so the number is between 1 and 99.
    int numberToGuess = rand.Next(1, 100);
    bool finished = false;

    Console.WriteLine("******************************************************");
    Console.WriteLine("*                                                    *");
    Console.WriteLine("* Guess the number by asking questions               *");
    Console.WriteLine("* Use < X to ask if the number is less than X        *");
    Console.WriteLine("* Use > X to ask if the number is greater than X     *");
    Console.WriteLine("* Use X <> Y to ask if the number is between X and Y *");
    Console.WriteLine("* Use = X to guess the number                        *");
    Console.WriteLine("* Use q to quit                                      *");
    Console.WriteLine("*                                                    *");
    Console.WriteLine("******************************************************");

    while (!finished)
    {
        try
        {
            var input = Console.ReadLine();

            // ReadLine returns null at end of input; treat that like quitting.
            if (input == null || input.Trim() == "q")
                finished = true;
            else
            {
                Play play = GameParser.Play.Parse(input);
                bool result = play.Evaluate(numberToGuess);
                Console.WriteLine(result);

                if (play.Command == Command.Equal && result)
                {
                    Console.WriteLine("You guessed right.");
                    finished = true;
                }
            }
        }
        catch (ParseException ex)
        {
            Console.WriteLine("There was an error: {0}", ex.Message);
        }

        Console.WriteLine();
    }
}

The main function doesn’t contain anything shocking: we parse the input and evaluate the resulting play against the number to guess, then we check whether we guessed the correct number and, if so, we prepare to exit the loop. Notice that we provide a way to exit the game even without guessing the number correctly, but we check for ‘q’ before trying to parse, because it would be an illegal command for GameParser.
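One detail worth knowing about Parse: it succeeds as soon as the parser matches the beginning of the input, so trailing text after a valid play is silently ignored. As a sketch, assuming you want stricter input, Sprache's End() makes a parser fail unless the whole input has been consumed. EndSketch, StrictNumber and Accepts are hypothetical names, and the Number parser stands in for GameParser.Play to keep the example short.

```csharp
using System;
using Sprache;

public static class EndSketch
{
    public static readonly Parser<string> Number =
        Parse.Digit.AtLeastOnce().Text().Token();

    // End() makes the parser fail unless the whole input has been consumed.
    public static readonly Parser<string> StrictNumber = Number.End();

    public static bool Accepts(Parser<string> parser, string input)
    {
        return parser.TryParse(input).WasSuccessful;
    }

    public static void Main()
    {
        Console.WriteLine(Accepts(Number, "5 abc"));       // True: only the prefix "5 " is matched
        Console.WriteLine(Accepts(StrictNumber, "5 abc")); // False: "abc" is left over
        Console.WriteLine(Accepts(StrictNumber, "5 "));    // True: Token() consumed the trailing space
    }
}
```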

Conclusions

This blog talks a lot about Language Engineering, which is a fascinating topic, but one that is not always part of the everyday life of the average developer. Sprache, instead, is a tool that any developer could find a use for. Until now, when a regular expression wasn’t good enough, you probably just redesigned your application, making your life more complicated. Now you don’t need to: when you meet the mortal enemy of regular expressions, that is to say nested expressions, you can just use Sprache, right in your code.