Codemod: PEG.JS – part 1

One of the goals of the OpenTextSummarizer project was to add tests and make the code a bit more customizable, while learning a bit more about the library. I decided to write those tests the way some people, whose advice struck me as very smart, recommend.

After writing some test code, I started to realize that, while the tests were often simple, setting them up was taking a fair amount of time. So I started to look for ways to automate the creation of the test class.

Now there is no lack of unit test templating solutions; you could use ReSharper, the Visual Studio item templates, T4… But I wanted to use something a bit different. A while ago I encountered PEG.js, a JavaScript parser generator library that lets you define a custom language and process it in your browser. Mmmh, a browser-integrated parser, with all the rich possibilities of the JavaScript ecosystem at our fingertips? Let’s go!

So how shall we proceed? First, we need a language that can express what we need in our tests. This language must then be parsed into a structure from which our test class can be generated. So we need some kind of templating, but integrating the template into the parsing may not prove the best idea; I’d prefer not to mix a structure and its representation if possible. OK, let’s have the parser produce a JSON structure, which will then be fed to a template engine.

Let’s talk tests

I want to express my tests according to the structure Phil Haack recommended: a test class usually matches the class we want to test, and each method of that class is matched by a nested class, derived from the main test class, whose test methods are the behaviors we expect from the method:

public class Test_class_name
{
    public class_name Target {get;set;}

    [SetUp]
    public void before_each_test()
    {
        Target = new class_name();
    }

    // Now each method has its own class with the system under test available through the Target property
    public class method_name : Test_class_name
    {
        [Test]
        public void cannot_be_called_twice()
        {
            // create your test here
        }
    }
}

After some fiddling, I decided that my test grammar must let me express:

  • The class that I will test
  • Its methods
  • The behaviors on these methods
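To make the target concrete, here is a minimal sketch of the shape the parser output could take for these three items, along with a tiny hand-rolled renderer that turns it into the test class skeleton shown earlier. Everything here is an assumption made for illustration: the property names (`className`, `methods`, `behaviors`), the `Summarizer` example and the `renderTestClass` helper are hypothetical, and a real template engine would likely replace the string building.

```javascript
// Hypothetical parser output: one class, its methods, and the behaviors of each method.
const description = {
  className: "Summarizer",
  methods: [
    { name: "Summarize", behaviors: ["cannot be called twice", "returns empty for empty input"] }
  ]
};

// Turn "cannot be called twice" into an identifier: cannot_be_called_twice
function toIdentifier(text) {
  return text.trim().replace(/\s+/g, "_");
}

// Render the C# test class skeleton from the description object.
function renderTestClass(desc) {
  const lines = [];
  lines.push(`public class Test_${desc.className}`);
  lines.push("{");
  lines.push(`    public ${desc.className} Target { get; set; }`);
  lines.push("");
  lines.push("    [SetUp]");
  lines.push("    public void before_each_test()");
  lines.push("    {");
  lines.push(`        Target = new ${desc.className}();`);
  lines.push("    }");
  for (const method of desc.methods) {
    lines.push("");
    // One nested class per method, derived from the main test class.
    lines.push(`    public class ${method.name} : Test_${desc.className}`);
    lines.push("    {");
    method.behaviors.forEach((behavior, i) => {
      if (i > 0) lines.push("");
      lines.push("        [Test]");
      lines.push(`        public void ${toIdentifier(behavior)}()`);
      lines.push("        {");
      lines.push("            // create your test here");
      lines.push("        }");
    });
    lines.push("    }");
  }
  lines.push("}");
  return lines.join("\n");
}

console.log(renderTestClass(description));
```

Keeping the description as plain data means the same structure could later be rendered by a different template without touching the parser.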

So I started working on a PEG.js parser. A PEG (parsing expression grammar) is built by progressively refining the rules that compose the grammar. You can therefore say that a program is a collection of statements, that a statement is a string or an integer, that an integer is a sequence of digits, and that a string is a sequence of lowercase letters. Here is what this kind of grammar could look like:

start = program
program = statement+
statement = string "\n"? / integer "\n"?
string = [a-z]+
integer = [0-9]+

The grammar can then be completed with JavaScript functions that compute the output of the parser. It is possible to execute code when a rule is matched; the value this code returns becomes the result of the rule. With this in mind we can produce a JSON object based on our language.
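To illustrate the idea of rules that return values, here is a hand-written sketch that mirrors the small grammar above. It is not what PEG.js generates and it does not use the library, so it runs on its own. In PEG.js itself the same effect comes from attaching a `{ ... }` action to a rule. The `{ type, value }` object shape and the function names are assumptions chosen for the example.

```javascript
// Each function mirrors one rule of the grammar and returns
// { value, next } on success, or null when the rule does not match.

// string = [a-z]+
function parseString(input, pos) {
  const match = /^[a-z]+/.exec(input.slice(pos));
  if (!match) return null;
  // "Action": describe the match as a plain object.
  return { value: { type: "string", value: match[0] }, next: pos + match[0].length };
}

// integer = [0-9]+
function parseInteger(input, pos) {
  const match = /^[0-9]+/.exec(input.slice(pos));
  if (!match) return null;
  return { value: { type: "integer", value: parseInt(match[0], 10) }, next: pos + match[0].length };
}

// statement = string "\n"? / integer "\n"?
function parseStatement(input, pos) {
  // Ordered choice: try string first, then integer.
  const result = parseString(input, pos) || parseInteger(input, pos);
  if (!result) return null;
  // The trailing newline is optional.
  const next = input[result.next] === "\n" ? result.next + 1 : result.next;
  return { value: result.value, next };
}

// program = statement+
function parseProgram(input) {
  const statements = [];
  let pos = 0;
  let result;
  while ((result = parseStatement(input, pos)) !== null) {
    statements.push(result.value);
    pos = result.next;
  }
  // At least one statement is required, and the whole input must be consumed.
  if (statements.length === 0 || pos !== input.length) {
    throw new Error(`Parse error at position ${pos}`);
  }
  return statements;
}

console.log(JSON.stringify(parseProgram("hello\n42\nworld")));
// [{"type":"string","value":"hello"},{"type":"integer","value":42},{"type":"string","value":"world"}]
```

The result is an ordinary array of objects, so it can be serialized with `JSON.stringify` and passed to whatever comes after the parser.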

To wrap up this first post, you can have a look at the grammar I ended up creating for the test description language. Next time we will take a look at what the parser produces and how we can use it to finally build our test class.
