dimanche 31 juillet 2016

Dos and don'ts of testing a multi-step process

I have code which rewrites a context-free grammar into Chomsky Normal Form. I wish to test this, in particular with property testing. The code under test looks something like this:

chomsky_normal_form(cfg) do:
    intermediate_value_1 = step1(cfg)
    intermediate_value_2 = step2(intermediate_value_1)
    ...
    output step5(intermediate_value_4)

For each step there is a postcondition which should hold. There is also an invariant which should hold both before the first step and after each step. There is also a transition invariant which should hold for (cfg, iv1), for (iv1, iv2) etc. up to the last pair.

One approach to testing this is to write a method which asserts all of the postconditions and invariants.

One downside to this that I can spot is that I have to print out the failing values and the context of the test, instead of having the testing framework do some of that work for me.

It also appears to go against the spirit of property testing, but I can point to how exactly that would cause something undesirable to happen. Can you?

Another approach is to split this up into one test case (property) per aspect I want to test.

A downside is that there will be O(s^2) properties where s is the number of phases: one testing postcondition i after phase k for k=1..n and i=1..k. On top of this, n properties testing the transition invariants and n+1 testing the input-and-step invariant.

I can probably do this with code of length O(s), and without any duplication, but it seems very excessive even so.

To the extent I want to run many and/or large inputs through this gauntlet I would have to care about performance, and this looks like a lot of duplicated computations.

The question: what have you learned (from experience or otherwise) about the utility of either test approach? Is there some third approach I'm not seeing? Or is the answer just to suck it up and apply the elbow grease to test each step separately?

I think the best might be to have a framework which supports the first approach, in which I could accumulate a list of errors during the run, have the test fail if the list is non-empty, and have the framework present the list of errors in sensible ways; for property testing, it should also show contrasting pairs for the (non-)occurrence of each observed error. Does something like this exist? Am I right in wanting this, or would this be a bad idea for some reason I'm not seeing?

Practicalities: I'm writing this in Rust. Tool suggestions are welcome.

Irrelevant concrete details: the universal invariant I want to enforce is the welformedness of the representation of the context-free grammar (i.e. no dangling pointers). The post-conditions are that each step has done what it says on the tin, e.g. after the step which splits large right-hand-sides into smaller ones, all the right-hand sides are small. The step-transition invariant is that the language is preserved (which is undecidable, so I'll test strict preservation on finite languages and preservation of some sensitive features on more general language classes.)

Aucun commentaire:

Enregistrer un commentaire