Sunday, July 08, 2007


At its core, mxTextTools is a state machine (tagging engine) written in C. The reason for introducing it was that in order to write a parser you need a matcher, at both the tokenizer and the parser level. Writing matchers usually involves re, which is a pain in the butt. I like that it has a JIT compiler for the tagging commands. This seems to be a pretty powerful replacement for the usual lexers/parsers.
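To get a feel for what "a table of tagging commands driving a state machine" means, here is a toy interpreter in plain Python. It is only a sketch of the idea and not the mxTextTools API: the command names (ALL_IN, WORD), the three-element tuple layout and the run_table function are all made up for illustration, and the real engine is much richer and runs in C.

```python
# Toy tag-table interpreter. NOT the mxTextTools API: the command names
# (ALL_IN, WORD) and the tuple layout are hypothetical, for illustration only.
ALL_IN = "all_in"   # consume a run of one or more characters from a set
WORD = "word"       # consume one exact literal string


def run_table(text, table, pos=0):
    """Apply each (tag, command, argument) entry of the table in order.

    Returns (ok, tags, end). ok is False as soon as one entry fails to
    match; tags is a list of (tag, start, end) spans for tagged entries.
    """
    tags = []
    for tag, command, arg in table:
        start = pos
        if command == ALL_IN:
            while pos < len(text) and text[pos] in arg:
                pos += 1
            if pos == start:          # nothing consumed: entry failed
                return False, tags, start
        elif command == WORD:
            if not text.startswith(arg, pos):
                return False, tags, start
            pos += len(arg)
        else:
            raise ValueError("unknown command: %r" % (command,))
        if tag is not None:           # untagged entries match silently
            tags.append((tag, start, pos))
    return True, tags, pos


LETTERS = "abcdefghijklmnopqrstuvwxyz"
DIGITS = "0123456789"

# "name=value": a run of letters, a literal '=', then a run of digits.
ASSIGNMENT = [
    ("name", ALL_IN, LETTERS),
    (None, WORD, "="),
    ("value", ALL_IN, DIGITS),
]

ok, tags, end = run_table("width=42", ASSIGNMENT)
```

The table is just data, so the "program" for the matcher can be built, inspected and combined like any other list. The flip side is that it reads like assembly rather than like a grammar.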

Have a look at mxTextTools; it seems to be an interesting alternative to writing parsers using re/EBNF machines. That said, I'm not sure it's really going to be easier to edit this list of assembly-like commands than a few bits of REs and a nicely written-down EBNF.
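For comparison, this is roughly what "a few bits of REs" looks like for a tokenizer, using only the standard re module. The token names and the tiny name=value language are assumptions picked for the example, not anything taken from mxTextTools.

```python
import re

# Hypothetical token set for a tiny "name=value" language.
TOKEN_RE = re.compile(r"""
    (?P<name>[a-z]+)      # identifiers: lowercase letters
  | (?P<value>[0-9]+)     # values: decimal digits
  | (?P<equals>=)         # the assignment sign
""", re.VERBOSE)


def tokenize(text):
    """Split text into (token_type, lexeme) pairs, left to right."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at %d: %r" % (pos, text[pos]))
        # lastgroup is the name of the alternative that matched
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Calling tokenize("width=42") yields the three tokens name, equals and value in order. This only covers the tokenizer level; the parser level still needs something on top, which is where the EBNF would come in.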