Initially, tokentree was an experiment to come up with a data structure that could hold a full Haxe file and provide an easy way to search for specific tokens. It went well enough that we started building checkstyle checks with it and used it as a third engine for checks.
A little while after splitting tokentree off into its own repository, work on a Haxe formatter prototype started. Formatter version 1.0.0 was released on haxelib and with vshaxe about 7 weeks after that.
I think it’s very safe to say that without tokentree there would be no formatter. As far as I know there have been multiple attempts by different people at writing one, but none of them got far enough for production use. And I think the main reason they failed or were abandoned is that prior to tokentree there was no data structure that could hold a full Haxe file in memory, especially when that file also has conditional compilation sections or comments. And you will have a hard time formatting a file without being able to do it in memory, because you need a way to revert decisions about whitespace at different formatting stages, and that window closes once you start writing to the file.
Technically, tokentree takes the tokens produced by the lexer and wraps each `Token` instance in a `TokenTree` instance, adding links to its parent and siblings and a list of children. A parser-like collection of functions then organises the `TokenTree` instances into a tree structure, setting parents, siblings and children accordingly.
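The following is a condensed sketch of that wrapping. The type and field names are illustrative stand-ins; the real `TokenTree` class builds on the `Token` and `TokenDef` types from haxeparser and carries more information (such as each token's position in the file):

```haxe
// Stand-in for the lexer's token definitions, just to make the sketch compile.
enum TokenDef {
	Kwd(k:String);    // keywords: "var", "function", "if", ...
	Const(v:String);  // identifiers and literals
	DblDot;           // ":"
	Binop(op:String); // "=", "+", ...
	BrOpen;           // "{"
	BrClose;          // "}"
	Semicolon;        // ";"
}

class TokenTree {
	public var tok:TokenDef;           // the wrapped lexer token
	public var parent:Null<TokenTree>; // link to the enclosing token
	public var children:Array<TokenTree> = [];

	public function new(tok:TokenDef) {
		this.tok = tok;
	}

	public function addChild(child:TokenTree) {
		child.parent = this;
		children.push(child);
	}

	// siblings are reachable through the parent's child list
	public function nextSibling():Null<TokenTree> {
		if (parent == null) return null;
		var index = parent.children.indexOf(this);
		return (index + 1 < parent.children.length) ? parent.children[index + 1] : null;
	}
}
```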
As mentioned before, the order in which tokens appear in the tree is not necessarily identical to the order in which they appear in the source code. Tokentree tries to place tokens so that their position in the tree carries meaning for its main use cases, checkstyle and formatter.
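For example, a simple variable declaration could end up nested under its leading identifier rather than staying flat. Here is one plausible layout, written with the stand-in constructor names from the sketch above; the exact shape may differ between tokentree versions:

```
Source:       var x:Int = 5;
Token order:  var  x  :  Int  =  5  ;

Plausible tree:
Kwd("var")
  Const("x")
    DblDot
      Const("Int")
    Binop("=")
      Const("5")
    Semicolon
```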
Since every node in a tokentree is a `TokenTree` instance, traversing the tree is very easy and takes only a few lines of code. Furthermore, the parent and sibling links allow navigation from any node to any other node.
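A minimal depth-first walk over the sketch structure from above could look like this; `root` stands for the tree's root node, and the real library also ships filter helpers, so this is only meant to show how little code a traversal needs:

```haxe
// Visit every node in the tree, depth-first.
function walk(node:TokenTree, visit:TokenTree->Void) {
	visit(node);
	for (child in node.children) {
		walk(child, visit);
	}
}

// Example: print every keyword token in the file.
walk(root, function(t:TokenTree) {
	switch (t.tok) {
		case Kwd(k): trace(k);
		case _:
	}
});
```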
On the other hand, you are always looking at just one token at a time. If you need more context, if you want to know why a token is there or what it does, you have to look at the surrounding tokens and deduce its role and meaning from them, which can be challenging or ambiguous.
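A hypothetical illustration of that kind of deduction: classifying an opening brace by looking at its parent. The `describeBrace` helper and the keyword strings are made up for this sketch (using the stand-in enum from above), and real checkstyle or formatter code has to handle many more cases:

```haxe
// Guess what an opening brace means from the token above it.
function describeBrace(brace:TokenTree):String {
	if (brace.parent == null) return "top level";
	return switch (brace.parent.tok) {
		case Kwd("if") | Kwd("for") | Kwd("while") | Kwd("function"):
			"body of a control structure or function";
		case Kwd("class") | Kwd("interface") | Kwd("enum"):
			"type body";
		case Binop("="):
			"probably an object literal";
		case _:
			"block or object literal, needs more context";
	};
}
```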