The data structure for the proposed v8 CST is:
    type Node a
        = Node Range a

    type alias Location =
        { row : Int
        , column : Int
        }

    type alias Range =
        { start : Location
        , end : Location
        }
This is perfect for describing the source code location of parts of the syntax, for purposes such as reporting errors.
The one idea I had was to generalize this to:
    type Node r a
        = Node r a
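To make the idea concrete, here is a minimal sketch of what the generalized type and a few helpers could look like. The names `Located`, `annotation`, `value` and `mapAnnotation` are my own placeholders for illustration, not part of elm-syntax or the v8 proposal.

```elm
module NodeSketch exposing (..)

-- The generalized node: `r` is whatever annotation the producer attaches.


type Node r a
    = Node r a


type alias Location =
    { row : Int
    , column : Int
    }


type alias Range =
    { start : Location
    , end : Location
    }



-- Hypothetical alias: code that only deals with parsed source could keep
-- a one-parameter type by fixing `r` to `Range`.


type alias Located a =
    Node Range a


annotation : Node r a -> r
annotation (Node r _) =
    r


value : Node r a -> a
value (Node _ a) =
    a



-- Swap one kind of annotation for another, leaving the payload untouched.


mapAnnotation : (r -> s) -> Node r a -> Node s a
mapAnnotation f (Node r a) =
    Node (f r) a
```

With something like `mapAnnotation (always ())` you could drop location information from a single node. For a whole tree, the child nodes would need the same treatment, so the syntax types themselves would have to carry the `r` parameter too.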
Two reasons…
The first reason is that when doing codegen, you do not have a source file, so you do not know the locations to fill in. You have to use a dummy value, Range.empty, instead. In this situation it might be preferable to codegen with the type Node () a. Or maybe make up your own empty type to signify that location information has not been added yet, such as:
    type Generated
        = Generated -- placeholder type

    doCodeGen : Model -> Node Generated FunctionOrValue
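As a rough sketch of how that could be used, the helpers below wrap a value as generated and later attach a real range, for example after pretty printing and re-parsing. `generated` and `locate` are hypothetical names I made up for illustration, not existing API.

```elm
module GeneratedSketch exposing (..)


type Node r a
    = Node r a


type Generated
    = Generated


type alias Location =
    { row : Int
    , column : Int
    }


type alias Range =
    { start : Location
    , end : Location
    }



-- Wrap a freshly generated value; the type records that it has no location.


generated : a -> Node Generated a
generated a =
    Node Generated a



-- Attach a real range once one is known.


locate : Range -> Node Generated a -> Node Range a
locate range (Node Generated a) =
    Node range a
```

The benefit is that the compiler can tell a generated node apart from a located one, which a dummy range cannot do.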
The second reason is that you might like to add more information than the standard Range record provides, or different information altogether. For example, I do codegen from JSON files that describe AWS services, to create the Elm stubs for calling them. In this case, I might like to provide a path into the JSON structure as the location, so that any error reporting can be tied back to the model that the codegen was run on:
    type JsonPath = ...

    doCodeGen : Model -> Node JsonPath FunctionOrValue
I think this would move the library away from a bias towards assuming that the CST always originates from a source file (an assumption that supports source → source transformations well), towards something that is unbiased as to how the AST/CST is created.
Another example use case… Suppose you wanted to calculate the Kolmogorov complexity for every function in a file. This would be a recursive algorithm that works from the leaf nodes up to the functions, calculating intermediate values along the way, then combining them as the larger expressions and functions are calculated. To support this, use an r with space for the intermediate value, for example type alias LocationAndKolmogorov = { start : Location, end : Location, k : Int }. So this could also be useful for analytics.
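Here is a small sketch of that bottom-up pattern. To keep it self-contained it uses a toy `Expr` type rather than the real elm-syntax expression type, and a plain node count as a stand-in for the complexity measure. `WithScore`, `score` and `scoreOf` are all names invented for this example.

```elm
module ComplexitySketch exposing (..)


type Node r a
    = Node r a



-- Toy expression type for illustration only, NOT the elm-syntax one.


type Expr r
    = IntLiteral Int
    | Negate (Node r (Expr r))
    | Add (Node r (Expr r)) (Node r (Expr r))



-- Keep the original annotation and add room for the computed value.


type alias WithScore r =
    { original : r
    , k : Int
    }


scoreOf : Node (WithScore r) a -> Int
scoreOf (Node ann _) =
    ann.k



-- Work from the leaves up: score the children first, then combine.
-- The "score" here is just a node count, a placeholder for a real metric.


score : Node r (Expr r) -> Node (WithScore r) (Expr (WithScore r))
score (Node r expr) =
    case expr of
        IntLiteral n ->
            Node { original = r, k = 1 } (IntLiteral n)

        Negate inner ->
            let
                scoredInner =
                    score inner
            in
            Node { original = r, k = 1 + scoreOf scoredInner } (Negate scoredInner)

        Add left right ->
            let
                scoredLeft =
                    score left

                scoredRight =
                    score right
            in
            Node
                { original = r, k = 1 + scoreOf scoredLeft + scoreOf scoredRight }
                (Add scoredLeft scoredRight)
```

Every node in the result carries its own intermediate value, so nothing needs to be recomputed when looking at a sub-expression later.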
What do you think?