I have been using pwentz/elm-pretty-printer, but I have now run into a problem: it runs out of stack space when the ‘Doc’ I am printing gets too long. The source code I am printing is only about 100 lines, so that limit is reached very quickly.
It is based on the Wadler pretty printer, which is an improved design over the Hughes one. It is tricky to port to an eager functional language of the ML family, such as Elm, as it was originally written for Haskell, which is lazy. Some laziness needs to be introduced to keep it from using too much time/space. It also needs care to be tail recursive where possible.
Has anyone any experience with such pretty printers in Elm? Is there another library I can use that works better? Or can pwentz/elm-pretty-printer be repaired to a workable state?
I just wrote elm-pretty-printer a few months ago, so it is still a relatively green project and hasn’t yet been stress tested to handle inputs of that size as I wasn’t sure how people were going to use the library. Since you’re one of the first real users I’ve talked to about the pretty printer, I can definitely make laziness and tail recursion a top priority. It would be great if you had a reproducible example of how you’re using it so I can better identify where the bottlenecks are. Any help would be greatly appreciated. Thanks!
You should be able to run it with ‘elm-reactor’ and then navigate to example/Main.elm.
If you then uncomment line 42, to add in another method in the source model:
you will find that the stack overflows. I don’t think this is an accidental infinite loop, just too much recursion on the Doc model.
I am reading Wadler’s paper now to see if I can get my head around it - Wadler and Hughes are both far too smart for mere mortals like me to easily comprehend.
I think the problem could be here, where you call nicest with 2 docs to choose between:
Both docs are fully calculated and passed as args to nicest. Nicest should really only evaluate the second one if the first does not fit. So these may need to be passed as continuations to make the nicest function lazy.
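A minimal sketch of the idea, using plain strings instead of the library's own types. The name `choose` and its signature are hypothetical stand-ins for `nicest`, not the library's API; the point is only that the alternative arrives as a function of `()` and is forced solely when the first candidate does not fit.

```elm
module LazyChoice exposing (choose, example)

-- Hypothetical, simplified stand-in for `nicest`: layouts are plain strings
-- here, and "fits" just means the string is no wider than the page.


{-| The second candidate is a thunk, so it is only computed
when the first candidate does not fit in the given width.
-}
choose : Int -> String -> (() -> String) -> String
choose width first second =
    if String.length first <= width then
        first

    else
        second ()


{-| The long alternative is never built here, because "short" fits in 10 columns. -}
example : String
example =
    choose 10 "short" (\_ -> String.repeat 1000 "x")
```

The trade-off is that every call site has to wrap the alternative in a lambda, which is the usual price of explicit laziness in a strict language.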
There may be other places that some laziness needs to be explicitly introduced.
The other problem is the ‘recur’ helper function that you have used to avoid giving all the parameters to each recursive invocation of ‘best’ each time.
I have no idea if other functional compilers can do it, but the Elm compiler only turns a function that calls itself directly in tail position into a loop; it does not do this for 2 mutually recursive functions, so the recursion here always eats stack space. There is a way it can be done in Elm using trampolines:
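As a rough illustration of the general shape of the technique (not taken from the library or from any particular trampoline package), here is a self-contained sketch. The `Trampoline` type, `evaluate`, and the even/odd example are all made up for this purpose. Each mutually recursive step returns a description of the next step instead of calling it, and a single self-recursive driver runs the steps.

```elm
module TrampolineSketch exposing (isEven)

-- Hypothetical, self-contained trampoline; all names are illustrative.


type Trampoline a
    = Done a
    | Jump (() -> Trampoline a)


{-| The driver only calls itself, in tail position, so it can be
compiled to a loop and the stack does not grow with the number of jumps.
-}
evaluate : Trampoline a -> a
evaluate step =
    case step of
        Done value ->
            value

        Jump next ->
            evaluate (next ())


-- Two mutually recursive steps. Neither calls the other directly;
-- each returns a `Jump` that the driver runs later.
-- Both assume a non-negative argument.


evenStep : Int -> Trampoline Bool
evenStep n =
    if n == 0 then
        Done True

    else
        Jump (\_ -> oddStep (n - 1))


oddStep : Int -> Trampoline Bool
oddStep n =
    if n == 0 then
        Done False

    else
        Jump (\_ -> evenStep (n - 1))


isEven : Int -> Bool
isEven n =
    evaluate (evenStep n)
```

The cost is one extra allocation per step, which is why removing the mutual recursion altogether is preferable when that is possible.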
A simpler way to fix this is to inline the ‘recur’ function - this was actually the problem that was causing my code to overflow. Once this is done the ‘best’ function becomes nicely tail optimized - I checked the compiled output and it was turned into a ‘while’ loop.
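To make the difference concrete, here is a toy sketch with hypothetical function names; it is not the library's `best`. The first version recurses through a local `recur` helper, so the function and its helper call each other and the stack grows with the input. The second makes the same call directly in tail position, which the compiler can turn into a loop.

```elm
module InlineRecur exposing (countDirect, countViaHelper)

-- Toy stand-ins for the pattern described above; names are illustrative.


{-| Recursion goes through the local `recur` helper. The function and the
helper are mutually recursive, so this is not turned into a loop and
will use stack space proportional to `n`.
-}
countViaHelper : Int -> Int -> Int
countViaHelper n acc =
    let
        recur m =
            countViaHelper m (acc + 1)
    in
    if n <= 0 then
        acc

    else
        recur (n - 1)


{-| The helper is inlined: the function calls itself directly in tail
position, so it runs in constant stack space.
-}
countDirect : Int -> Int -> Int
countDirect n acc =
    if n <= 0 then
        acc

    else
        countDirect (n - 1) (acc + 1)
```

Both functions compute the same result for small inputs; only the direct version is safe for large ones.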
I will submit PRs for these 2 fixes, the lazy 2nd arg to ‘nicest’, and making ‘best’ tail recursive.
I am also tempted to publish a version of this pretty printer that excludes the ANSI console colors stuff. I don’t mean to criticize as I can see this code was ported from some Haskell code that already has this functionality in it - but it seems quite un-Elm-like for a package to target the ANSI console primarily, rather than HTML.
I am sure there is a way in which formatting/style can be introduced whilst keeping the pretty printer agnostic - style elements take up no character width on the console.
I definitely see what you’re saying here. Although the implementation of a function with a Doc -> Html a type signature comes with its own set of considerations and workarounds, I do agree that the Doc data structure needs to remain indifferent to whatever is being rendered. I will work on adding this to the next release (which will include the changes that you’ve submitted as well). I greatly appreciate your insight here.
With regards to your last comment, I’ve released some changes that you can check out here if you’d like. At this point, the Doc data structure and NormalForm are both uncoupled from the console, leaving the display function as an additional convenience.
This is a bit approximate, but how about something along these lines:
type Doc ctx
    = ...
    | Wrapper (WrapFns ctx) ctx (Doc ctx)

type alias WrapFns ctx =
    { wrap : ctx -> String -> ( String, ctx )
    , unwrap : ctx -> String
    }
The idea is that you can provide a set of ‘wrapper’ functions that modify the string output.
In the case of the ANSI console the ‘wrap’ function would place the ANSI control sequences to change color before the string to output. The ‘unwrap’ function would use the context to restore the old console colours outside of the wrapped section.
In the case of HTML the ‘wrap’ function would wrap the output in markup to change its color/style. The ‘unwrap’ function would not need to do anything, as the wrap function could take care of placing the output inside markup, unlike the console which needs more logic to restore the previous context; HTML is already nested, the console isn’t.
Then you would implement sets of wrap functions for color/style for the console or for HTML or whatever you want to output to. They would be completely separate from the pretty printer, as the WrapFns type completely describes the interface needed between pretty printing and style.
As I say, a bit approximate so I may not have quite got the types right. The wrap functions would also need to be passed down into the normalized representation too.
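One possible reading of this proposal, sketched in isolation from the pretty printer. Everything apart from the `WrapFns` shape is an assumption made for illustration: the `ansiColor`, `htmlColor` and `apply` names, the choice of an SGR code string as the console context, and `()` as the HTML context. HTML escaping of the text is deliberately left out.

```elm
module WrapSketch exposing (WrapFns, ansiColor, apply, esc, htmlColor)

-- Hypothetical sketch of the `WrapFns` idea; not part of any library.


type alias WrapFns ctx =
    { wrap : ctx -> String -> ( String, ctx )
    , unwrap : ctx -> String
    }


{-| The ASCII escape character that starts an ANSI control sequence. -}
esc : String
esc =
    String.fromChar (Char.fromCode 27)


{-| Console: the context is the SGR code of the enclosing color, so that
`unwrap` can restore it after the wrapped text.
-}
ansiColor : String -> WrapFns String
ansiColor code =
    { wrap = \outer text -> ( esc ++ "[" ++ code ++ "m" ++ text, outer )
    , unwrap = \outer -> esc ++ "[" ++ outer ++ "m"
    }


{-| HTML as a plain string: markup nests, so there is nothing to restore. -}
htmlColor : String -> WrapFns ()
htmlColor color =
    { wrap = \_ text -> ( "<span style=\"color:" ++ color ++ "\">" ++ text ++ "</span>", () )
    , unwrap = \_ -> ""
    }


{-| Apply a wrapper to some already rendered text. -}
apply : WrapFns ctx -> ctx -> String -> String
apply fns ctx text =
    let
        ( wrapped, restore ) =
            fns.wrap ctx text
    in
    wrapped ++ fns.unwrap restore
```

For example, `apply (ansiColor "31") "0" "error"` would emit the text in red and then reset to the default attributes, while `apply (htmlColor "red") () "error"` would produce a single `span`.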
I think I understand what you’re trying to say, but here are a few comments I have on your WrapFns suggestion.
How would you convert a string to styled Html when the wrap function returns a string? Are you planning on converting the Doc to a string and then converting that string to html?
The user is free to provide their own set of ‘wrapper’ functions where the NormalForm type is the input; this gives the user more control over how their Doc gets rendered without getting bogged down by the many data constructors on Doc. Actually, the NormalForm type exists solely for this purpose!
Don’t forget with the most recent release (v3.0.0), the Doc and NormalForm data types are decoupled from the ANSI terminal now, so if you’re looking for greater control when dealing with colors/styles/formatting (and to prevent the library from inserting ANSI color codes), then I’d encourage you to use the renderPretty function to convert your Doc to NormalForm, and then you can write your own NormalForm -> String function (try checking out the display function) and apply your own custom formatting function when pattern matching on the Formatted constructor:
Formatted formats sDoc ->
    List.map myCustomFormatting formats
        |> List.foldr (<|) (display sDoc)

myCustomFormatting : TextFormat -> (String -> String)
myCustomFormatting format =
    case format of
        WithBold -> ...
        WithUnderline -> ...
        WithColor docLayer color -> ...
        Default -> ...
I suppose I could write a function like display, but with a type signature of NormalForm -> (TextFormat -> (String -> String)) -> String so that users can insert their own formatting function for finer control on that aspect of the rendering process, but I’d still like to keep a simple Doc -> String function exposed for convenience’s sake. Would introducing this address the issues that you’re concerned about?
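A sketch of what such a function might look like. The `NormalForm` and `TextFormat` types below are drastically reduced stand-ins written for this example, not the library's actual definitions, and `displayWith` and `markdownish` are hypothetical names. The `Formatted` branch uses the same fold as the snippet above.

```elm
module DisplaySketch exposing (NormalForm(..), TextFormat(..), displayWith, example, markdownish)

-- Reduced stand-in types for illustration only; the real library's
-- `NormalForm` and `TextFormat` have more constructors.


type TextFormat
    = WithBold
    | WithUnderline
    | Default


type NormalForm
    = Blank
    | Text String NormalForm
    | Formatted (List TextFormat) NormalForm


{-| Like a `display` function, but the caller decides how each
`TextFormat` turns into a `String -> String` transformation.
-}
displayWith : NormalForm -> (TextFormat -> (String -> String)) -> String
displayWith doc format =
    case doc of
        Blank ->
            ""

        Text str rest ->
            str ++ displayWith rest format

        Formatted formats inner ->
            List.map format formats
                |> List.foldr (<|) (displayWith inner format)


{-| A made-up formatter that emits Markdown-like markers instead of ANSI codes. -}
markdownish : TextFormat -> (String -> String)
markdownish textFormat =
    case textFormat of
        WithBold ->
            \str -> "**" ++ str ++ "**"

        WithUnderline ->
            \str -> "_" ++ str ++ "_"

        Default ->
            identity


example : String
example =
    displayWith (Text "a " (Formatted [ WithBold ] (Text "b" Blank))) markdownish
```

A plain `Doc -> String` convenience function could then be defined by passing a default formatter, so both entry points share one implementation.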
How would you convert a string to styled Html when the wrap function returns a string? Are you planning on converting the Doc to a string and then converting that string to html?
I was thinking the String would contain HTML but just as a String, not elm-lang/html/Html. You would put this inside a <pre></pre> block. But yes, not the most elegant solution.