Having trouble with stack overflow - What pretty printers are available for Elm?

rupert · December 15, 2017, 9:27pm

I have been using pwentz/elm-pretty-printer but now run into a problem where it runs out of stack space when the ‘Doc’ that I am printing gets too long. The source code I am printing is only about 100 lines, so too long is a limit that is very quickly reached.

It is based on the Wadler pretty printer, which is an improved design over the Hughes one. It is tricky to port to an eager functional language of the ML family, such as Elm, as it is originally written for Haskell which is lazy. Some laziness needs to be introduced to keep it from using too much time/space. It also needs care to be tail recursive where possible.

Has anyone any experience with such pretty printers in Elm? Is there another library I can use that works better? Or can pwentz/elm-pretty-printer be repaired to a workable state?

pwentz · December 16, 2017, 10:28am

Hey there,

I just wrote elm-pretty-printer a few months ago, so it is still a relatively green project and hasn’t yet been stress tested to handle inputs of that size as I wasn’t sure how people were going to use the library. Since you’re one of the first real users I’ve talked to about the pretty printer, I can definitely make laziness and tail recursion a top priority. It would be great if you had a reproducible example of how you’re using it so I can better identify where the bottlenecks are. Any help would be greatly appreciated. Thanks!

rupert · December 16, 2017, 2:55pm

Thanks.

The code I am trying it with can be found here:

You should be able to run it with ‘elm-reactor’ then navigate to example/Main.elm

If you then uncomment line 42, to add in another method in the source model:

github.com

the-sett/elm-java-source/blob/master/example/Main.elm#L42


            , implements [ "Serializable", "Cloneable" ]
            , annotate
                [ annotation "Component" []
                , annotation "Entity" []
                , namedQueries
                ]
            ]
            [ innerClass
            , staticField
            , mainMethod
            , staticInitBlock
            , methodArgsWithAnnotation
            , consWithArg
            ]
        ]


namedQueries =
    annotation "NamedQueries"
        [ annotationList
            [ annotation "NamedQuery"

you will find that the stack overflows. I don’t think this is an accidental infinite loop, just too much recursion on the Doc model.

I am reading Wadler's paper now to see if I can get my head around it - Wadler and Hughes are both far too smart for mere mortals like me to easily comprehend.

rupert · December 16, 2017, 3:09pm

I think the problem could be here, where you call nicest with 2 docs to choose between:

github.com

pwentz/elm-pretty-printer/blob/2.0.0/src/Doc.elm#L1224




FlatAlt doc1 _ ->
    recur indent currCol (Cons n doc1 documents)


Cat doc1 doc2 ->
    recur indent currCol (Cons n doc1 (Cons n doc2 documents))


Nest num doc_ ->
    recur indent currCol (Cons (num + n) doc_ documents)


Union doc1 doc2 ->
    nicest
        indent
        currCol
        (recur indent currCol (Cons n doc1 documents))
        (recur indent currCol (Cons n doc2 documents))


Column fn ->
    recur indent currCol (Cons n (fn currCol) documents)


Columns fn ->

Both docs are fully calculated and passed as args to nicest. Nicest should really only evaluate the second one if the first does not fit. So these may need to be passed as continuations to make the nicest function lazy.
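A minimal sketch of the continuation idea, using simplified stand-in types rather than the library's real ones. Layouts are plain strings here, and the names `fits` and `nicestLazy` and the width check are assumptions made up for illustration. The point is only that the second candidate is passed as a `() -> String` function, so it is not built unless the first one fails to fit:

```elm
module LazyNicest exposing (nicestLazy)

-- Hypothetical stand-in for the real "does this layout fit" check:
-- a layout is just a String and it fits if it is no wider than `width`.


fits : Int -> String -> Bool
fits width layout =
    String.length layout <= width



-- The second candidate is a thunk, so it is only computed
-- when the first candidate does not fit.


nicestLazy : Int -> String -> (() -> String) -> String
nicestLazy width first second =
    if fits width first then
        first

    else
        second ()
```

A caller would write something like `nicestLazy 80 firstLayout (\() -> buildSecondLayout input)`, where `buildSecondLayout` stands for whatever expensive computation produces the alternative. In the `Union` branch above, the second `recur ...` call would be wrapped in a lambda in the same way.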

There may be other places that some laziness needs to be explicitly introduced.

rupert · December 18, 2017, 10:21am

The other problem is the ‘recur’ helper function that you have used to avoid giving all the parameters to each recursive invocation of ‘best’.

I have no idea if other functional compilers can do it, but Elm is not smart enough to make 2 mutually recursive functions tail recursive, so the recursion here always eats stack space. There is a way it can be done in Elm using trampolines:

http://package.elm-lang.org/packages/elm-lang/trampoline/latest

A simpler way to fix this is to inline the ‘recur’ function - this was actually the problem that was causing my code to overflow. Once this is done the ‘best’ function becomes nicely tail optimized - I checked the compiled output and it was turned into a ‘while’ loop.
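To illustrate the difference on something much smaller than ‘best’, here is a sketch with two made-up functions (`sumTo` and `sumToViaHelper` are not from the library). It assumes, as described above, that the Elm compiler only turns a function into a loop when it calls itself directly in tail position:

```elm
module TailRec exposing (sumTo, sumToViaHelper)

-- Direct self-call in tail position: this is the shape the compiler
-- can turn into a loop, so the stack does not grow with `n`.


sumTo : Int -> Int -> Int
sumTo acc n =
    if n <= 0 then
        acc

    else
        sumTo (acc + n) (n - 1)



-- Same logic, but the recursion goes through a local helper, in the
-- style of `recur`. The final call is to `recur`, not to the function
-- itself, so it is not a direct self-call and each step can use stack.


sumToViaHelper : Int -> Int -> Int
sumToViaHelper acc n =
    let
        recur =
            sumToViaHelper (acc + n)
    in
    if n <= 0 then
        acc

    else
        recur (n - 1)
```

Inlining the helper, so that every branch ends in a direct call with all arguments spelled out, turns the second shape back into the first.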

I will submit PRs for these 2 fixes, the lazy 2nd arg to ‘nicest’, and making ‘best’ tail recursive.

rupert · December 18, 2017, 10:40am

I am also tempted to publish a version of this pretty printer that excludes the ANSI console colors stuff. I don’t mean to criticize as I can see this code was ported from some Haskell code that already has this functionality in it - but it seems quite un-Elm-like for a package to target the ANSI console primarily, rather than HTML.

I am sure there is a way in which formatting/style can be introduced whilst keeping the pretty printer agnostic - style elements take up no character width on the console.

pwentz · December 18, 2017, 10:54pm

I definitely see what you’re saying here. Although the implementation of a function with a Doc -> Html a type signature comes with its own set of considerations and workarounds, I do agree that the Doc data structure needs to remain indifferent to whatever is being rendered. I will work on adding this to the next release (which will include the changes that you’ve submitted as well). I greatly appreciate your insight here.

pwentz · December 19, 2017, 1:51am

With regards to your last comment, I’ve released some changes that you can check out here if you’d like. At this point, the Doc data structure and NormalForm are both uncoupled from the console, leaving the display function as an additional convenience.

rupert · December 19, 2017, 10:07am

This is a bit approximate, but how about something along these lines:

type Doc ctx
    = ...
    | Wrapper (WrapFns ctx) ctx (Doc ctx)

type alias WrapFns ctx =
    { wrap : ctx -> String -> ( String, ctx )
    , unwrap : ctx -> String
    }

The idea is that you can provide a set of ‘wrapper’ functions that modify the string output.

In the case of the ANSI console the ‘wrap’ function would place the ANSI control sequences to change color before the string to output. The ‘unwrap’ function would use the context to restore the old console colours outside of the wrapped section.

In the case of HTML the ‘wrap’ function would wrap the output in markup to change its color/style. The ‘unwrap’ function would not need to do anything, as the wrap function could take care of placing the output inside markup, unlike the console which needs more logic to restore the previous context; HTML is already nested, the console isn’t.

Then you would implement sets of wrap functions for color/style for the console or for HTML or whatever you want to output to. They would be completely separate to the pretty printer as the WrapFns type completely describes the interface needed between pretty printing and style.

As I say, a bit approximate so I may not have quite got the types right. The wrap functions would also need to be passed down into the normalized representation too.
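To make the idea a little more concrete, here is a rough sketch of two sets of wrap functions, one for the ANSI console and one for HTML-as-a-String. Everything apart from the `WrapFns` record shape is an assumption for illustration: the names `ansiRed`, `htmlRed` and `render`, the choice of `ctx` (for ANSI, the escape sequence that restores the outer colour), and the Elm 0.19 `\u{001B}` escape syntax. The HTML version does no escaping of the text.

```elm
module WrapSketch exposing (WrapFns, ansiRed, htmlRed, render)


type alias WrapFns ctx =
    { wrap : ctx -> String -> ( String, ctx )
    , unwrap : ctx -> String
    }



-- ANSI console: `ctx` is the escape sequence for the colour in effect
-- outside the wrapped section, so `unwrap` can restore it afterwards.


ansiRed : WrapFns String
ansiRed =
    { wrap = \outer text -> ( "\u{001B}[31m" ++ text, outer )
    , unwrap = \outer -> outer
    }



-- HTML as a String: the markup is already nested, so `unwrap` has
-- nothing to restore and the context carries no information.


htmlRed : WrapFns ()
htmlRed =
    { wrap = \ctx text -> ( "<span style=\"color:red\">" ++ text ++ "</span>", ctx )
    , unwrap = \_ -> ""
    }



-- How a renderer might apply one wrapped section of output.


render : WrapFns ctx -> ctx -> String -> String
render fns ctx text =
    let
        ( wrapped, saved ) =
            fns.wrap ctx text
    in
    wrapped ++ fns.unwrap saved
```

For example, `render ansiRed "\u{001B}[0m" "error"` would emit the red sequence, the text, and then the reset sequence, while `render htmlRed () "error"` would emit the text inside a `span`.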

pwentz · December 20, 2017, 12:53am

I think I understand what you’re trying to say, but here are a few comments I have with your WrapFns suggestion.

How would you convert a string to styled Html when the wrap function returns a string? Are you planning on converting the Doc to a string and then converting that string to html?
The user is free to provide their own set of ‘wrapper’ functions where the NormalForm type is the input. This gives the user more control over how their Doc gets rendered without getting bogged down by the many data constructors on Doc. Actually, the NormalForm type exists solely for this purpose!

Don’t forget with the most recent release (v3.0.0), the Doc and NormalForm data types are decoupled from the ANSI terminal now, so if you’re looking for greater control when dealing with colors/styles/formatting (and to prevent the library from inserting ANSI color codes), then I’d encourage you to use the renderPretty function to convert your Doc to NormalForm, and then you can write your own NormalForm -> String function (try checking out the display function) and apply your own custom formatting function when pattern matching on the Formatted constructor:

 Formatted formats sDoc ->
     List.map myCustomFormatting formats
         |> List.foldr (<|) (display sDoc)


myCustomFormatting : TextFormat -> (String -> String)
myCustomFormatting format =
    case format of
        WithBold -> ...
        WithUnderline -> ...
        WithColor docLayer color -> ...
        Default -> ...

I suppose I could write a function like display, but with a type signature of NormalForm -> (TextFormat -> (String -> String)) -> String so that users can insert their own formatting function for finer control on that aspect of the rendering process, but I’d still like to keep a simple Doc -> String function exposed for convenience's sake. Would introducing this address the issues that you’re concerned about?
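A rough sketch of what such a function could look like. The `NormalForm` and `TextFormat` types below are cut-down stand-ins defined only for this example, not the library's real definitions, and the names `displayWith` and `htmlFormat` are made up. Only the overall signature and the `List.foldr (<|)` trick come from the snippet above.

```elm
module DisplayWith exposing (NormalForm(..), TextFormat(..), displayWith, htmlFormat)

-- Simplified stand-ins for the library's types (assumptions).


type TextFormat
    = WithBold
    | WithUnderline
    | Default


type NormalForm
    = Empty
    | Text String NormalForm
    | Formatted (List TextFormat) NormalForm



-- Like `display`, but the caller decides what each format means.


displayWith : NormalForm -> (TextFormat -> (String -> String)) -> String
displayWith normalForm formatter =
    case normalForm of
        Empty ->
            ""

        Text str rest ->
            str ++ displayWith rest formatter

        Formatted formats inner ->
            List.map formatter formats
                |> List.foldr (<|) (displayWith inner formatter)



-- One possible formatter: HTML tags in a plain String, no escaping.


htmlFormat : TextFormat -> (String -> String)
htmlFormat format =
    case format of
        WithBold ->
            \s -> "<b>" ++ s ++ "</b>"

        WithUnderline ->
            \s -> "<u>" ++ s ++ "</u>"

        Default ->
            identity
```

A call would then look like `displayWith (Formatted [ WithBold ] (Text "hi" Empty)) htmlFormat`, and a console user could pass a different formatter that inserts ANSI codes instead.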

rupert · December 20, 2017, 3:03pm

> How would you convert a string to styled Html when the wrap function returns a string? Are you planning on converting the Doc to a string and then converting that string to html?

I was thinking the String would contain HTML but just as a String, not elm-lang/html/Html. You would put this inside a <pre></pre> block. But yes, not the most elegant solution.
