Creating a custom Shrinker for a tree-like custom type

Enkidatron · February 9, 2021, 5:25pm

Hello everyone,

I am trying to create a custom Shrinker for a tree-like data type in order to use Fuzz.custom for my fuzz tests for this data type.

My current Fuzzers use Fuzz.oneOf and an Int to make sure that the tree will only go five layers deep. The tests are starting to really slow down a lot, particularly when a test fails, but even successful runs are taking much longer than I am used to.

Looking at the Shrink module, I believe that I understand what is going on, but I do not see how to actually create the Shrinker that I need. That is, a Shrinker is an alias for an a -> LazyList a function, but the LazyList type is an internal type that I cannot create directly. Thus, Shrinkers can only be created using the functions in the Shrink module. I do not see how to create a Shrinker for a custom type at all, without doing something extremely convoluted with the convert function.

I like the Tree example that is down in the “What are Shrinkers and why do we need them?” section of the module’s documentation. However, it does not explain how to actually write the Shrinker that is used in that example. Can anyone show me how to create a Shrinker for that Tree?

EDIT: I have an update about the test execution speed that caused me to want to do this, in case anyone is in a similar situation for a similar reason. When I added another branch to my convoluted tree-like data type, it crossed some threshold and could not complete the tests at all. They crashed after 20 minutes, due to running out of memory. Reducing the tree depth from 5 to 3 caused the tests to complete in ~15 seconds, which is good enough for me for now. While I would still like to know how to create a completely custom Shrinker, it is now a matter of getting better error messages when tests fail.

yosteden · February 10, 2021, 3:53am

This has to be a bug in the Elm compiler as well right? The Shrink module exports functions with a LazyList in their signature, but doesn’t actually export LazyList, so you can’t actually explicitly write the inferred type signature.

Enkidatron · February 10, 2021, 6:59pm

I actually do not think that this is a problem with the Elm compiler. I think that it is ok for a library to force us to use a public-facing type alias, because that feels conceptually similar to using an opaque type to me.
If I were trying to make a shrinker for a Point3d, I could do it with the functions that the Shrink module provides, and give it a Shrinker Point3d type signature even though the inferred signature is Point3d -> LazyList Point3d. This seems fine to me.
To approach it from another angle: if the Shrink library provided a function like fromEagerList : (a -> List a) -> Shrinker a, then I would be ok using that function to make my shrinker, even though the actual shrinker that I make has an inferred type signature of a -> LazyList a.

yosteden · February 10, 2021, 7:44pm

Well I was specifically thinking of Shrink.lazyList, which cannot be written only with the Shrinker type alias. LazyList must appear independently in the type. The best you can do is Shrinker a -> Shrinker (LazyList a).

More generally though isn’t this a perfect example of when you shouldn’t use a type alias and instead use an opaque type and not export the type constructor?

EDIT: “I think that it is ok for a library to force us to use a public-facing type alias, because that feels conceptually similar to using an opaque type to me.”

Ah I see, well I would just say that it’s not a good place to use a type alias in that scenario, but *shrugs*.

Enkidatron · February 10, 2021, 7:52pm

That’s a good point about Shrink.lazyList. I ignored that one as soon as I figured out it couldn’t help me make a tree shrinker. I definitely think that this module could have a better API, and that using a fully opaque type could improve the API.

I guess where I would need to think more deeply is whether the compiler can enforce ecosystem-wide rules that would make things better without disallowing useful things as well. Maybe it could!

However, even if this module were rewritten to only use a new opaque Shrinker a type and all of the functions used that in their signatures (and Shrink.lazyList went away), I would still not know how to make a Shrinker (Tree Int).

EDIT: To be clear, I think that the team behind this module also thinks that it could have a better API and has good plans to make things better. I think I saw that there is a plan to rename Shrinker to Simplifier and give it a better API in a future version. I think that this future API would use regular Lists instead of LazyLists, and thus would allow me to create my completely custom shrinkers. I am curious if there is a way to do this with the current API.

rupert · February 14, 2021, 11:04pm

It does seem overlooked doesn’t it? I accidentaly published a package that did not expose a type that it used in exposed functions. Fixed in latest version, but it was LetDeclaration in this:

https://package.elm-lang.org/packages/the-sett/elm-syntax-dsl/5.0.0/Elm-CodeGen

It feels as though the compiler could have checked that for me. In this case, LetDeclaration was actually an alias onto the one defined in stil4m/elm-syntax, which maybe reveals why the compiler does not/cannot check it easily - there is the possibility that the type is defined in another package. I suppose the compiler could walk the dependency tree and check if it is exposed there.

I also agree with you that the Shrink module is incomplete for handling custom cases, I ran into this issue once myself, but found some kind of compromise workaround for the particular problem I was interested in. I notice here, a comment that this is fixed on the master branch (July 2019).

github.com/elm-explorations/test

It should be easier to fuzz nested data structures.

opened 11:33PM - 11 Jul 19 UTC

zenhack

I'm in the midst of putting the finishing touches on a csv-encoding library befo…re releasing it, and as part of that want a fuzzer for this type: ```elm type alias Csv = { headers : List String , records : List (List String) } ``` The obvious way to do this would be something like: ```elm csvFuzzer = Fuzz.map2 Csv (Fuzz.list Fuzz.string) (Fuzz.list (Fuzz.list Fuzz.string)) ``` But this generates huge inputs, enough so that a simple test of "encode then decode gives the same result" takes >5 min on my laptop. So I went about trying to write a custom fuzzer that generates more reasonably sized inputs. I've got something written that shrinks `headers` using `Shrink.list Shrink.string` and something that shrinks `records` using `Shrink.list (Shrink.list Shrink.string)`, and I have no idea how to combine them. If the LazyList module were exposed I could just concatenate the lists, but I have no idea how to do this with the exposed API. There are a few possible takeaways from this experience: * There are some gaps in the API for combining shrinkers * You really shouldn't have to go to the trouble of writing custom generators and shrinkers just to get a smaller version of an existing fuzzer. Maybe something like #89 would help with this. * It's awkward that a shrinker returns a type that isn't actually exposed anywhere, and thus can't be worked with. What I would really have liked to write is something like this fuzzer: ```elm csvFuzzer = Fuzz.map2 Csv (Fuzz.list Fuzz.string) (Fuzz.list (Fuzz.smaller (Fuzz.list Fuzz.string)) ```

jfmengels · February 15, 2021, 9:49am

To avoid this problem in the future, you might want to use the NoMissingTypeExpose elm-review rule, which warns against this kind of problem.

Enkidatron · February 16, 2021, 5:11pm

Thank you for linking to that issue! It seems that the master branch does indeed expose a fromFunction function that would be what I need. Because the current release does not have equivalent of this function, I will have to wait for a new version of the library before I can make this fully custom fuzzer.

Regarding the unexposed type alias, I agree that this is usually a problem, and I think that it is neat that there is a tool like elm-review (which I just found out about) to automatically detect these problems. The reason that I don’t think that the compiler itself should be playing this role is that I don’t think the compiler can detect when you are deliberately using this pattern to make the type be opaque, such as the master branch of elm-test actually does:

{-| The simplifier type is opaque.
-}
type alias Simplifier a =
    Simplify.Internal.Simplifier a

As an aside, the same project that has this complicated tree-structure also uses Elm-CodeGen! Thank you for making and supporting it.

rupert · February 16, 2021, 5:16pm

I wonder if the compiler should allow that though? The problem is that if the type is not exposed at all, you cannot refer to it. Some type signatures become impossible to write:

import CodeGen

{-| Oh dear, I can't even write the type signature for this function!!! -}
createLetDecl : Blah -> Blah -> Blah -> CodeGen.LetDeclaration
createLetDecl _ _ _ = ...

But I can still write the function without its type signature.

Enkidatron · February 16, 2021, 5:54pm

I agree that the example you provide is indeed problematic. To fix it at the compiler level, though, would force the new elm-test, and any other library that deliberately uses a public-facing alias of an internal opaque type, to make the type opaque at the public-facing module level and adjust their functions to unwrap that opaque type. While this may only be a mild inconvenience, I don’t know how many packages this would apply to.

I think that it would also impact things like the Json package, where the Json.Decode module simply type-aliases the Json.Encode.Value opaque type in a way that I think we all find to be convenient and good.

What we are discussing is essentially making the compiler enforce the NoMissingTypeExpose rule for elm-review, so I will reference the “When to write or enable a rule” checklist from elm-review, specifically the “[ ] I have thought very hard about what the corner cases could be and what kind of patterns this would forbid that are actually okay, and they are acceptable.” and “[ ] I have communicated with my teammates and they all agree to enforce the rule.” items. Perhaps the inconvenience to some authors would be worth it!

Because there are some tradeoffs here, my personal predisposition it to leave this sort of things to tools like elm-review. Perhaps library authors should be encouraged to run a common “best-practice” set of rules before publishing, so they at least have to think about why they would want to disable a rule?

yosteden · February 18, 2021, 1:09am

But that’s a different case. Value is an opaque type, but is not itself a type alias, and type aliases to types which do not export their constructors is just fine. You can’t get yourself into a hole where the compiler can infer a type you cannot write.

system · February 28, 2021, 1:09am

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Elm-test-tables; a collection of useful elm-test extensions Show and Tell	1	821	June 4, 2018
Elm-minithesis: shrinking without compromises Show and Tell	27	2249	August 9, 2020
How to write a Fuzzer (List a) for a non-empty List Learn	21	1261	April 23, 2020
Create shrinker for fixed length string that does not shrink to an empty string Learn	4	800	December 23, 2019
String Fuzzer Library Request Feedback	6	671	December 1, 2018

Creating a custom Shrinker for a tree-like custom type

Related topics