Elm-minithesis: shrinking without compromises

Janiczek · July 23, 2020, 2:07pm

Hello! I’ve ported @'drmaciver’s Minithesis to Elm: elm-minithesis. (Phew, I’ve tried porting Hypothesis a few times over the years and never managed to do it, it’s very stateful.)

It’s a property based testing library similar to elm-explorations/test's Fuzz module, with the difference that it uses a shrinking method that has less disadvantages.

In short, it shrinks the underlying PRNG’s decisions instead of the generated values themselves, and that sidesteps the conventional shrinkers’ trouble with shrunk andThen-ed values no longer satisfying preconditions, since it instead generates the values from scratch each time.

So, elm-minithesis has Fuzz API much closer to the Random.Generator APIs; doesn’t expose any shrinking details to you the user, and lets you use andThen in your fuzzers.

I’m planning to add some adapters so that it can be used comfortably in elm-explorations/test, define some more fuzzers and then publish it.

See the test suite for usage examples.

Do you have any suggestions / questions / concerns? I’d like to get the 1.0.0 release as good as possible

gampleman · July 23, 2020, 2:40pm

Extremely cool. I’m looking forward to reading the source and getting some better understanding of the differences.

Are you planning on (eventually) contributing this to elm-exploration/test? I feel like having a standard testing package is pretty handy and this seems like an easier approach (for the user).

MartinS · July 23, 2020, 2:49pm

Looks cool! Is this approach also resistant against causing stack overflows?

Janiczek · July 23, 2020, 3:21pm

I still want to clean up and refactor the code after I get a bit of distance from it after the initial port At least some nicer separation into modules. But hopefully it should be easy to digest. Reading from the first commit forward helps too.

I would love to. Doing this as a separate library has had some differences in API as a result, since I can’t use JS kernel functions. So, for example, tests can’t be Test but must be Test a where a is the type of the value generated by the fuzzer. elm-explorations/test does some black magic with Elm.Kernel.Debug.toString so that it doesn’t have to

Also, so far I believe this approach to be superior but I (or somebody) will have to compare/benchmark them side by side a bit closer. Right now it’s all just big words without any numbers

I’ll create an issue in elm-explorations/test to start discussion about adopting this approach.

I believe andThen in principle can’t be totally safe against stack overflows; for example:

import Minithesis as M
import Minithesis.Fuzz as F

badFuzzer =
  F.unit
    |> F.andThen (\() -> badFuzzer)

M.run 0 <| M.test "" badFuzzer (\_ -> True)
--> RangeError: Maximum call stack size exceeded

But I believe I could include some trampolining / looping support like what elm/bytes or elm/parser have. (Maybe I’m conflating two unrelated ideas, didn’t look that much into it yet.)

MartinS · July 23, 2020, 3:26pm

Gotcha. I understand that recursively defined fuzzers can overflow if the user is careless. I was thinking more of elm-test’s tendency to cause stack overflows when met with even slightly complicated fuzzers (i.e. Fuzz.list (Fuzz.map2 Tuple.pair Fuzz.float Fuzz.float))

ianmackenzie · July 23, 2020, 3:28pm

Looks really cool! The only obvious feedback I have is that it would be great to flesh out the Fuzz module with support for more complex things like Floats and Chars and Strings.

I’d love to have a Float fuzzer with better shrinking behavior than elm-test - I think it currently just tries to make Float values as small as possible, which means I end up getting test failures in elm-geometry for things like “a point with coordinates (0.0000001, 0.00000003, -0.000000002)” which is pretty hard to read/reason about. It would be great to try to set things up so the shrinking tended to be more towards more “round” numbers with fewer decimal digits or something like that.

MartinS · July 23, 2020, 3:32pm

I was reading the paper on this and coincidentally came across this right after you mentioned how 0.000001 is less useful than 1.0

Janiczek · July 23, 2020, 3:36pm

Ah, I remember hitting that too myself.
The answer right now is “I don’t know”; I’ll have to test elm-minithesis more thoroughly and find its limits.

One thing I can say is, I deliberately followed case...of pattern instead of stuff |> Result.andThen |> Result.withDefault etc., so hopefully the library itself doesn’t make the situation much worse than the user-defined fuzzers already will. But, again, testing and measuring needed.

Oh yeah. Right now it’s basically just Ints, Bools, Lists I will add more fuzzers soon!

Hmm, interesting. Floats are tricky I don’t have any clear answers right now, I’ll probably ping you when I start playing with Float fuzzers and we can try and come up with something that would be useful to your failing tests?

(And yeah, that paper @MartinS linked is great!)

Janiczek · July 23, 2020, 3:46pm

Issue created!

harrysarson · July 23, 2020, 7:46pm

So, for example, tests can’t be Test but must be Test a where a is the type of the value generated by the fuzzer. elm-explorations/test does some black magic with Elm.Kernel.Debug.toString so that it doesn’t have to

I made a rough proposal a while ago that might be relevant here: Enable test runners to produce structured diff · Issue #147 · elm-explorations/test · GitHub. It would be be good to remove or atleast encapsulate that black magic a bit.

harrysarson · July 23, 2020, 8:27pm

Forgive me for asking a question before I have read about minithesis fully:

How does this shrinking method ensure that generating smaller random integers gives simpler test cases? Is it just by careful design of the Fuzz.xxxx functions?

Janiczek · July 23, 2020, 10:35pm

Yes, in general the shrinking is sensitive to how the Fuzzers are implemented. For example, two ways to write a List fuzzer would be:

1. Throw a coin to decide whether to generate another item

(Used in elm-minithesis right now)

list : Fuzzer a -> Fuzzer (List a)
list itemFuzzer =
  let
    go : List a -> Fuzzer (List a)
    go acc =
      Fuzz.weightedBool 0.9
        |> Fuzz.andThen (\bool ->
          if bool then
            itemFuzzer
              |> Fuzz.andThen (\item -> go (item :: acc))

          else
            Fuzz.constant (List.reverse acc)
        )
  in
  go []

2. Decide a list length beforehand

(didn’t try, and it has a different distribution and other problems, but is possible)

list : Fuzzer a -> Fuzzer (List a)
list itemFuzzer =
  let
    go : Int -> List a -> Fuzzer (List a)
    go leftToDo acc =
      if leftToDo <= 0 then
        Fuzz.constant (List.reverse acc)

      else
        itemFuzzer
          |> Fuzz.andThen (\item -> go (leftToDo - 1) (item :: acc))
  in
  Fuzz.int 0 20
    |> Fuzz.andThen (\length -> go length [])

The two approaches will have different representations for the same values:

Generated list value	PRNG for 1) Throw a coin	PRNG for 2) Decide length beforehand
`[]`	`[0]`	`[0]`
`[42]`	`[1, 42, 0]`	`[1, 42]`
`[42, 99]`	`[1, 42, 1, 99, 0]`	`[2, 42, 99]`
`[42, 99, 5]`	`[1, 42, 1, 99, 1, 5, 0]`	`[3, 42, 99, 5]`

And again, given shrinkers are generic and aren’t customized per fuzzer, some fuzzer strategies might shrink better than others.

As an anecdote, I’m currently reading how generating Floats works in Hypothesis. It’s intense. And cares more about how it shrinks than what the distribution is!

Janiczek · July 25, 2020, 12:11pm

Basic Float fuzzers are now live! By @ianmackenzie’s testing they shrink to nice readable values, so… I consider that a success

Janiczek · July 27, 2020, 8:17am

The main current blocker is:

porting the frequency fuzzer which in Hypothesis uses a fancy algorithm
fleshing out the floatWith fuzzer using that

After that, I believe we can start testing it, benchmarking and comparing it against elm-test - I’d love your help with that. I’ll write this up in more detail when the codebase is ready

gampleman · July 27, 2020, 11:27am

I wonder if the fancy algorithm is worth it, given that arrays are not exactly constant time in Elm.

Seems like some benchmarking in Python seems to indicate that it only becomes worthwhile with extremely large numbers of samples:

https://bugs.python.org/msg197540

mattpiz · July 29, 2020, 10:24am

Is it possible to try elm-minithesis for package tests before it gets published? I’d be interested in trying to fuzz sets of dependencies to see if some of my Debug.todo branches get triggered. How did you try it @ianmackenzie?

Janiczek · July 29, 2020, 10:28am

I feel like currently the easiest solution would be to clone elm-minithesis somewhere and add that path to your source-directories. I’m preparing a “call for testing and feedback” post where I’ll say how people can do this, but hopefully the above will unblock you?

From my conversations with Ian, I believe he has just written his test inside cloned elm-minithesis repo.

mattpiz · July 29, 2020, 11:18am

The issue is there is no source-directories for a package. But there are only 3 items in your src/ directory so I’ll try just symlink those

mattpiz · July 29, 2020, 11:55am

I can confirm that simlinks work. I’ve just written my first elm-minithesis test and got the that nicely shrunk failure.

✗ uncorrelatedDependencies

Given 0

    FailsWith [(("",Version { major = 0, minor = 0, patch = 0 }),("",Range []))]
    ╷
    │ Expect.equal
    ╵
    Passes

It took 1 full minute though on a 10th gen i7 CPU to fuzz these values. The fuzzer is built like this:

type alias Dependencies =
    List ( ( String, Version ), ( String, Range ) )


{-| Names of packages and their dependencies are not correlated,
so most probably, this will lead to an unsolvable set of dependencies.
-}
uncorrelatedDependencies : MF.Fuzzer Dependencies
uncorrelatedDependencies =
    MF.list (MF.map2 Tuple.pair packageVersion packageRange)


{-| Random package name and range
-}
packageRange : MF.Fuzzer ( String, Range )
packageRange =
    MF.map2 Tuple.pair name range


{-| Random package name and version
-}
packageVersion : MF.Fuzzer ( String, Version )
packageVersion =
    MF.map2 Tuple.pair name version


{-| String with 3 times less chance to be empty
-}
name : MF.Fuzzer String
name =
    MF.map3
        (\s1 s2 s3 ->
            if String.isEmpty s1 then
                if String.isEmpty s2 then
                    s3

                else
                    s2

            else
                s1
        )
        MF.string
        MF.string
        MF.string


range : MF.Fuzzer Range
range =
    MF.map2
        (\v1 v2 -> Range.between (Version.min v1 v2) (Version.max v1 v2))
        version
        version


version : MF.Fuzzer Version
version =
    MF.map3 Version.new_ positive positive positive


positive : MF.Fuzzer Int
positive =
    MF.int 0 100

Janiczek · July 29, 2020, 1:40pm

Great, thanks for sharing @mattpiz!

One thing that is probably not obvious enough / will need changing is that Minithesis.run by default tries to generate the value 1000 times or until it gets 100 generated values. Combine this with elm-test default --fuzz 100 and you potentially get up to 100k generation attempts for a single test. So that’s where your 1 minute run might come from. We might have to tweak the defaults a little if publishing functions for use with elm-test.

Looking at your fuzzers, they look fine to me - I’d perhaps use a different implementation for name, but it shouldn’t matter much (if shrinking is fine for that fuzzer).

I still have a few TODOs regarding what values do the fuzzers shrink towards: it seems negative floats are preferred, and when using anyNumericInt it leans towards the negative values also. This is all solvable, it’s just things I didn’t port from the Hypothesis repo yet

Topic		Replies	Views
Elm-minithesis: gathering feedback and benchmarks Request Feedback	7	967	August 13, 2020
Elm-explorations/test 2.0.0 released! Show and Tell	2	1027	October 22, 2022
Call for testing of elm-explorations/test 2.0.0 Request Feedback	6	1245	June 30, 2022
Elm-test-tables; a collection of useful elm-test extensions Show and Tell	1	786	June 4, 2018
Mutation testing in Elm Learn	2	461	September 25, 2022

Elm-minithesis: shrinking without compromises

1. Throw a coin to decide whether to generate another item

2. Decide a list length beforehand

Related Topics