Hello folks!
A week ago I posted about elm-minithesis.

I believe it’s roughly ready for publishing / for larger discussions about inclusion into elm-test, but I’d like to first:

- gather feedback on the API
- benchmark against elm-test
- compare behaviour against elm-test
For that, I need your help! Let’s talk about each point:
1. API feedback
Here are the preview docs (thanks @jfmengels!)
Could you please read them through and tell me your suggestions for wording and even API design? E.g. there are differences from elm-test’s Fuzz module, like:
```elm
-- elm-test:
intRange : Int -> Int -> Fuzzer Int
int : Fuzzer Int

-- elm-minithesis:
int : Int -> Int -> Fuzzer Int
anyNumericInt : Fuzzer Int
```
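To make the difference concrete, here’s a sketch of one property written in both styles. I’m guessing at the elm-minithesis entry point (`Minithesis.test`) and at `MF.list` based on the benchmark template in section 2; the preview docs are authoritative, so adjust the names accordingly:

```elm
module Example exposing (..)

import Expect
import Fuzz as F
import Minithesis
import Minithesis.Fuzz as MF
import Test exposing (Test, fuzz)


-- elm-test style: the test function returns an Expectation.
reverseKeepsLength : Test
reverseKeepsLength =
    fuzz (F.list F.int) "reverse keeps length" <|
        \xs ->
            List.length (List.reverse xs)
                |> Expect.equal (List.length xs)


-- elm-minithesis style (sketch): the test function returns a Bool,
-- as in the benchmark template in section 2. `Minithesis.test` and
-- `MF.list` are assumptions on my part.
reverseKeepsLengthMini =
    Minithesis.test "reverse keeps length"
        (MF.list MF.anyNumericInt)
        (\xs -> List.length (List.reverse xs) == List.length xs)
```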
Could you also try using the library, rewriting some of your existing fuzz tests in this style, and telling me what was surprising / unexpected / confusing? (Also perhaps contributing them to the benchmarks, see #2.) You’ll need to vendor the elm-minithesis source code / add it to your `source-directories`, since it’s not published yet.
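For the vendoring route, the relevant bit of your application’s `elm.json` might look like this (the clone location is just an example; point it at wherever you put the elm-minithesis checkout):

```json
"source-directories": [
    "src",
    "../elm-minithesis/src"
]
```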
2. Benchmark against elm-test
I’ve created an example benchmark and a template for writing your own:
```elm
main =
    ourBenchmark
        { name = "int 0 10000"
        , minithesisFuzzer = MF.int 0 10000
        , elmTestFuzzer = F.intRange 0 10000
        , minithesisFn = \i -> i < 5000
        , elmTestFn = \i -> Expect.lessThan 5000 i
        }
```
I’d like to benchmark various fuzzers and how they behave with various test functions. Perhaps we find out that `frequency` is much slower in elm-minithesis and needs to be optimized, or something similar.
Also, (real-world) combinations of fuzzers would be helpful.
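For instance, a `frequency` benchmark entry to be wired into `main` like the template above might look like this. I’m assuming `MF.frequency` and `MF.constant` mirror their elm-test counterparts; adjust to the real signatures:

```elm
frequencyBenchmark =
    ourBenchmark
        { name = "frequency int/constant"
        , minithesisFuzzer =
            -- assumed to mirror elm-test's Fuzz.frequency
            MF.frequency
                [ ( 3, MF.int 0 100 )
                , ( 1, MF.constant 0 )
                ]
        , elmTestFuzzer =
            F.frequency
                [ ( 3, F.intRange 0 100 )
                , ( 1, F.constant 0 )
                ]
        , minithesisFn = \i -> i <= 100
        , elmTestFn = \i -> Expect.atMost 100 i
        }
```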
3. Compare behaviour against elm-test
There are differences:
- elm-minithesis stops doing extra work after it finds a failing example and shrinks it fully; I suspect elm-test finishes all 100 runs even if it has already found and shrunk a counterexample. If true, this probably makes the benchmarks a little bit apples-to-oranges. The elm-minithesis behaviour makes sense to me, though.
- There might be differences in the distributions of floats, lists, etc.
- I hope some of your testing could uncover bugs / issues, like:
  - “I’d expect it to find a counterexample for XYZ but it never did!”
  - “With these `frequency` weights my elm-test test never triggered a stack overflow, but elm-minithesis does!”
  - Unexpected shrink “targets”: e.g. I know of a bug where `int -5000 5000` will shrink towards -5000 instead of 0. Are there more?
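If you want to poke at that last one, a minimal always-failing property should surface the shrink target directly (same caveat as before: `Minithesis.test` is my guess at the entry point, and how you run it depends on the runner):

```elm
-- Every generated value is a counterexample here, so whatever the
-- shrinker finally reports is its "target". I'd expect 0 for a range
-- straddling zero, but the bug mentioned above reports -5000.
shrinkTargetProbe =
    Minithesis.test "shrink target of int -5000 5000"
        (MF.int (-5000) 5000)
        (\_ -> False)
```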
I’ll be very glad for any feedback you give me on this.
I’m in touch with @drathier about possible integration of this into elm-test, and the benchmarks etc. are important steps before we can decide on that in any more detail.
Thanks and stay safe, folks!