Elm-minithesis: gathering feedback and benchmarks

Hello folks!

A week ago I posted about elm-minithesis.

I believe it’s roughly ready for publishing / for larger discussions about inclusion to elm-test, but I’d like to first

  1. gather feedback on the API
  2. benchmark against elm-test
  3. compare behaviour against against elm-test

:pray: For that, I need your help! :pray: Let’s talk about each point:

1. API feedback

Here are the preview docs (thanks @jfmengels!)

Could you please read it through and tell me your suggestions for wording and even API design? Eg. there are differences from elm-test Fuzz module like:

-- elm-test:
intRange : Int -> Int -> Fuzzer Int
int : Fuzzer Int

-- elm-minithesis:
int : Int -> Int -> Fuzzer Int
anyNumericInt : Fuzzer Int

Could you also try using the library - rewriting some of your existing fuzz tests in this style and telling me what was surprising / unexpected / confusing? (Also perhaps contributing them to the benchmarks, see #2) (You’ll need to vendor the elm-minithesis source code / add it to source-directories since it’s not published yet.)

2. Benchmark against elm-test

I’ve created an example benchmark and a template for writing your own:

main =
        { name = "int 0 10000"
        , minithesisFuzzer = MF.int 0 10000
        , elmTestFuzzer = F.intRange 0 10000
        , minithesisFn = \i -> i < 5000
        , elmTestFn = \i -> Expect.lessThan 5000 i

I’d like to benchmark various fuzzers and how they behave with various test functions. Perhaps we find out that frequency is much slower in elm-minithesis and needs to be optimized, or something similar.

Also, (real-world) combinations of fuzzers would be helpful.

3. Compare behaviour against elm-test

There are differences:

  • elm-minithesis stops doing extra work after it finds a failing example and shrinks it fully; I suspect elm-test finishes all 100 runs even if it already found and shrunk an conterexample. If true this probably makes the benchmarks a little bit apples-to-oranges. The elm-minithesis behavour makes sense to me though.
  • There might be differences in distributions of floats, lists, etc. - I hope some of your testing could uncover bugs / issues, like:
    • “I’d expect it to find a counterexample for XYZ but it never did!”
    • “With these frequency weights my elm-test test never triggered stack overflow, but elm-minithesis does!”
  • Unexpected shrink “targets”: eg. I know of a bug where int -5000 5000 will shrink towards -5000 instead of 0. Are there more?

I’ll be very glad for any feedback you give me on this.

I’m in touch with @drathier about possible integration of this into elm-test, and the benchmarks etc. are an important steps before we can decide that in any more detail.

Thanks and stay safe, folks!


Regarding API differences: I think if the idea is to get to eventually replace the fuzzers in elm-test, than having the same API would be pretty neat, as it could be a more or less drop in upgrade.

(Of course there will be some differences like custom going away and andThen coming back in, but I suspect that the vast majority of elm-test users don’t actually use custom).

Right, I agree. If this is to be integrated into elm-test then the published surface API should be drop-in (unless maintainers had some changes planned for a major version release which could then be snuck in as this would definitely be a major version).

I also feel like in case of publishing outside of elm-test I’d want to create a better API from ground up since I don’t have legacy users, but … that would probably hamper migration from elm-test fuzzers a bit. Will have to mull that over a bit :slight_smile:

Regarding API, I just found myself needing a function of type List (Fuzzer a) -> Fuzzer (List a) so I created like follows. Is this friendly with the shrinking mechanism of elm-minithesis?

fromList : List (MF.Fuzzer a) -> MF.Fuzzer (List a)
fromList list =
    List.foldr (MF.map2 (::)) (MF.constant []) list

I arrived in this situation because I want to generate correlated lists of dependencies, i.e. the package names in the dependencies are the same that those available:

type alias Dependencies =
    List ( ( String, Version ), List ( String, Range ) )

{-| Names of packages are the same that those of dependencies.
correlatedDependencies : MF.Fuzzer Dependencies
correlatedDependencies =
    MF.list name
        |> MF.andThen
            (\packages ->
                fromList <|
                        (\p ->
                                (MF.tuple (MF.constant p) version)
                                (Debug.todo "List of dependencies, use names from `packages`")

Is there a better way to do this that do not involve the fromList function?

By the way, without the docs.json file, the docs preview does not show the API doc

Thanks, I put it back in!

I was a bit worried about List.foldr but it worked out fine, the choices are in the right order:

> F.exampleWithRuns (fromList [F.constant 1, F.oneOfValues [42, 43], F.int 90 99])
  [ { example = [1,43,99], randomRun = [1,9] } -- `constant` generates no choice,
  , { example = [1,42,91], randomRun = [0,1] } -- randomRun[0] corresponds to the `oneOfValues`
  , { example = [1,42,93], randomRun = [0,3] } -- randomRun[1] corresponds to the `int`
  , { example = [1,43,92], randomRun = [1,2] }
  , ...

So you should be fine!

I think I could include this in the lib, as sequence or combine or some such.

I’ve created fuzz tests that heavily rely on andThen to generate sets package names and dependency names that are correlated. This enabled testing of elm-pubgrub and uncovered a bug triggered when the root package is not compatible with itself (root depends on not root). Now I’m a little more confident that the Debug.todo branches left in elm-pubgrub are unreachable, so thank you for elm-minithesis!

My tests are here if you are interested in having a look. Among the new functions I needed, some may be generic enough to be included in minithesis. There is the sequence one as you called it earlier.

sequence : List (Fuzzer a) -> Fuzzer (List a)
sequence =
    List.foldr (map2 (::)) (constant [])

Another one is some kind of a subset or subsetBy function, which picks up some elements from a list. (I’ve added below a possible implementation, only correct if no duplicates are wanted)

subset : List comparable -> Fuzzer (List comparable)
subset l =
    uniqueList (oneOfValues l)

subsetBy : (a -> comparable) -> List a -> Fuzzer (List a)
subsetBy f l =
    uniqueByList f (oneOfValues l)

Maybe subset could even take an unconstrained type parameter if there was some kind of a shuffle : List a -> Fuzzer (List a) fuzzer.

subset : List a -> Fuzzer (List a)
subset l =
    map2 List.take (int 0 (List.length l)) (shuffle l)

A wild guess at a shuffle implementation:

shuffle : List a -> Fuzzer (List a)
shuffle l =
    List.map (\a -> map2 tuple anyNumericInt (constant a)) l
        -- List (Fuzzer (Int, a))
        |> sequence
        -- Fuzzer (List (Int, a))
        |> map (List.sortBy Tuple.first)
        -- Fuzzer (List (Int, a))
        |> map (List.map Tuple.second)
1 Like

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.