Hello! I’ve ported @'drmaciver’s Minithesis to Elm: elm-minithesis. (Phew, I’ve tried porting Hypothesis a few times over the years and never managed to do it, it’s very stateful.)
It’s a property based testing library similar to elm-explorations/test
's Fuzz module, with the difference that it uses a shrinking method that has less disadvantages.
In short, it shrinks the underlying PRNG’s decisions instead of the generated values themselves, and that sidesteps the conventional shrinkers’ trouble with shrunk andThen
-ed values no longer satisfying preconditions, since it instead generates the values from scratch each time.
So, elm-minithesis
has Fuzz
API much closer to the Random.Generator
APIs; doesn’t expose any shrinking details to you the user, and lets you use andThen
in your fuzzers.
I’m planning to add some adapters so that it can be used comfortably in elm-explorations/test
, define some more fuzzers and then publish it.
See the test suite for usage examples.
Do you have any suggestions / questions / concerns? I’d like to get the 1.0.0 release as good as possible
25 Likes
Extremely cool. I’m looking forward to reading the source and getting some better understanding of the differences.
Are you planning on (eventually) contributing this to elm-exploration/test? I feel like having a standard testing package is pretty handy and this seems like an easier approach (for the user).
1 Like
Looks cool! Is this approach also resistant against causing stack overflows?
I still want to clean up and refactor the code after I get a bit of distance from it after the initial port At least some nicer separation into modules. But hopefully it should be easy to digest. Reading from the first commit forward helps too.
I would love to. Doing this as a separate library has had some differences in API as a result, since I can’t use JS kernel functions. So, for example, tests can’t be Test
but must be Test a
where a
is the type of the value generated by the fuzzer. elm-explorations/test
does some black magic with Elm.Kernel.Debug.toString
so that it doesn’t have to
Also, so far I believe this approach to be superior but I (or somebody) will have to compare/benchmark them side by side a bit closer. Right now it’s all just big words without any numbers
I’ll create an issue in elm-explorations/test
to start discussion about adopting this approach.
I believe andThen
in principle can’t be totally safe against stack overflows; for example:
import Minithesis as M
import Minithesis.Fuzz as F
badFuzzer =
F.unit
|> F.andThen (\() -> badFuzzer)
M.run 0 <| M.test "" badFuzzer (\_ -> True)
--> RangeError: Maximum call stack size exceeded
But I believe I could include some trampolining / looping support like what elm/bytes
or elm/parser
have. (Maybe I’m conflating two unrelated ideas, didn’t look that much into it yet.)
1 Like
Gotcha. I understand that recursively defined fuzzers can overflow if the user is careless. I was thinking more of elm-test’s tendency to cause stack overflows when met with even slightly complicated fuzzers (i.e. Fuzz.list (Fuzz.map2 Tuple.pair Fuzz.float Fuzz.float)
)
Looks really cool! The only obvious feedback I have is that it would be great to flesh out the Fuzz
module with support for more complex things like Float
s and Char
s and String
s.
I’d love to have a Float
fuzzer with better shrinking behavior than elm-test
- I think it currently just tries to make Float
values as small as possible, which means I end up getting test failures in elm-geometry
for things like “a point with coordinates (0.0000001, 0.00000003, -0.000000002)” which is pretty hard to read/reason about. It would be great to try to set things up so the shrinking tended to be more towards more “round” numbers with fewer decimal digits or something like that.
1 Like
I was reading the paper on this and coincidentally came across this right after you mentioned how 0.000001 is less useful than 1.0
3 Likes
Ah, I remember hitting that too myself.
The answer right now is “I don’t know”; I’ll have to test elm-minithesis
more thoroughly and find its limits.
One thing I can say is, I deliberately followed case...of
pattern instead of stuff |> Result.andThen |> Result.withDefault
etc., so hopefully the library itself doesn’t make the situation much worse than the user-defined fuzzers already will. But, again, testing and measuring needed.
Oh yeah. Right now it’s basically just Ints, Bools, Lists I will add more fuzzers soon!
Hmm, interesting. Floats are tricky I don’t have any clear answers right now, I’ll probably ping you when I start playing with Float fuzzers and we can try and come up with something that would be useful to your failing tests?
(And yeah, that paper @MartinS linked is great!)
1 Like
So, for example, tests can’t be Test
but must be Test a
where a
is the type of the value generated by the fuzzer. elm-explorations/test
does some black magic with Elm.Kernel.Debug.toString
so that it doesn’t have to
I made a rough proposal a while ago that might be relevant here: Enable test runners to produce structured diff · Issue #147 · elm-explorations/test · GitHub. It would be be good to remove or atleast encapsulate that black magic a bit.
1 Like
Forgive me for asking a question before I have read about minithesis fully:
How does this shrinking method ensure that generating smaller random integers gives simpler test cases? Is it just by careful design of the Fuzz.xxxx
functions?
Yes, in general the shrinking is sensitive to how the Fuzzers are implemented. For example, two ways to write a List fuzzer would be:
1. Throw a coin to decide whether to generate another item
(Used in elm-minithesis right now)
list : Fuzzer a -> Fuzzer (List a)
list itemFuzzer =
let
go : List a -> Fuzzer (List a)
go acc =
Fuzz.weightedBool 0.9
|> Fuzz.andThen (\bool ->
if bool then
itemFuzzer
|> Fuzz.andThen (\item -> go (item :: acc))
else
Fuzz.constant (List.reverse acc)
)
in
go []
2. Decide a list length beforehand
(didn’t try, and it has a different distribution and other problems, but is possible)
list : Fuzzer a -> Fuzzer (List a)
list itemFuzzer =
let
go : Int -> List a -> Fuzzer (List a)
go leftToDo acc =
if leftToDo <= 0 then
Fuzz.constant (List.reverse acc)
else
itemFuzzer
|> Fuzz.andThen (\item -> go (leftToDo - 1) (item :: acc))
in
Fuzz.int 0 20
|> Fuzz.andThen (\length -> go length [])
The two approaches will have different representations for the same values:
Generated list value |
PRNG for 1) Throw a coin |
PRNG for 2) Decide length beforehand |
[] |
[0] |
[0] |
[42] |
[1, 42, 0] |
[1, 42] |
[42, 99] |
[1, 42, 1, 99, 0] |
[2, 42, 99] |
[42, 99, 5] |
[1, 42, 1, 99, 1, 5, 0] |
[3, 42, 99, 5] |
And again, given shrinkers are generic and aren’t customized per fuzzer, some fuzzer strategies might shrink better than others.
As an anecdote, I’m currently reading how generating Floats works in Hypothesis. It’s intense. And cares more about how it shrinks than what the distribution is!
2 Likes
Basic Float fuzzers are now live! By @ianmackenzie’s testing they shrink to nice readable values, so… I consider that a success
7 Likes
The main current blocker is:
- porting the
frequency
fuzzer which in Hypothesis uses a fancy algorithm
- fleshing out the
floatWith
fuzzer using that
After that, I believe we can start testing it, benchmarking and comparing it against elm-test
- I’d love your help with that. I’ll write this up in more detail when the codebase is ready
I wonder if the fancy algorithm is worth it, given that arrays are not exactly constant time in Elm.
Seems like some benchmarking in Python seems to indicate that it only becomes worthwhile with extremely large numbers of samples:
https://bugs.python.org/msg197540
1 Like
Is it possible to try elm-minithesis for package tests before it gets published? I’d be interested in trying to fuzz sets of dependencies to see if some of my Debug.todo
branches get triggered. How did you try it @ianmackenzie?
I feel like currently the easiest solution would be to clone elm-minithesis
somewhere and add that path to your source-directories
. I’m preparing a “call for testing and feedback” post where I’ll say how people can do this, but hopefully the above will unblock you?
From my conversations with Ian, I believe he has just written his test inside cloned elm-minithesis
repo.
The issue is there is no source-directories
for a package. But there are only 3 items in your src/
directory so I’ll try just symlink those
1 Like
I can confirm that simlinks work. I’ve just written my first elm-minithesis test and got the that nicely shrunk failure.
✗ uncorrelatedDependencies
Given 0
FailsWith [(("",Version { major = 0, minor = 0, patch = 0 }),("",Range []))]
╷
│ Expect.equal
╵
Passes
It took 1 full minute though on a 10th gen i7 CPU to fuzz these values. The fuzzer is built like this:
type alias Dependencies =
List ( ( String, Version ), ( String, Range ) )
{-| Names of packages and their dependencies are not correlated,
so most probably, this will lead to an unsolvable set of dependencies.
-}
uncorrelatedDependencies : MF.Fuzzer Dependencies
uncorrelatedDependencies =
MF.list (MF.map2 Tuple.pair packageVersion packageRange)
{-| Random package name and range
-}
packageRange : MF.Fuzzer ( String, Range )
packageRange =
MF.map2 Tuple.pair name range
{-| Random package name and version
-}
packageVersion : MF.Fuzzer ( String, Version )
packageVersion =
MF.map2 Tuple.pair name version
{-| String with 3 times less chance to be empty
-}
name : MF.Fuzzer String
name =
MF.map3
(\s1 s2 s3 ->
if String.isEmpty s1 then
if String.isEmpty s2 then
s3
else
s2
else
s1
)
MF.string
MF.string
MF.string
range : MF.Fuzzer Range
range =
MF.map2
(\v1 v2 -> Range.between (Version.min v1 v2) (Version.max v1 v2))
version
version
version : MF.Fuzzer Version
version =
MF.map3 Version.new_ positive positive positive
positive : MF.Fuzzer Int
positive =
MF.int 0 100
1 Like
Great, thanks for sharing @mattpiz!
One thing that is probably not obvious enough / will need changing is that Minithesis.run
by default tries to generate the value 1000 times or until it gets 100 generated values. Combine this with elm-test
default --fuzz 100
and you potentially get up to 100k generation attempts for a single test. So that’s where your 1 minute run might come from. We might have to tweak the defaults a little if publishing functions for use with elm-test
.
Looking at your fuzzers, they look fine to me - I’d perhaps use a different implementation for name
, but it shouldn’t matter much (if shrinking is fine for that fuzzer).
I still have a few TODOs regarding what values do the fuzzers shrink towards: it seems negative floats are preferred, and when using anyNumericInt
it leans towards the negative values also. This is all solvable, it’s just things I didn’t port from the Hypothesis repo yet