Issue created!
So, for example, tests can’t be
Test
but must beTest a
wherea
is the type of the value generated by the fuzzer.elm-explorations/test
does some black magic withElm.Kernel.Debug.toString
so that it doesn’t have to
I made a rough proposal a while ago that might be relevant here: Enable test runners to produce structured diff · Issue #147 · elm-explorations/test · GitHub. It would be be good to remove or atleast encapsulate that black magic a bit.
Forgive me for asking a question before I have read about minithesis fully:
How does this shrinking method ensure that generating smaller random integers gives simpler test cases? Is it just by careful design of the Fuzz.xxxx
functions?
Yes, in general the shrinking is sensitive to how the Fuzzers are implemented. For example, two ways to write a List fuzzer would be:
1. Throw a coin to decide whether to generate another item
(Used in elm-minithesis right now)
list : Fuzzer a -> Fuzzer (List a)
list itemFuzzer =
let
go : List a -> Fuzzer (List a)
go acc =
Fuzz.weightedBool 0.9
|> Fuzz.andThen (\bool ->
if bool then
itemFuzzer
|> Fuzz.andThen (\item -> go (item :: acc))
else
Fuzz.constant (List.reverse acc)
)
in
go []
2. Decide a list length beforehand
(didn’t try, and it has a different distribution and other problems, but is possible)
list : Fuzzer a -> Fuzzer (List a)
list itemFuzzer =
let
go : Int -> List a -> Fuzzer (List a)
go leftToDo acc =
if leftToDo <= 0 then
Fuzz.constant (List.reverse acc)
else
itemFuzzer
|> Fuzz.andThen (\item -> go (leftToDo - 1) (item :: acc))
in
Fuzz.int 0 20
|> Fuzz.andThen (\length -> go length [])
The two approaches will have different representations for the same values:
Generated list value | PRNG for 1) Throw a coin | PRNG for 2) Decide length beforehand |
---|---|---|
[] |
[0] |
[0] |
[42] |
[1, 42, 0] |
[1, 42] |
[42, 99] |
[1, 42, 1, 99, 0] |
[2, 42, 99] |
[42, 99, 5] |
[1, 42, 1, 99, 1, 5, 0] |
[3, 42, 99, 5] |
And again, given shrinkers are generic and aren’t customized per fuzzer, some fuzzer strategies might shrink better than others.
As an anecdote, I’m currently reading how generating Floats works in Hypothesis. It’s intense. And cares more about how it shrinks than what the distribution is!
Basic Float fuzzers are now live! By @ianmackenzie’s testing they shrink to nice readable values, so… I consider that a success
The main current blocker is:
- porting the
frequency
fuzzer which in Hypothesis uses a fancy algorithm - fleshing out the
floatWith
fuzzer using that
After that, I believe we can start testing it, benchmarking and comparing it against elm-test
- I’d love your help with that. I’ll write this up in more detail when the codebase is ready
I wonder if the fancy algorithm is worth it, given that arrays are not exactly constant time in Elm.
Seems like some benchmarking in Python seems to indicate that it only becomes worthwhile with extremely large numbers of samples:
Is it possible to try elm-minithesis for package tests before it gets published? I’d be interested in trying to fuzz sets of dependencies to see if some of my Debug.todo
branches get triggered. How did you try it @ianmackenzie?
I feel like currently the easiest solution would be to clone elm-minithesis
somewhere and add that path to your source-directories
. I’m preparing a “call for testing and feedback” post where I’ll say how people can do this, but hopefully the above will unblock you?
From my conversations with Ian, I believe he has just written his test inside cloned elm-minithesis
repo.
The issue is there is no source-directories
for a package. But there are only 3 items in your src/
directory so I’ll try just symlink those
I can confirm that simlinks work. I’ve just written my first elm-minithesis test and got the that nicely shrunk failure.
✗ uncorrelatedDependencies
Given 0
FailsWith [(("",Version { major = 0, minor = 0, patch = 0 }),("",Range []))]
╷
│ Expect.equal
╵
Passes
It took 1 full minute though on a 10th gen i7 CPU to fuzz these values. The fuzzer is built like this:
type alias Dependencies =
List ( ( String, Version ), ( String, Range ) )
{-| Names of packages and their dependencies are not correlated,
so most probably, this will lead to an unsolvable set of dependencies.
-}
uncorrelatedDependencies : MF.Fuzzer Dependencies
uncorrelatedDependencies =
MF.list (MF.map2 Tuple.pair packageVersion packageRange)
{-| Random package name and range
-}
packageRange : MF.Fuzzer ( String, Range )
packageRange =
MF.map2 Tuple.pair name range
{-| Random package name and version
-}
packageVersion : MF.Fuzzer ( String, Version )
packageVersion =
MF.map2 Tuple.pair name version
{-| String with 3 times less chance to be empty
-}
name : MF.Fuzzer String
name =
MF.map3
(\s1 s2 s3 ->
if String.isEmpty s1 then
if String.isEmpty s2 then
s3
else
s2
else
s1
)
MF.string
MF.string
MF.string
range : MF.Fuzzer Range
range =
MF.map2
(\v1 v2 -> Range.between (Version.min v1 v2) (Version.max v1 v2))
version
version
version : MF.Fuzzer Version
version =
MF.map3 Version.new_ positive positive positive
positive : MF.Fuzzer Int
positive =
MF.int 0 100
Great, thanks for sharing @mattpiz!
One thing that is probably not obvious enough / will need changing is that Minithesis.run
by default tries to generate the value 1000 times or until it gets 100 generated values. Combine this with elm-test
default --fuzz 100
and you potentially get up to 100k generation attempts for a single test. So that’s where your 1 minute run might come from. We might have to tweak the defaults a little if publishing functions for use with elm-test
.
Looking at your fuzzers, they look fine to me - I’d perhaps use a different implementation for name
, but it shouldn’t matter much (if shrinking is fine for that fuzzer).
I still have a few TODOs regarding what values do the fuzzers shrink towards: it seems negative floats are preferred, and when using anyNumericInt
it leans towards the negative values also. This is all solvable, it’s just things I didn’t port from the Hypothesis repo yet
Ah that’s good to know and probably the reason indeed. I’ll wait for your “call for testing and feedback” before continuing but I was curious enough that I wanted to try now
@mattpiz Just FYI: I’ve started benchmarking elm-minithesis and one of the first benchmarks were various implementations of your “less empty string” fuzzer. See the code here: https://github.com/Janiczek/elm-minithesis/blob/master/benchmarks/fuzzer-implementations/less-empty-string/src/Main.elm
I had not looked for the stringWith
function, that’s really cool! One remark regarding its interface, an average length does not say much without knowing the associated distribution (or simply standard deviation if Gaussian distribution is assumed)
Thanks @mattpiz. Yeah it internally converts to a continue generating probability
according to geometric distribution. Perhaps I should do
stringWith
{ minLength : Maybe Int
, maxLength : Maybe Int
, continueProbability : Maybe Float
}
instead?
Or do you think a doc comment explaining the distribution would be enough?
Another benchmark done: this time comparing minithesis vs elm-test:
Fuzzer:
-- elm-minithesis:
Minithesis.Fuzz.int 0 10000
-- elm-test:
Fuzz.intRange 0 10000
Test function:
-- elm-minithesis:
(\i -> i < 5000)
-- elm-test:
(\i -> Expect.lessThan 5000 i)
And here are the results:
Looks like:
- minithesis is taking ~3x longer when generating the value (we have additional overhead because we have to remember the PRNG choices; elm-test can just use the
Random.Generator
as is) - elm-test is taking ~10x longer when shrinking the values
All in all, that’s encouraging
Benchmark code: https://github.com/Janiczek/elm-minithesis/blob/master/benchmarks/vs-elm-test/int/src/Main.elm
This distribution where the mean value is related to the probability of success with mean = (1-p) / p
right? I don’t know what would be better, but simply looking at it now gives a better idea of what kind of values may appear. Contrary to what most people might think, the average value is not where the probability density function is max (which is always 0). And increasing the average value tends to a uniform distribution. That may be counter-intuitive and probably best to illustrate in the doc.
For anybody who’s interested: I posted a call for testing and feedback: Elm-minithesis: gathering feedback and benchmarks
This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.