How to write fuzz tests for a function which generates a dictionary?

hanifhefaz · December 26, 2020, 2:55pm

Hello Everyone,

I have a function called wordsDict, which generates a dictionary, or a list of dictionaries, from some strings, counting each of the words. I have written some tests and they work fine.

The function is:

wordsDict : List String -> List (Dict String Int)
wordsDict =
    List.map (tokenize >> toHistogram)

For example, this test:

test "test making dictionary from the data with two sentences." <|
            \() ->
                let
                    dataText =
                        [ "test", "testing" ]
                in
                Expect.equal [ Dict.fromList [ ( "test", 1 ) ], Dict.fromList [ ( "testing", 1 ) ] ]
                    (dataText |> Main.wordsDict)

This test runs and passes as expected.

Now, if I understand it correctly, fuzz testing runs the test 100 times, each time with different random input.

How can I implement a fuzz test for this function? Will fuzz testing change these two strings?
[ "test", "testing" ]

Thank You.

mgold · December 29, 2020, 5:44am

Let’s take a step back. Why do you want to fuzz test this function? Unit testing is one of those things that you can find a lot of opinions about on the internet. But as long as you are satisfied that your code is correct, then you are testing well. So maybe start with, what’s the bug surface, what bugs do you want to show are not present in your code?

I’m worried about many repeated words. Perhaps something like this: (I have not compiled these tests; some small tweaks may be necessary)

fuzz (Fuzz.intRange 0 1000) "counts repeated words" <| \i ->
  List.repeat i "barnacles" |> Main.wordsDict |> Expect.equal [Dict.fromList [("barnacles", i)]]

This should give you confidence that you can count large numbers of repetitions, BUT only when there are no other words present and the word is "barnacles". You could also map over List.range 0 10 and make unit tests for those cases, instead of the fuzz test.

I’m worried about certain strings being treated specially. I’m usually not, if I can look at the code and see that it’s not looking at lengths, prefixes, a word list, etc., and doing nefarious things in those cases. You could write a fuzz test that takes in random strings, or a random string to use instead of "barnacles".

But if I’m making a data structure that treats strings generically then… you can use the type system. Why not make the signature wordsDict : List comparable -> List (Dict comparable Int)? By making the function more generic, the compiler will reject any attempts to, say, not count the empty string. (Unless that’s desired behavior, in which case, write a unit test.)
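To illustrate the point about generic signatures, here is a minimal sketch of a counting function written against comparable rather than String. The module name Histogram and the function name histogram are hypothetical and not taken from the original code; the real tokenize and toHistogram may look different.

```elm
module Histogram exposing (histogram)

import Dict exposing (Dict)


-- Hypothetical helper: counts how often each item occurs.
-- Because the item type is only known to be `comparable`, the body
-- cannot inspect the item (no length checks, no special-cased strings).
histogram : List comparable -> Dict comparable Int
histogram items =
    List.foldl
        (\item counts ->
            -- Start at 0 for an unseen item, then add one.
            Dict.update item (\count -> Just (Maybe.withDefault 0 count + 1)) counts
        )
        Dict.empty
        items
```

With this signature the compiler, rather than a test, rules out behavior that depends on what a particular string looks like.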

I’m worried about counting up all the words. One of the great things about fuzz tests is that you don’t have to compute the output value of the function under test, only some invariant that will be true about it. Let’s test that the length of the input list will equal the sum of the values of the dictionary.

-- use a small word list to ensure duplicates
myWords = ["barnacles", "turtles", "whales", "sharks"]

-- effective Elm programming involves combining simple functions together
listFrom : List a -> Fuzz.Fuzzer (List a)
listFrom words = words |> List.map Fuzz.constant |> Fuzz.oneOf |> Fuzz.list

fuzz (listFrom myWords) "length of input = sum of values of output" <| \input ->
  input |> Main.wordsDict |> Dict.values |> List.sum |> Expect.equal (List.length input)

You can write a similar test to check that all values are positive.
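A sketch of what that similar test might look like. It assumes, like the test above, that Main.wordsDict returns a single Dict String Int for the whole input list; the module name and the test name are made up for illustration.

```elm
module WordsDictTests exposing (allCountsPositive)

import Dict
import Expect
import Fuzz
import Main
import Test exposing (Test, fuzz)


-- Small word list so that duplicates are likely.
myWords : List String
myWords =
    [ "barnacles", "turtles", "whales", "sharks" ]


-- Fuzzer producing lists whose elements are drawn from the given words.
listFrom : List a -> Fuzz.Fuzzer (List a)
listFrom words =
    words |> List.map Fuzz.constant |> Fuzz.oneOf |> Fuzz.list


-- Assumption: Main.wordsDict : List String -> Dict String Int
allCountsPositive : Test
allCountsPositive =
    fuzz (listFrom myWords) "every count in the output is positive" <|
        \input ->
            input
                |> Main.wordsDict
                |> Dict.values
                |> List.all (\count -> count > 0)
                |> Expect.equal True
```

An empty input list passes trivially, since List.all returns True for an empty list.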

hanifhefaz · December 29, 2020, 4:22pm

Thank you for the reply! It is really gold!

I wanted to test the function with different inputs. For example, I now have a unit test where the input is in lowercase letters, and another unit test where the input mixes lowercase and capital letters. I wanted to cover every possible kind of input in my tests, including alphanumeric strings.

OK, I came up with this, which is giving the expected result.
fuzz (Fuzz.intRange 1 10) "counts repeated words" <|
    \i ->
        let
            searchString =
                "what"
        in
        Expect.equal [ ( "what", i ) ]
            (List.repeat i searchString |> Main.wordsDict |> Dict.toList)

That was the first problem, which made me ask this question. How would you do that? By converting the Fuzz.intRange? All I wanted was to write a simple fuzz test which takes random strings of any length, including lowercase letters, capitals, symbols and numbers.

That is also very helpful. I came up with this solution:
fuzz (listFrom myWords) "search string length is equal to the sum of output dictionary values" <|
    \input ->
        Expect.equal (List.length input)
            (input |> Main.wordsDict |> Dict.values |> List.sum)

Thank you again for the detailed answer.

mgold · December 29, 2020, 5:04pm

With Fuzz.intRange 1 10, this will run 100 tests but there are only 10 possible inputs, so it’s inefficient. It’s not terrible, but you might as well use a higher upper bound.

Do you know about fuzz2?

fuzz2 (Fuzz.intRange 1 10) Fuzz.string "counts repeated words" <| \i searchString ->
  Expect.equal [ ( searchString, i ) ] (List.repeat i searchString |> Main.wordsDict |> Dict.toList)

Now, what if you want to test two different words? You could use fuzz4 to pass in two words and two lengths… but the better solution is to make a custom fuzzer.

listOfOneWord : Fuzz.Fuzzer (List String)
listOfOneWord = Fuzz.map2 List.repeat (Fuzz.intRange 1 2000) Fuzz.string

Now you can implement

fuzz (Fuzz.list listOfOneWord) "concatenation of word lists yields the same dict as combining the dicts of each list" <| \listOfLists -> ....

Given a list of lists of words, I can either call wordsDict on each list and then combine the resulting dictionaries, or I can concatenate the lists and then call wordsDict once, and that should give me the same dictionary. Abstractly, combine (map f lists) == f (concat lists). That would be a good invariant to test.

You will need to handle the case where some of the lists contain the same word: use Dict.update to add the counts, instead of Dict.insert which will replace them.
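One possible shape for that test, as a sketch only. The helper mergeCounts, the module name and the test name are hypothetical, and Main.wordsDict is assumed to return a single Dict String Int as in the earlier tests.

```elm
module MergeInvariant exposing (concatInvariant)

import Dict exposing (Dict)
import Expect
import Fuzz
import Main
import Test exposing (Test, fuzz)


listOfOneWord : Fuzz.Fuzzer (List String)
listOfOneWord =
    Fuzz.map2 List.repeat (Fuzz.intRange 1 2000) Fuzz.string


-- Hypothetical helper: adds the counts of `left` into `right`.
-- Dict.update sums counts for shared words; Dict.insert would overwrite them.
mergeCounts : Dict comparable Int -> Dict comparable Int -> Dict comparable Int
mergeCounts left right =
    Dict.foldl
        (\word count acc ->
            Dict.update word (\existing -> Just (Maybe.withDefault 0 existing + count)) acc
        )
        right
        left


-- Assumption: Main.wordsDict : List String -> Dict String Int
concatInvariant : Test
concatInvariant =
    fuzz (Fuzz.list listOfOneWord) "merging per-list dicts equals the dict of the concatenated list" <|
        \listOfLists ->
            listOfLists
                |> List.map Main.wordsDict
                |> List.foldl mergeCounts Dict.empty
                |> Expect.equal (Main.wordsDict (List.concat listOfLists))
```

Folding from Dict.empty means an empty list of lists yields an empty dictionary on both sides.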

hanifhefaz · December 29, 2020, 6:10pm

I have not used it before, but it looks easy: it takes an Int range and a random string. Anyway, I used it like this:

fuzz2 (Fuzz.intRange 1 100) Fuzz.string "counts repeated words." <|
            \i searchString ->
                Expect.equal [ ( searchString, i ) ]
                    (List.repeat i searchString |> Word2DictMatcher.wordsDict |> Dict.toList)

In my case it gave me this error, which is probably something to do with the syntax:

Given (1,"\t")


[("",1)]
╷
│ Expect.equal
╵
[("\t",1)]

Yes, I wanted to ask about this. The previous fuzz test only tests a single word. But fine, the custom fuzzer solution looks good. Let me play with that.

mgold · December 29, 2020, 11:09pm

That looks like a test failure. Is your implementation doing something weird with spaces and tabs? It may be helpful to extract the fuzz test to a unit test so you can get a better grip on this particular error.
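For example, the failing input reported above, (1,"\t"), could be pinned down in a unit test along these lines. This is a sketch: the module name TabRegression and the test name are made up, and Word2DictMatcher.wordsDict is assumed to have the same shape as in the failing fuzz test.

```elm
module TabRegression exposing (tabRegression)

import Dict
import Expect
import Test exposing (Test, test)
import Word2DictMatcher


-- Reproduces the reported failure for i = 1 and searchString = "\t".
tabRegression : Test
tabRegression =
    test "a lone tab character, repeated once" <|
        \() ->
            List.repeat 1 "\t"
                |> Word2DictMatcher.wordsDict
                |> Dict.toList
                |> Expect.equal [ ( "\t", 1 ) ]
```

Whether a tab should count as a word at all is a design decision; if it should not, the expectation here (and in the fuzz test) needs to change rather than the implementation.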

system · January 8, 2021, 11:09pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.
