Improving data structure diff in elm-test

michaeljones · August 2, 2019, 6:20pm

I might be wrong, but it feels like the error elm-test displays when an Expect.equal fails is a string diff of the results of running Debug.toString on the two values being compared.

This is more than adequate for simple values, but for nested mixes of records & lists it feels less helpful and is sometimes quite a strain to read in the terminal output.

I was wondering if there is a chance of using a similar approach to the Debug.toString but instead converting to a generic data representation, perhaps something like this:

type DataType
    = IntData Int
    | FloatData Float
    | StringData String
    | TupleData (List DataType)
    | ListData (List DataType)
    | RecordData (List ( String, DataType ))
    | ...

And then diffing the two results in Elm with a clearer display closer to the diff one might get from modern JS testing frameworks.
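
To make the idea concrete, here is a minimal sketch of a structural diff over that representation. It assumes only the constructors listed above (the elided ones are left out), and the names Change, diff, diffAt, diffItemsFrom and diffFields are hypothetical; none of this is part of elm-test. List and tuple items are compared position by position, which is the simplest choice and not necessarily the nicest one for insertions in the middle of a list.

```elm
module StructuredDiff exposing (Change(..), DataType(..), diff)

import Dict


type DataType
    = IntData Int
    | FloatData Float
    | StringData String
    | TupleData (List DataType)
    | ListData (List DataType)
    | RecordData (List ( String, DataType ))


{-| One difference, located by the field names and indices leading to it.
-}
type Change
    = Changed (List String) DataType DataType
    | Missing (List String) DataType -- only in the expected value
    | Extra (List String) DataType -- only in the actual value


diff : DataType -> DataType -> List Change
diff expected actual =
    diffAt [] expected actual


{-| `path` is kept in reverse (innermost step first) and flipped when reported.
-}
diffAt : List String -> DataType -> DataType -> List Change
diffAt path expected actual =
    case ( expected, actual ) of
        ( ListData xs, ListData ys ) ->
            diffItemsFrom 0 path xs ys

        ( TupleData xs, TupleData ys ) ->
            diffItemsFrom 0 path xs ys

        ( RecordData xs, RecordData ys ) ->
            diffFields path xs ys

        _ ->
            if expected == actual then
                []

            else
                [ Changed (List.reverse path) expected actual ]


{-| Compare items position by position, reporting leftovers on either side.
-}
diffItemsFrom : Int -> List String -> List DataType -> List DataType -> List Change
diffItemsFrom index path expected actual =
    let
        here =
            String.fromInt index :: path
    in
    case ( expected, actual ) of
        ( [], [] ) ->
            []

        ( x :: xs, [] ) ->
            Missing (List.reverse here) x :: diffItemsFrom (index + 1) path xs []

        ( [], y :: ys ) ->
            Extra (List.reverse here) y :: diffItemsFrom (index + 1) path [] ys

        ( x :: xs, y :: ys ) ->
            diffAt here x y ++ diffItemsFrom (index + 1) path xs ys


{-| Compare record fields by name. Fields present only in `actual` are
ignored here; with Expect.equal both sides share a type, so that cannot occur.
-}
diffFields : List String -> List ( String, DataType ) -> List ( String, DataType ) -> List Change
diffFields path expected actual =
    let
        actualByName =
            Dict.fromList actual
    in
    List.concatMap
        (\( name, expectedValue ) ->
            case Dict.get name actualByName of
                Just actualValue ->
                    diffAt (name :: path) expectedValue actualValue

                Nothing ->
                    [ Missing (List.reverse (name :: path)) expectedValue ]
        )
        expected
```

The result is a list of located changes rather than highlighted characters, so a reporter could print only the fields that differ, each with its path.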

It assumes that all data is ultimately translatable to such a structure and I don’t know if that is true. You might notice I haven’t included a representation for custom types though I feel it could be possible.
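
One possible shape for custom types, as a rough sketch: store the constructor name alongside its arguments. The constructor CustomTypeData and the helper fromMaybe below are assumptions for illustration, and the DataType here is cut down to what the example needs.

```elm
module CustomTypeSketch exposing (DataType(..), fromMaybe)


type DataType
    = IntData Int
    | StringData String
    | CustomTypeData String (List DataType)


{-| Represent a Maybe as a constructor name plus its arguments.
The first argument converts the wrapped value.
-}
fromMaybe : (a -> DataType) -> Maybe a -> DataType
fromMaybe toData maybe =
    case maybe of
        Just value ->
            CustomTypeData "Just" [ toData value ]

        Nothing ->
            CustomTypeData "Nothing" []
```

With this, the string "Nothing" and the value Nothing become StringData "Nothing" and CustomTypeData "Nothing" [], so a diff can tell them apart without parsing text.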

I tried to experiment with the concept a little, but it involves Kernel code. The changes I’ve attempted in my local clone of elm-explorations/test result in a Corrupt Dependency error when running the ./tests/test.sh script, and I’ve no idea how to progress.

Do you think this idea has merit? I suspect that if it did it would have been tried already but I’m curious all the same.

harrysarson · August 6, 2019, 8:31pm

I would love to see diffs that do not use Debug.toString. Firstly, they will look nicer and the code would be less wacky. Secondly, it would be nice (in the long term) to be able to run tests on optimised compiles, which is only possible if the dependency on Debug.toString is dropped.
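
For illustration only, one way to get a structured value without Debug.toString (and without kernel code) is to write the conversion by hand for each type. This is not what the original post proposes, which is an automatic conversion; the User alias and userToData below are hypothetical, and the DataType is a cut-down copy of the one from the first post.

```elm
module UserData exposing (DataType(..), User, userToData)


type DataType
    = IntData Int
    | StringData String
    | ListData (List DataType)
    | RecordData (List ( String, DataType ))


type alias User =
    { name : String
    , age : Int
    , tags : List String
    }


{-| Hand-written conversion: no Debug functions involved, so it would
also compile with --optimize, at the cost of boilerplate per type.
-}
userToData : User -> DataType
userToData user =
    RecordData
        [ ( "name", StringData user.name )
        , ( "age", IntData user.age )
        , ( "tags", ListData (List.map StringData user.tags) )
        ]
```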

The code that (currently) does the diffing is in the test runner (https://github.com/rtfeldman/node-test-runner) rather than in elm-explorations/test. (I think?) It would need to be moved into the Elm package if it were to use kernel code.

brian · August 6, 2019, 10:16pm

Can you share examples of the kinds of thing you’re looking to improve? I’ve had issues with the diff highlighting too, but I wonder if we’re jumping to a solution too early here? What would you want diffs to look like under the new scheme?

michaeljones · August 7, 2019, 7:46am

Thank you both for the responses. I’ll work up some examples to better illustrate it and, yeah, good point, there might be better solutions if one is needed.

michaeljones · August 10, 2019, 1:47pm

A kind commenter on Slack noted that I was really asking for structured diffs over textual diffs. I guess structured diffing is a specific concept and approach that makes sense in these situations.

I’ve done a small amount of research, comparing against Jest in the JS world, which seems to do more structured diffs. Here is a simple Jest example:

And here is a similar arrangement for Elm:

Here the experiences are similar though not identical, which is fine.

I’ve then attempted to create a slightly more ‘worst case’ scenario for the text diffing approach by having names & values which include Just & Nothing. It is hard to do a direct comparison due to the nature of the two languages’ data representations, but here is a Jest diff:

And here is an Elm diff:

I think, particularly in the first half of the Elm diff, we can see that the approach fails to convey the same amount of information as the Jest version.

For reference the code for these examples is here: https://github.com/michaeljones/diff-comparison

I feel like I have experienced quite unpleasant diffs in the wild on production code where the result feels like a random patchwork of highlighted characters. I will attempt to share such an example the next time I experience it.

mgold · August 13, 2019, 3:19am

I’m totally open to accepting PRs that improve elm-test’s diffs, but I’m unable to work on this myself.

I’ve filed a companion issue on GitHub.

harrysarson · August 13, 2019, 6:45am

The difference display was even worse before. I did some work to remove highlighting for very different strings in https://github.com/rtfeldman/node-test-runner/pull/336 - I thought it might be of interest.

michaeljones · August 13, 2019, 7:23am

Interesting to know. Thank you for sharing.

I’ve been messing around with the initial idea above of doing a diff once the data structures have been translated into some kind of generic representation in Elm. I don’t know if it is possible to pursue this line but if anyone is interested my efforts are here:

https://github.com/michaeljones/diff-comparison/blob/master/elm/src/DataType.elm

module DataType exposing (..)

import Html exposing (Html, div, pre, text)


type DataType a
    = IntData Int
    | FloatData Float
    | StringData String
    | TupleData (List (DataType a))
    | ListData (List (DataType a))
    | RecordData (List ( String, DataType a ))
    | CustomTypeData String (List (DataType a))


dataTypeToString dataType =
    case dataType of
        IntData value ->
            String.fromInt value

This file has been truncated.

My efforts aren’t based on any research, so they might be flawed, and the output is far from perfect at the moment, but it has been fun to try to think through the problem a bit.

