Small experiment with LLM and TDD

janjelinek · September 1, 2025, 9:27am

Recently I had a small task involving seat label grouping logic. Since it was relatively clear and straightforward, I decided to use a TDD approach. While I was in the middle of writing the test cases, I thought I’d see how well an LLM would handle the problem and whether it could give me a complete solution.

I only have access to ChatGPT and Copilot, so I decided to go straight to GPT-5.

The goal was simple: define all the test cases I could think of and prompt GPT to give me working code. If a test failed, I would provide only the CLI output to GPT and try again.

Here is the test file:

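-- Imports assumed here; the original post omits the top of the test file.
import Expect
import SeatLabelsGrouping exposing (GroupedSeats, ItemWithSeats, RowLabel(..))
import Test exposing (Test, describe, test)


type alias TestCase =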
    { items : List (ItemWithSeats {}), expected : GroupedSeats, expectedHtml : List String }


itemsWithNumbers : TestCase
itemsWithNumbers =
    { items =
        [ { rowName = "1", seatName = "1" }
        , { rowName = "1", seatName = "2" }
        , { rowName = "2", seatName = "3" }
        , { rowName = "2", seatName = "4" }
        , { rowName = "3", seatName = "5" }
        , { rowName = "3", seatName = "6" }
        , { rowName = "3", seatName = "7" }
        , { rowName = "3", seatName = "8" }
        ]
    , expected =
        [ ( Row "1", "1,2" )
        , ( Row "2", "3,4" )
        , ( Row "3", "5-8" )
        ]
    , expectedHtml =
        [ "Row 1: Seats 1,2"
        , "Row 2: Seats 3,4"
        , "Row 3: Seats 5-8"
        ]
    }


itemsWithRowsAsStrings : TestCase
itemsWithRowsAsStrings =
    { items =
        [ { rowName = "A", seatName = "1" }
        , { rowName = "A", seatName = "2" }
        , { rowName = "B", seatName = "3" }
        , { rowName = "B", seatName = "4" }
        , { rowName = "C", seatName = "5" }
        , { rowName = "C", seatName = "6" }
        , { rowName = "C", seatName = "7" }
        , { rowName = "C", seatName = "8" }
        ]
    , expected =
        [ ( Row "A", "1,2" )
        , ( Row "B", "3,4" )
        , ( Row "C", "5-8" )
        ]
    , expectedHtml =
        [ "Row A: Seats 1,2"
        , "Row B: Seats 3,4"
        , "Row C: Seats 5-8"
        ]
    }


unnumberedSeatChart : TestCase
unnumberedSeatChart =
    { items =
        [ { rowName = "", seatName = "Sal 1" }
        , { rowName = "", seatName = "Sal 1" }
        , { rowName = "", seatName = "Sal 1" }
        ]
    , expected =
        [ ( Empty, "3x Sal 1" )
        ]
    , expectedHtml =
        [ "3x Sal 1"
        ]
    }


missingRowNames : TestCase
missingRowNames =
    { items =
        [ { rowName = "", seatName = "Sal 1" }
        , { rowName = "", seatName = "Sal 2" }
        , { rowName = "", seatName = "Sal 4" }
        ]
    , expected =
        [ ( Empty, "Sal 1,Sal 2,Sal 4" )
        ]
    , expectedHtml =
        [ "Sal 1,Sal 2,Sal 4"
        ]
    }


emptySeatLabels : TestCase
emptySeatLabels =
    { items =
        [ { rowName = "", seatName = "" }
        , { rowName = "", seatName = "" }
        , { rowName = "", seatName = "" }
        ]
    , expected =
        [ ( Empty, "3x" )
        ]
    , expectedHtml =
        [ "3x"
        ]
    }


notNumericSeats : TestCase
notNumericSeats =
    { items =
        [ { rowName = "1", seatName = "A" }
        , { rowName = "1", seatName = "B" }
        , { rowName = "2", seatName = "C" }
        , { rowName = "2", seatName = "D" }
        , { rowName = "3", seatName = "E" }
        , { rowName = "3", seatName = "F" }
        , { rowName = "3", seatName = "G" }
        , { rowName = "3", seatName = "H" }
        ]
    , expected =
        [ ( Row "1", "A,B" )
        , ( Row "2", "C,D" )
        , ( Row "3", "E,F,G,H" )
        ]
    , expectedHtml =
        [ "Row 1: Seats A,B"
        , "Row 2: Seats C,D"
        , "Row 3: Seats E,F,G,H"
        ]
    }


nothingIsNumber : TestCase
nothingIsNumber =
    { items =
        [ { rowName = "A", seatName = "Seat A" }
        , { rowName = "A", seatName = "Seat B" }
        , { rowName = "A", seatName = "whatever" }
        , { rowName = "B", seatName = "blah" }
        , { rowName = "B", seatName = "heh" }
        , { rowName = "C", seatName = "huh" }
        , { rowName = "C", seatName = "disabled" }
        , { rowName = "D", seatName = "enabled" }
        ]
    , expected =
        [ ( Row "A", "Seat A,Seat B,whatever" )
        , ( Row "B", "blah,heh" )
        , ( Row "C", "disabled,huh" )
        , ( Row "D", "enabled" )
        ]
    , expectedHtml =
        [ "Row A: Seats Seat A,Seat B,whatever"
        , "Row B: Seats blah,heh"
        , "Row C: Seats disabled,huh"
        , "Row D: Seats enabled"
        ]
    }


toHtmlConfig : { rowLabel : String, seatsLabel : String, toHtml : String -> String }
toHtmlConfig =
    { rowLabel = "Row "
    , seatsLabel = "Seats "
    , toHtml = identity
    }


suite : Test
suite =
    describe "Grouping seat labels"
        [ test "valid labels with numbers, multiple rows" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal itemsWithNumbers.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal itemsWithNumbers.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal itemsWithNumbers.expectedHtml
                    ]
                    itemsWithNumbers.items
        , test "valid labels with numbers and strings, multiple rows" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal itemsWithRowsAsStrings.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal itemsWithRowsAsStrings.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal itemsWithRowsAsStrings.expectedHtml
                    ]
                    itemsWithRowsAsStrings.items
        , test "missing row names" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal missingRowNames.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal missingRowNames.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal missingRowNames.expectedHtml
                    ]
                    missingRowNames.items
        , test "unnumbered seat chart" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal unnumberedSeatChart.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal unnumberedSeatChart.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal unnumberedSeatChart.expectedHtml
                    ]
                    unnumberedSeatChart.items
        , test "empty seat labels" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal emptySeatLabels.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal emptySeatLabels.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal emptySeatLabels.expectedHtml
                    ]
                    emptySeatLabels.items
        , test "not numeric seats" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal notNumericSeats.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal notNumericSeats.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal notNumericSeats.expectedHtml
                    ]
                    notNumericSeats.items
        , test "nothing is number" <|
            \() ->
                Expect.all
                    [ SeatLabelsGrouping.group >> Expect.equal nothingIsNumber.expected
                    , List.reverse >> SeatLabelsGrouping.group >> Expect.equal nothingIsNumber.expected
                    , SeatLabelsGrouping.group >> SeatLabelsGrouping.render toHtmlConfig >> Expect.equal nothingIsNumber.expectedHtml
                    ]
                    nothingIsNumber.items
        ]

And here is the prompt:

“Let’s try if you can handle this simple task. I have prepared a test file with test case scenarios and a file SeatLabelsGrouping with missing logic. Implement the missing logic to satisfy the tests. I am running the tests locally and can give you the results after your changes. All tests must pass. Do your best.”

It took several attempts before I got all the tests green and everything fully working, but in total it was no more than 30 minutes of interaction with the LLM. I also gave it a basic file structure with the initial types.

Types defined upfront:

type alias ItemWithSeats a =
    { a | rowName : String, seatName : String }


type alias GroupedSeats =
    List ( RowLabel, String )
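
The RowLabel type referenced by GroupedSeats is not shown above. Judging from how the tests use Row "1" and Empty, it is presumably a two-constructor custom type roughly like the sketch below. Both the definition and the toRowLabel helper are assumptions inferred from the tests, not code taken from the original file.

```elm
module RowLabelSketch exposing (RowLabel(..), toRowLabel)

-- ASSUMPTION: reconstructed from the test expectations (Row "1", Empty);
-- the post does not include this definition.


type RowLabel
    = Row String -- a named row, e.g. Row "A"
    | Empty -- items whose rowName is ""


{-| Hypothetical helper: maps a raw rowName onto a RowLabel.
-}
toRowLabel : String -> RowLabel
toRowLabel rowName =
    if String.isEmpty rowName then
        Empty

    else
        Row rowName
```

With this shape, an empty rowName becomes Empty and anything else is wrapped in Row, which matches what the expected values in the test cases look like.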

Here is the code from GPT:

import Dict exposing (Dict)


group : List (ItemWithSeats a) -> GroupedSeats
group items =
    let
        -- group seats by row (Dict.toList returns rows sorted by key, not in input order)
        rows : List ( String, List String )
        rows =
            let
                grouped : Dict String (List String)
                grouped =
                    List.foldl
                        (\{ rowName, seatName } dict ->
                            Dict.update rowName
                                (\maybe -> Just <| seatName :: Maybe.withDefault [] maybe)
                                dict
                        )
                        Dict.empty
                        items
            in
            grouped
                |> Dict.toList
                -- already sorted by key ascending
                |> List.map (\( r, seats ) -> ( r, List.reverse seats ))

        seatsAllNumeric seats =
            List.all (\s -> String.toInt s /= Nothing) seats

        seatsToTokens seats =
            let
                ints =
                    seats
                        |> List.filterMap String.toInt
                        |> List.sort

                finalize run acc =
                    case run of
                        Nothing ->
                            acc

                        Just ( start, prev, len ) ->
                            if len >= 3 then
                                (String.fromInt start ++ "-" ++ String.fromInt prev) :: acc

                            else if len == 2 then
                                String.fromInt prev :: String.fromInt start :: acc

                            else
                                String.fromInt start :: acc

                step n ( run, acc ) =
                    case run of
                        Nothing ->
                            ( Just ( n, n, 1 ), acc )

                        Just ( start, prev, len ) ->
                            if n == prev + 1 then
                                ( Just ( start, n, len + 1 ), acc )

                            else
                                ( Just ( n, n, 1 )
                                , finalize (Just ( start, prev, len )) acc
                                )

                ( runFinal, acc0 ) =
                    List.foldl step ( Nothing, [] ) ints

                tokens =
                    finalize runFinal acc0 |> List.reverse
            in
            String.join "," tokens

        joinByComma seats =
            seats |> List.sort |> String.join ","

        formatEmptyRow seats =
            case seats of
                [] ->
                    ""

                first :: rest ->
                    if List.all ((==) first) rest then
                        if first == "" then
                            String.fromInt (List.length seats) ++ "x"

                        else
                            String.fromInt (List.length seats) ++ "x " ++ first

                    else
                        seats |> List.sort |> String.join ","

        formatRow ( rowName, seats ) =
            if rowName == "" then
                ( Empty, formatEmptyRow seats )

            else if seatsAllNumeric seats then
                ( Row rowName, seatsToTokens seats )

            else
                ( Row rowName, joinByComma seats )
    in
    List.map formatRow rows


render : { rowLabel : String, seatsLabel : String, toHtml : String -> html } -> GroupedSeats -> List html
render ({ rowLabel, seatsLabel } as cnf) groupedSeats =
    List.map
        (\( row, seats ) ->
            case row of
                Empty ->
                    cnf.toHtml seats

                Row r ->
                    cnf.toHtml (rowLabel ++ r ++ ": " ++ seatsLabel ++ seats)
        )
        groupedSeats

The code was working, and I was ready to push it to production. But since I had some extra time, I gave in to the temptation to try it myself. Without diving too deep into the GPT code, I tried to come up with my own simple solution. Nothing fancy, just clear for a human reader. It took me around an hour, maybe slightly more (I was distracted by construction workers several times).

Here is my take:

group : List (ItemWithSeats a) -> GroupedSeats
group items =
    let
        sortSeats : List (ItemWithSeats a) -> List (ItemWithSeats a)
        sortSeats =
            List.sortBy
                (\{ rowName, seatName } ->
                    rowName ++ seatName
                )

        prepareNames : List (ItemWithSeats a) -> List ( RowLabel, String )
        prepareNames =
            List.map
                (\item ->
                    ( if String.isEmpty item.rowName then
                        Empty

                      else
                        Row item.rowName
                    , item.seatName
                    )
                )

        isNextInLine : String -> String -> Bool
        isNextInLine a b =
            case ( String.toInt a, String.toInt b ) of
                ( Just n1, Just n2 ) ->
                    n1 + 1 == n2

                _ ->
                    False

        mergeLine : List String -> List String
        mergeLine continuousLine =
            if List.length continuousLine > 2 then
                [ String.join "-" (List.take 1 continuousLine ++ (List.take 1 <| List.reverse continuousLine)) ]

            else
                continuousLine

        groupSeats =
            List.gatherEqualsBy Tuple.first
                >> List.map
                    (\( ( row, seat ), rest ) ->
                        if row == Empty then
                            ( row
                            , List.group (seat :: List.map Tuple.second rest)
                                |> List.map
                                    (\( first, sameSeats ) ->
                                        if List.length sameSeats > 0 then
                                            let
                                                spacer =
                                                    if String.isEmpty first then
                                                        ""

                                                    else
                                                        " "
                                            in
                                            String.fromInt (List.length sameSeats + 1) ++ "x" ++ spacer ++ first

                                        else
                                            first
                                    )
                                |> String.join ","
                            )

                        else
                            ( row
                            , rest
                                |> List.foldl
                                    (\( _, item ) ( previous, accInLine, acc ) ->
                                        if isNextInLine previous item then
                                            ( item, accInLine ++ [ item ], acc )

                                        else
                                            ( item
                                            , [ item ]
                                            , acc ++ mergeLine accInLine
                                            )
                                    )
                                    ( seat, [ seat ], [] )
                                |> (\( _, inLine, acc ) ->
                                        acc ++ mergeLine inLine
                                   )
                                |> String.join ","
                            )
                    )
    in
    items
        |> sortSeats
        |> prepareNames
        |> groupSeats

I’m not drawing any big conclusions here — just wondering if anyone else is using TDD and LLMs for Elm with some level of success. In the end I’ll probably use my code, just because to me it feels more readable. Also, GPT wasn’t great at picking up some non-core libraries, which gave me an advantage in making the code smaller.
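
For readers wondering which non-core functions are involved: List.gatherEqualsBy and List.group are not part of elm/core's List module. They match the API of List.Extra from the elm-community/list-extra package, which is presumably imported under the alias List in the code above (an assumption, since the import is not shown). A minimal sketch of what the two functions return, using the unaliased module name and made-up sample data:

```elm
module ListExtraSketch exposing (rowsGathered, sameSeatRuns)

-- Requires the elm-community/list-extra package.
-- ASSUMPTION: the post's code most likely uses `import List.Extra as List`;
-- the module is left unaliased here to make the origin of each call obvious.

import List.Extra


{-| gatherEqualsBy collects ALL elements with an equal key, even when they
are not adjacent, keeping the order of first appearance.

Result: [ ( ( "A", "1" ), [ ( "A", "2" ) ] ), ( ( "B", "3" ), [] ) ]

-}
rowsGathered : List ( ( String, String ), List ( String, String ) )
rowsGathered =
    List.Extra.gatherEqualsBy Tuple.first
        [ ( "A", "1" ), ( "B", "3" ), ( "A", "2" ) ]


{-| group only merges ADJACENT equal elements, which is why the input
is sorted before grouping identical seat labels.

Result: [ ( "Sal 1", [ "Sal 1" ] ), ( "Sal 2", [] ) ]

-}
sameSeatRuns : List ( String, List String )
sameSeatRuns =
    List.Extra.group [ "Sal 1", "Sal 1", "Sal 2" ]
```

Both functions return a non-empty representation (first element plus the rest), which is what lets the grouping code pattern match on ( first, sameSeats ) without handling an empty-list case.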

I didn’t perform any benchmarks, but if there’s interest, I could try to compare the performance of my code vs GPT’s. Maybe even try prompting GPT to optimize for performance versus readability.

Thanks for reading — I’d love to hear your takes!

dirkbj · September 3, 2025, 2:48pm

We haven’t taken this approach at XetiCode; we typically use BDD, but this is very interesting. The TDD approach works very well in some domains, like writing rule sets. Would love to see more on this.

janjelinek · September 3, 2025, 3:17pm

I was thinking about BDD as well, but this was simpler to put together, with an easy way to run the tests and verify the LLM’s code. There are most likely already solutions where the LLM receives the test output directly and loops automatically until everything is green. But my main goal was to test the LLM’s capabilities with Elm, because for autocomplete (Copilot) it is kind of OK when it’s just a small piece of code.

Not sure if I will have many opportunities to try and explore this more, but if I do, I will post the results for sure!
