Help me limit the List.Extra.cartesianProduct results count


#1

I’m feeding List.Extra.cartesianProduct with quite large datasets, that sometimes output 500K results, causing quite the CPU holdup. Only around 100 of the results are useful for my usecase.

Could somebody help me limit the number of results List.Extra.cartesianProduct iterates for?

I tried this:

cartesianProductWithIterationLimit : Int -> List (List a) -> List (List a)
cartesianProductWithIterationLimit limit pendingItems =
    let
        doIt collection slotsCount =
            case collection of
                [] ->
                    [ [] ]

                xs :: xss ->
                    if slotsCount > 0 then
                        lift2 (::) xs (doIt xss (slotsCount - 1))

                    else
                        collection
    in
    doIt pendingItems limit

But it’s not good enough, it only limits the iterations count for the outmost “loop”.


#2

If you only want a subset of the results, it would be helpful to know which results you want more than others. Is it okay if the last element in each returned list is the same? Are you giving the function a large number of small lists, or a small number of large lists?

The other strategy is to make the function lazy, perhaps using this library, and then generate only as much as you need.


#3

If you only want a subset of the results, it would be helpful to know which results you want more than others. Is it okay if the last element in each returned list is the same? Are you giving the function a large number of small lists, or a small number of large lists?

It’s all irrelevant in my usecase, because any result list longer than 100 means that the function is being abused instead of used.

Can’t use lazy since the input changes constantly.

Let’s look at the feature: it would allow users to use alphabet for searching, that is different from the alphabet used for the content itself. Example:

For an input of "gotvac" the result is ["готвац","готвач","ѓотвац","ѓотвач"].
But if somebody enters "cccccccc" the result length is 256, and it exponentially grows.


#4

What about first calculating how long result list would be, and if it would be over e.g. 100 items, then don’t create the list at all and just return nothing?


#5

Laziness is not memoization. Return a lazy list (of lists) instead of a list of lists. Then take 101 items and if you actually get 101, then you know the function is “being abused”.

Although, @malaire’s solution is probably simpler if it will work for you.


#6

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.