Proposal: Using modules like a record

In functional programming, it is typical to write patterns for calling functions, instead of just calling them. Example: List.map

map : (a -> b) -> List a -> List b

Apply a function to every element of a list.

map sqrt [1,4,9] == [1,2,3]

map not [True,False,True] == [False,True,False]

Here, instead of [not True,not False,not True] or [sqrt 1,sqrt 4,sqrt 9], we use map to reuse the pattern of calling a function on each element of a list and putting the result in the new list. The key technique here is that we get to choose to use not or sqrt or squareThenSubtractOneThenDivideByOneGreaterThanTheOriginalNumberThenAddOne or what have you. map wasn’t written to act on data directly but rather to shuffle numbers around in the right way according to some often-enough-used pattern. We decouple the generic use-case from the operation. This technique is really fascinating to functional programming nerds such as m’self, because what is a really painful and stupid process in OOP is as easy and simple as adding another parameter in FP. This is one of the many reasons why I love Elm and think it’s worth sharing!

Of course, there’s no reason to stop at map: we also have List.foldl, List.sortWith, Dict.merge, Maybe.andThen, and the List goes on (see what I did there?). The way to spot them is to find a function that takes at least these parameters:

  1. A function, and
  2. Some data you could call that function with
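To make the two-part shape concrete, here is a minimal sketch using three of the core functions named above. The module name and the helper names (total, longestFirst, halveIfEven) are made up for illustration; only List.foldl, List.sortWith, and Maybe.andThen come from the post.

```elm
module DecoupledExamples exposing (halveIfEven, longestFirst, total)

-- List.foldl : (a -> b -> b) -> b -> List a -> b
-- The decoupled function is (+); the data is the list of numbers.


total : List Int -> Int
total numbers =
    List.foldl (+) 0 numbers



-- List.sortWith : (a -> a -> Order) -> List a -> List a
-- The decoupled function is the comparison; here, longer strings sort first.


longestFirst : List String -> List String
longestFirst strings =
    List.sortWith (\a b -> compare (String.length b) (String.length a)) strings



-- Maybe.andThen : (a -> Maybe b) -> Maybe a -> Maybe b
-- The decoupled function decides whether the chain continues.


halveIfEven : Maybe Int -> Maybe Int
halveIfEven maybeNumber =
    Maybe.andThen
        (\n ->
            if modBy 2 n == 0 then
                Just (n // 2)

            else
                Nothing
        )
        maybeNumber
```

In each case the library function owns the pattern (fold, sort, chain), and the caller supplies the operation.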

Notice something interesting. List.map takes one decoupled function, but Dict.merge takes three! How many can you have? Probably quite a few, but there’s no reason to have a tediously long type signature for data you reuse often enough to warrant the abstraction. Instead, we can use a record, which gives us the benefit of having named fields. For example, let’s say we have a Vector module:

-- In the Vector module

type Vector
    = Vector Float Float

add : Vector -> Vector -> Vector
add = . . .

negate : Vector -> Vector
negate = . . .

normalize : Vector -> Vector
normalize = . . .

-- In some other module where you use the Vector module

type alias VectorOps =
    { add : Vector -> Vector -> Vector
    , negate : Vector -> Vector
    , normalize : Vector -> Vector
    }


vector : VectorOps
vector = 
    { add = Vector.add
    , negate = Vector.negate
    , normalize = Vector.normalize
    }

So here I have three ops, add, negate, and normalize, but you can imagine that for some complex forms you could have save, send, validate, review, share, autofill, and whatever else your heart beckons for.
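For concreteness, here is one way the elided function bodies could be filled in. This is only a sketch under assumptions the post does not state: the vectors are treated as 2D Euclidean, normalize maps the zero vector to itself, and the ops record is defined in the same module purely so the example fits in one file (the post keeps it in the consuming module).

```elm
module Vector exposing (Vector(..), VectorOps, add, negate, normalize, vector)


type Vector
    = Vector Float Float


add : Vector -> Vector -> Vector
add (Vector x1 y1) (Vector x2 y2) =
    Vector (x1 + x2) (y1 + y2)


negate : Vector -> Vector
negate (Vector x y) =
    -- Basics.negate is qualified because this module defines its own negate.
    Vector (Basics.negate x) (Basics.negate y)


normalize : Vector -> Vector
normalize (Vector x y) =
    let
        len =
            sqrt (x * x + y * y)
    in
    if len == 0 then
        -- Assumption: the zero vector has no direction, so return it unchanged.
        Vector 0 0

    else
        Vector (x / len) (y / len)


type alias VectorOps =
    { add : Vector -> Vector -> Vector
    , negate : Vector -> Vector
    , normalize : Vector -> Vector
    }


{-| The hand-written record that the proposal would like to generate. -}
vector : VectorOps
vector =
    { add = add
    , negate = negate
    , normalize = normalize
    }
```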

This is pretty powerful on its own; now you can use the Vector module’s datastructure in patterns that take any combination of these vector ops. Consider Addable:

type alias Addable ops datastructure = 
    { ops
    | add : datastructure -> datastructure -> datastructure
    }
    
doThingsToSomethingYouCanAdd : Addable ops ds -> ds -> ds -> otherStuff -> etc -> . . .
. . .

By calling doThingsToSomethingYouCanAdd vector myVectorA myVectorB otherstuff . . ., you have decoupled a function that adds data. Congratulations! This function does “something” with a datatype where it makes sense to add values together (as opposed to a datatype where it doesn’t, such as VectorAndScalar, or Fruit, i.e. what is Apple + Orange?). Fill in “something” with “sum a list” or “multiply” or whatever. Just as multiplication follows from addition, subtraction follows from having both add and negate. I’ll leave it to the reader to come up with how to write AddableAndNegateable ops ds. Let’s move on.
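For readers who would rather check their answer, here is one possible sketch of that exercise. Everything except Addable is hypothetical: the module name, the sum and subtract helpers, the explicit zero argument to sum, and the floatOps record are introduced only for illustration.

```elm
module Subtractable exposing (Addable, AddableAndNegateable, floatOps, subtract, sum)


type alias Addable ops ds =
    { ops
        | add : ds -> ds -> ds
    }


type alias AddableAndNegateable ops ds =
    { ops
        | add : ds -> ds -> ds
        , negate : ds -> ds
    }


{-| Sum a list using only `add`. The caller supplies the zero value,
because Addable says nothing about an identity element.
-}
sum : Addable ops ds -> ds -> List ds -> ds
sum dict zero items =
    List.foldl dict.add zero items


{-| Subtraction derived from `add` and `negate`: a - b = a + (-b). -}
subtract : AddableAndNegateable ops ds -> ds -> ds -> ds
subtract dict a b =
    dict.add a (dict.negate b)


{-| A hand-written ops record for plain Floats, to try the helpers with. -}
floatOps : { add : Float -> Float -> Float, negate : Float -> Float }
floatOps =
    { add = (+)
    , negate = negate
    }
```

Because both aliases are extensible records, the same floatOps value can be passed to sum (which only asks for add) and to subtract (which asks for add and negate). A record with more fields fits any function that needs a subset of them.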

So this is the clincher: you need to fill in an ...Ops-type record, like VectorOps, PerlinNoiseMapOps, MassInNonEuclideanSpacetimeOps, FruitThatCanSomehowBeNegatedOps, and so on for every datatype module that is Addable or AddableAndNegateable or NegateableAndNormalizeable, etc. And because you’ve chosen to abstract it this far, there are probably quite a few such modules. This means doing this pattern again and again:

vector : VectorOps
vector = 
    { add = Vector.add
    , negate = Vector.negate
    , normalize = Vector.normalize
    }

perlinNoise : PerlinNoiseMapOps
perlinNoise = 
    { negate = PerlinNoise.negate
    , normalize = PerlinNoise.normalize
    }

sillyMass : MassInNonEuclideanSpacetimeOps
sillyMass = 
    { add = MassInNonEuclideanSpacetime.add
    , normalize = MassInNonEuclideanSpacetime.normalize
    }

fruit : FruitThatCanSomehowBeNegatedOps
fruit =
    { negate = Fruit.negateSomehowIdk
    }

This is tedious. Unfortunately, no pattern can solve this, because it is a technique that leverages Elm’s type system rather than performing some kind of computation. Finally, 5031 characters in, I get to my proposal: Elm could benefit from syntactic sugar that writes these records for the user based on the module definition, so that users don’t have to write them out by hand each time. This would make writing Elm a more pleasant experience for this user.

To specify, say we have this module, likely centered around a datastructure:

module MyModule exposing (..)

. . .

functionA : a -> b -> c

. . .

functionB : d -> e -> f 

. . .

The only functions in MyModule are functionA and functionB. I propose some shortened syntax to write this:

mymodule = 
    { functionA = MyModule.functionA
    , functionB = MyModule.functionB
    }

adapted accordingly to however many functions MyModule has. In this way, we use a module like a record to solve the problem of many large, alike, decoupled datastructures. I believe @evancz and his team are capable of coming up with a nice design for this syntax, but just for the sake of being concrete I’ll show what I have in mind.

At the module import, there is a new keyword, fitting, that works quite like exposing.

import MyModule as M 
    fitting 
        ( mymodule :
            { functionA : a -> b -> c
            , functionB : d -> e -> f 
            }
        )
    exposing (foo, bar, baz, quux)

  • mymodule’s properties are “fitted to” MyModule.functionA and MyModule.functionB if they’re exposed from MyModule
  • The type signature here is optional, but recommended: we wouldn’t want the creator of MyModule to go and change its names and types on us without us knowing!
  • I don’t expect this to be used for many exposed modules, just internal ones. If any exposed modules do this, they probably aim for extensibility
  • As of yet, I don’t know of a practical use for the cases below. I hope commenters will tell me what they think of them!
    • Fitting a type that isn’t a record, like a Float or a union type. Though, I figure an extensible record could be useful
    • Fitting multiple type declarations, with mymoduleA : { . . . }, mymoduleB : { . . . }, etc.

================================================================================
Footnote:

I get that Elm’s general policy is to be meticulous when adding features because adding features adds more stuff new users need to learn to use Elm. Here, @luke has said that he evaluates features like so:

To what greater degree does the feature enable an Elm user to achieve the goals Elm proposes to help with beyond what is already possible?

In this interview @evancz has stated Elm’s goal to “provide a pleasant programming experience […] how can we make it possible to do that and for you to have fun doing it?” I argue that this feature, while putting hardly any new mental burden on users, still promotes the fun, expressive, and collaborative tooling Elm aims for because of the following reasons:

  • The parallel between records and modules is already demonstrated in the syntax by the namespacing (.) operator
  • The reduced redundancy allows for datatypes’ features to be implemented in linkable pieces like Lego bricks instead of case-by-case, making the code more concise and reducing naming inconsistencies
  • Modules-as-records reduces gratuitous abstraction and focuses on the reuse that actually matters via the option to enforce the type signature

================================================================================

Thanks for reading! Let me know what you think of this idea.
anon5324703


I recommend looking into OCaml and Agda, which both do something like this.

I think the interesting aspect of these systems is that they require complexity in the type system that is quite surprising. In OCaml you can instantiate modules dynamically, even giving them arguments, so you can create different versions. But now, if a module creates a type how do we know it is the same as in other instantiations of the module? Should it be? Should it not? Creating new types dynamically also starts breaking the border between values and types. In Agda, a dependently typed language, this is less of a problem, but it does not seem “simple” to me at least.

Putting the type problems aside, I suspect it makes it harder to do per-function dead code elimination. I’m not 100% certain about that, but at the very least, it would complicate a bunch of optimizations.

So I think it’s an idea that sounds cool. I have thought about things like this as well. It’s not just about syntax though. It creates some very hard problems. So I recommend looking for languages that have this and seeing what issues it causes.


Ur/Web is also a very interesting language to have a look at

http://www.impredicative.com/ur/tutorial/tlc.html

I guess the thing evancz is referring to is called a “Functor” in the SML/OCaml land.

Yeah, its type system seems to resemble Haskell. Thanks for the recommendation! I’ll be sure to look into it further sometime soon.

That’s interesting, it seems like this proposal inadvertently ran into problems shared with type class systems. Yikes!

You gave quite a few interesting points!

  • I recommend looking into OCaml and Agda

Thanks for the recommendations! I’ll definitely read up on those–I think I’ll use OCaml for one of my classes at some point. Looks really intriguing!

  • if a module creates a type how do we know it is the same as in other instantiations of the module?

I don’t quite understand what you mean by this, because modules don’t create their own types in this system; e.g. MyModule doesn’t create the type

{ functionA : a -> b -> c
, functionB : d -> e -> f 
}

and PerlinNoise doesn’t create the type PerlinNoiseMapOps, and so on. These are defined by the user. I wouldn’t quite classify this as “creating new types dynamically” but rather “assigning values to a member of that type dynamically.” Could you explain what you mean by that?

EDIT:
Here’s one thing that came to mind: do you mean, how do we know the Vector module can be fitted to a member of the type VectorOps, PerlinNoise can be fitted to PerlinNoiseMapOps, etc.? If so, the answer is that each property of the record has the same name as a function in the module, and a matching type. I don’t have any math to prove that this system is 100% deterministic, but my intuition says so. But there’s no doubt that I’ll check out OCaml and Agda to see what caveats there are to this.

  • I suspect it makes it harder to do per-function dead code elimination

I don’t know if Elm has any level of “nested record dead code elimination”, but if it did it would be quite complex indeed. This would certainly be a feature that requires some wisdom to implement, and I don’t know whether it puts too much optimization burden on the Elm user. It might help to forego this:

The only functions in MyModule are functionA and functionB

so that you can still dead-code-eliminate the other functions you don’t use (if you read closely, you see that MyModule also has the functions foo, bar, baz, and quux. That was an error on my part, but not anymore!). Either way, this practice is not for every function of every module, that’s for sure.

  • So I think it’s an idea that sounds cool.

Thanks! I don’t know why, but Elm above all other languages invokes my sensibilities as a language designer. There are clearly a lot of difficult concepts to juggle when making these things such as those you mentioned, and I think you and your team are doing a great job! Keep up the good work!

I think the thing you are describing is “first-class modules” which is like parameterizing a module by another module.

I think this page talks about modules in Agda in a way that may be a decent preview. The part about “Parameterised modules” is where you start creating things dynamically.

If you want to get deep into details, I know that F-ing Modules is a commonly recommended paper on the topic.

I also want to emphasize that in language design, a very small percentage of ideas that sound cool actually end up working well when you get into all the details. So I think I make a distinction in my mind about sounds good vs is good when it comes to language design, because neither implies the other unfortunately!

For sure! Thanks for the resources!
