Where is Elm going; where is Evan going

My view on the role of Evan is - if he wanted to join this discussion he would, and should be welcome to. But we can’t just map out a load of work and hope that “someone on the internet” will pick it up and do it for us. In fact, he has previously stated his annoyance at people telling him what he should be doing all the time. So it is best if we just respect that.

In my experience open source is decided by those who do, not those who sit on the sidelines and commentate. So if we want these things to be done, just make a plan and get on with doing them, but don’t factor Evan into those plans unless he is actively seeking to be involved in them.

You may have missed this Gordon, but No Red Ink is no longer the #1 user of Elm. They laid off a lot of people for accounting reasons ahead of a funding round, and Evan was dropped by them at that point - essentially defunding the Elm compiler. Probably the #1 user of Elm now is Vendr.

Do we need a leader to coordinate this? It helps, but also people are chipping away at some of these issues independently and without being blocked on each other. It certainly will take a few people with the skills and time and motivation to bring something together though.

I am enjoying this hypothetical compiler design thread though. :slight_smile:


A paper on efficient functional unification:

If the technique can be implemented in Elm, worth benchmarking it.


@Lucas_Payr,

Thanks for this input. It’s not a deal breaker for me: preserving current Elm seems to be less and less of a goal, and if a future general-purpose Elm would benefit from self-hosting, it may also gain some form of side-effect-controlled mutation.

Or one may be able to get mutation that is “good enough”, although not as efficient, through some other means?

@rupert,

The Haskell Graph package is used to analyse Strongly Connected Components through the edges between defined nodes, and the current Elm compiler uses it to automatically find cases of recursion in three places: forbidden recursive imports, forbidden recursive local references, and recursive forms that are acceptable.

It can also be used to represent the data flow graph topologically, and I am thinking of using it for data flow analysis to determine when complex data structures escape their creation or argument scope and thus need to be boxed on the heap with reference counts applied.
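To make the recursion-detection idea concrete, here is a minimal sketch of strongly-connected-component analysis (Tarjan's algorithm) over a small dependency graph. It is written in TypeScript purely for illustration and is not the Elm compiler's code, which relies on the Haskell Graph package; the names `DepGraph`, `stronglyConnected`, `isRecursive` and the example definitions are all made up.

```typescript
// Dependency graph: each definition name maps to the names it references.
type DepGraph = { [node: string]: string[] };

// Tarjan's algorithm: returns the strongly connected components of the graph.
// Components come out with dependencies before the definitions that use them.
function stronglyConnected(graph: DepGraph): string[][] {
  const indexOf: { [node: string]: number } = Object.create(null);
  const lowLink: { [node: string]: number } = Object.create(null);
  const onStack: { [node: string]: boolean } = Object.create(null);
  const stack: string[] = [];
  const components: string[][] = [];
  let nextIndex = 0;

  function visit(node: string): void {
    indexOf[node] = nextIndex;
    lowLink[node] = nextIndex;
    nextIndex += 1;
    stack.push(node);
    onStack[node] = true;

    const targets = graph[node] || [];
    for (let i = 0; i < targets.length; i++) {
      const target = targets[i];
      if (indexOf[target] === undefined) {
        visit(target);
        lowLink[node] = Math.min(lowLink[node], lowLink[target]);
      } else if (onStack[target]) {
        lowLink[node] = Math.min(lowLink[node], indexOf[target]);
      }
    }

    // node is the root of a component: pop the stack down to it.
    if (lowLink[node] === indexOf[node]) {
      const component: string[] = [];
      let member = "";
      do {
        member = stack.pop() as string;
        onStack[member] = false;
        component.push(member);
      } while (member !== node);
      components.push(component);
    }
  }

  const nodes = Object.keys(graph);
  for (let i = 0; i < nodes.length; i++) {
    if (indexOf[nodes[i]] === undefined) {
      visit(nodes[i]);
    }
  }
  return components;
}

// A component is recursive if it has more than one member,
// or if its single member references itself directly.
function isRecursive(graph: DepGraph, component: string[]): boolean {
  const first = component[0];
  return component.length > 1 || (graph[first] || []).indexOf(first) !== -1;
}

// Hypothetical definitions: isEven/isOdd are mutually recursive,
// count is self-recursive, main is not recursive.
const exampleGraph: DepGraph = {
  isEven: ["isOdd"],
  isOdd: ["isEven"],
  count: ["count"],
  main: ["isEven", "count"],
};
const exampleComponents = stronglyConnected(exampleGraph);
```

A compiler could then reject or accept each component depending on where it occurs, for instance rejecting any multi-member component among imports while allowing recursive function groups.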

By linear arrays I mean contiguous arrays in memory that have a means of mutating the contents, as in Haskell’s STArrays, not linear types, which I wouldn’t want to add.

As to whether the “unsafe” Haskell functions (unsafeThaw, unsafeFreeze, unsafeRead, and unsafeWrite) can be avoided and still have efficient code, I’m not sure yet. Richard Feldman of the Roc language did a “QuickSort” study that seemed to show optimizations based on uniqueness of the data structures can be almost as fast as mutation in imperative code, but I think the benchmark wasn’t fine grained enough, with too many missed branch predictions in all the language implementations to show the real differences. I just wrote a bit-twiddling Sieve of Eratosthenes prime number benchmark in Roc in the same style as that benchmark. It runs in imperative languages like Nim, Rust, C, etc. in about 1.5 CPU clock cycles per composite cull operation but takes about 30 cycles in Roc even with optimization turned on; however, I may be using the Roc compiler incorrectly and/or not all of the proposed optimizations may be enabled, as the current Roc compiler is very much a WIP.

I suppose I should post my benchmark to Richard to see what he makes of it…
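For readers unfamiliar with this kind of benchmark, here is a rough sketch of a bit-packed, odds-only Sieve of Eratosthenes. It is not the benchmark mentioned above (that code is not shown in this thread); it is a TypeScript illustration with made-up names, and it assumes `limit` is well below 2^31 so the 32-bit shifts stay valid. The innermost loop is the “composite cull operation” whose cost per iteration is being compared, and it only works efficiently when the buffer can be mutated in place.

```typescript
// Odds-only, bit-packed Sieve of Eratosthenes.
// Bit i of the buffer represents the odd number 2*i + 3; a set bit means composite.
function sievePrimes(limit: number): number[] {
  if (limit < 2) {
    return [];
  }
  const bitCount = limit < 3 ? 0 : ((limit - 3) >> 1) + 1;
  const composites = new Uint8Array((bitCount + 7) >> 3);

  for (let i = 0; ; i++) {
    const p = 2 * i + 3;
    const start = (p * p - 3) >> 1; // bit index of p*p
    if (start >= bitCount) {
      break;
    }
    if ((composites[i >> 3] & (1 << (i & 7))) !== 0) {
      continue; // p is composite; its multiples are already culled
    }
    // Cull odd multiples of p: stepping the bit index by p steps the value by 2*p.
    for (let c = start; c < bitCount; c += p) {
      composites[c >> 3] |= 1 << (c & 7); // one "composite cull operation"
    }
  }

  const primes: number[] = [2];
  for (let i = 0; i < bitCount; i++) {
    if ((composites[i >> 3] & (1 << (i & 7))) === 0) {
      primes.push(2 * i + 3);
    }
  }
  return primes;
}

const primesUpTo50 = sievePrimes(50);
```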

@rupert,

Yes, good advice on respecting Evan’s wishes.

No Red Ink is no longer the #1 user of Elm. They laid off a lot of people for accounting reasons ahead of a funding round, and Evan was dropped by them at that point - essentially defunding the Elm compiler.

That is indeed news to me, and explains quite a few things about Evan’s talk and approach to further work.

In my experience open source is decided by those who do… So if we want these things to be done, just make a plan and get on with doing them…

Again, a good point, and your news above makes it feel even more acceptable to fork or clone the current Elm compiler and core packages and just get on with it, perhaps also trying to incorporate Evan’s current project if it becomes public, with due copyright notices for all, of course.

Do we need a leader to coordinate this? It helps, but also people are chipping away at some of these issues independently and without being blocked on each other. It certainly will take a few people with the skills and time and motivation to bring something together though.

“Chipping away” will always be done; “bringing together” is an entirely different skill set. As I advance in age, I think I’m perhaps better suited to the latter than the former, especially when projects extend to more than a month or two.

There is also the question of whether it would be better to join Richard Feldman’s Roc language effort as contributors. Roc is definitely a functional language with many of the same principles as Elm, which raises the question: “Can one live with the differences, or is it worth the effort to extend Elm toward more independent goals?”

I am enjoying this hypothetical compiler design thread though.

As I am :slight_smile:


@rupert,

A paper on efficient functional unification:

If the technique can be implemented in Elm, worth benchmarking it.

Yes, I had a quick scan of the document and it says one can implement it without IORef, so it might solve the UnionFind problem in Type Inference…
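To show what “without IORef” can look like in the simplest possible form, here is a naive union-find written as pure functions over an immutable parent array. This is not the technique from the paper (which is not reproduced here); every union copies the array, so it costs O(n) per union and only illustrates the interface a mutation-free unifier would need. All names are hypothetical and the code is TypeScript for illustration only.

```typescript
// A union-find state is an immutable array: parents[i] is the parent of
// type variable i, and a root is its own parent.
type UnionFind = ReadonlyArray<number>;

function makeVars(count: number): UnionFind {
  const parents: number[] = [];
  for (let i = 0; i < count; i++) {
    parents.push(i);
  }
  return parents;
}

// Follow parent links to the representative; no path compression, no mutation.
function findRoot(uf: UnionFind, v: number): number {
  let current = v;
  while (uf[current] !== current) {
    current = uf[current];
  }
  return current;
}

// Return a NEW state in which the classes of a and b are merged;
// the old state is left untouched.
function unify(uf: UnionFind, a: number, b: number): UnionFind {
  const rootA = findRoot(uf, a);
  const rootB = findRoot(uf, b);
  if (rootA === rootB) {
    return uf;
  }
  const next = uf.slice();
  next[rootA] = rootB;
  return next;
}

const uf0 = makeVars(4);
const uf1 = unify(uf0, 0, 1); // {0,1} {2} {3}
const uf2 = unify(uf1, 2, 3); // {0,1} {2,3}
const uf3 = unify(uf2, 1, 3); // {0,1,2,3}
```

Because older states such as `uf1` remain valid, backtracking comes for free, which is the property a persistent structure buys; the open question in the thread is whether that can be had at a cost close to the mutable version.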

I’m not interested in Roc. It seemed to start out as Elm but more general purpose, which was good. But the syntax changes, like changing how pipes worked, just seemed arbitrary to me, and then the addition of language features such as extensible union types and how it does type classes put me off altogether. It seems to have lost the simplicity of Elm for the sake of getting too clever, which I guess was too much of a temptation for its authors to resist.

I love that you can fit all of the Elm syntax on a single slide in a presentation.

Already I have used Elm to write applications serving HTTP, and CLI tools. It does present some challenges in those situations, but I really would like an Elm that better supports them. Also scientific/numerical/AI computation - dealing efficiently with vectors, matrices and tensors could be wonderful in the right FP language.


Thank you for the kind words. For a moment I feared that I was hijacking your thread.

Totally agree. I was implying that it would be a pretty bad thing to do, and not respectful to Evan’s work.

At this point, I am willing to do this without requiring Evan to pitch the future of Elm to us.

Ostensibly, he would be compensated for the great work that he has already done and the value we got from Elm-as-it-is, so that he is free to do what he wants for a little while.

Secretly, I hope to give him enough leverage to make Elm something great and sustainable. I even have Opinions™ on this matter, but I consider them off-topic for now.

I agree that it is a gamble, I must say that I am pleasantly surprised that you are open to taking it too!

What we should ask Evan is “Would you like some money from us?”. He might be uncomfortable receiving it, thinking it will bring pressure on him, or that people will get entitled after donating. To me, the goal is a true “no strings attached” donation from the community to the Elm Foundation, but we can’t be sure that nobody will try to bargain for output or features afterwards.

We would also have to sort out the practical aspect with him. The Elm Foundation doesn’t take online donations from individuals as @Lucas_Payr pointed out. Here is old but related info I dug up, someone let me know if it’s outdated:


@axelbdt,

At this point, I am willing to do this without requiring Evan to pitch the future of Elm to us.

Ostensibly, he would be compensated for the great work that he has already done and the value we got from Elm-as-it-is, so that he is free to do what he wants for a little while.

Secretly, I hope to give him enough leverage to make Elm something great and sustainable. I even have Opinions™ on this matter, but I consider them off-topic for now.

That’s fair enough and I see your point as to gratitude and faith that Evan might come up with something innovative, given freedom from economic concerns…

I agree that it is a gamble, I must say that I am pleasantly surprised that you are open to taking it too!

Remember, that is why I started this thread: to determine the Elm language’s current state, which as per Rupert’s catching me up on events over the last couple of years seems to be “mostly stagnant, with little to no incentive for Evan to further its development”; and to try to determine where Evan is going, which is answered as “no one really knows”, other than the references to this new “PostgreSQL tables in C” project from his talk linked in the OP.

Which leads us to the question not in the title of this thread but implied by my ideas in the OP: “So where do we go from here?” You have replied to that with your action item to try to help Evan without influencing him, to see where that will go, which I (and I hope others) will support.

We also have Rupert’s suggestion that there are some stubs of projects in existence that continue in the spirit of Elm, using Elm syntax, and that perhaps could be unified to become a new Elm-like language and community. Other than your updates on your action item, I suggest that the rest of this thread be used to explore what we might do ourselves to further an Elm-like but more general-purpose language as I have proposed, which would not be Roc. I recognize that there may be projects that are not yet published, such as my own self-hosting Elm efforts, that authors may not be ready to reveal yet, and I welcome private PMs so I can collate such efforts with a view to the future. For my part, with motivation gained from this thread, I will do some more exploration on my self-hosting Elm, but concentrate on investigating how to overcome the possible restrictions, as revealed here, of using only persistent data types without mutation (although that may not be a restriction for the eventual Elm-like language).

Please keep us posted on your investigations…

@rupert,

Well, it seems that the changes made weren’t entirely arbitrary, as there were stated reasons for them; it is just that we may not agree with those reasons. I regard Roc as quite an opinionated language (just as are Rust and Zig, in which it is written), and the opinion of its authors (primarily Richard Feldman, I believe) seems to be that the world needs a mandated-safety FP language without even the option of bypassing overflow and bounds checking or any “unsafe” forms (which most general-purpose languages, even Rust and Haskell, have), with the view that the compiler can be made smart enough to skip those checks given the right set of conditions, such as mutating in place when the mutation is never visible from the outside. However, that is something like depending on a compiler to elide bounds checks: it requires trial and error or a complete understanding of the compiler’s rules for when this can be applied, and there are still use cases where one can’t trigger these rules even though the bounds checks aren’t necessary.

I do generally enjoy Richard Feldman’s ability to do a presentation in his rapid-fire but extremely well organized manner that I don’t find boring as so many presentations are.

As to his Abilities implementation, I have proposed a Capabilities implementation for extended-Elm that is not that different from Roc’s Abilities (but likely less complete, as I haven’t been working on it as long). What do you dislike about the Ability specification, other than that it seems to go beyond a simple requirement for being able to more easily generalize the functions that are assumed to be available to a Type?

As to his extensible custom types, I agree with you that they seem overly complex. There have been negative comments on Elm’s record syntax, and Evan’s pretty much failed attempt to make records extensible perhaps should be cleaned up a bit more, but record syntax generally does the job required of it and this extensible stuff seems like overkill for a simple language.

Yes, I see that an Elm-like language with a C back-end (at least as an option) could be a very powerful tool, even to the point of being capable as a systems language while keeping the safety and conciseness of FP. I am basically a low-level guy and see implementation details from the “bottom up”, which leads me to believe that one can have an Elm-like syntax yet still have the performance of C. There is a major split in how to implement this, though: from trying to make everything appear purely safe, with only the compiler able to incorporate shortcuts when it can determine it is safe to do so, to providing “unsafe” functions that bypass the safety; my tendency is to do the latter, but I do admire the former if it is possible…

I kind of like the Roc idea that an application is built for different “Platforms”, where the packages available for different platforms may differ, so something built to support web pages would be different from a server application or a CLI application, etc. What do you think?

? Elm has extensible records. I really like Elm’s extensible records, and consider them to be one of its strengths. Particularly as a technique for flattening application data models to avoid the awkwardness that results from encapsulation. An example:

module Main exposing (..)

type alias Model =
    { a : Module1
    , b : Module2
    }

versus:

module Main exposing (..)

type alias Model =
    { 
    -- Stuff used by module 1
    a : String
    , b : String
    -- Stuff used by module 2
    , c : Int
    , d : Int
    -- Stuff used by both
    , e : Float
    }

-- In a separate file: Module1 only sees the slice of the flat model it needs.
module Module1 exposing (..)

type alias Module1 r =
    { r
        | a : String
        , b : String
        , e : Float
    }

This avoids what often happens in practice: you are working on component1, then realize you actually need a field from component2 (and sometimes you see silly things happen, with out messages used to request and return the encapsulated field from 2!). A flat model means everything is available; you just use extensible records to define the minimal slice each module needs.

What more is needed from extensible records? They don’t let you do subtyping, but I also think that is not a bad thing.

Subtyping could be enabled by allowing existential types in Elm possibly, and I don’t even think that would break the type system - although it might make type inference undecidable.


Elm had extensible records from 0.7 to 0.15, they were removed in 0.16.

Just looked at the docs. I think I remembered it as hackier than it actually is - I may have seen an earlier version of it. Actually, I think that looks like a very clean and straightforward implementation of typeclasses and would probably work very well with Elm.

Are you referring to this?

myFun : { val : String } -> { val : String, len : Int }
myFun rec =
    { rec | len = String.length rec.val }

But in current Elm, record update syntax cannot add a field, so you need to build a new record:

myFun : { val : String } -> { val : String, len : Int }
myFun rec =
    { val = rec.val -- Copy fields from the source
    , len = String.length rec.val
    }

I agree. The removal of that is slightly painful. Especially as it is not uncommon to have an accumulating state machine during application initialization - something that works through a series of operations to fetch network data, get the screen size, etc, and accumulate fields. As more and more fields accumulate, the copying bit gets longer and longer.

On the other hand, Evan made a good point that having records you can add new fields to is not so great for performance. I think it may not matter so much when compiling to JavaScript, but for a compiler targeting other backends, knowing the shape of data and its exact memory layout really helps performance. Perhaps this is good enough reason not to allow it?

Note about performance here: “Proposal: remove record syntax for field addition and field deletion”, Issue #985 in elm/compiler on GitHub.

I notice that functions over records in Roc will always accept args with more fields: Roc Tutorial. So possibly a performance limitation there, unless the compiler is going to do something clever and recognize the record shapes that are actually used and do some monomorphization.

The other major area of untidiness in Elm is the broken math stuff:

> x : Int
| x = 2 ^ -1
0.5 : Int

> modBy 0 3
Error: Cannot perform mod 0. Division by zero error.

and others around how ints are really floats, such as the bit shifting operators only acting on the first 32 and so on.

^ is a tricky one, since its type signature is (^) : number -> number -> number, meaning we can’t fix the type error there in a backwards compatible way, unless we make it a runtime error.
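The REPL result above is what one would expect if exponentiation is handed straight to JavaScript’s floating-point `Math.pow` (as far as I know that is how Elm’s core does it, but treat that as an assumption). The TypeScript sketch below shows the underlying behaviour, plus a hypothetical `powInt` illustrating the “make it a runtime error” option; `powInt` is not an existing Elm or JavaScript function.

```typescript
// JavaScript has a single floating-point pow, so an "Int" base with a
// negative exponent silently yields a fraction.
const fractional = Math.pow(2, -1); // 0.5

// Hypothetical checked integer power: rejects inputs whose result could not
// be an integer, instead of returning a Float disguised as an Int.
function powInt(base: number, exponent: number): number {
  if (Math.floor(base) !== base || Math.floor(exponent) !== exponent) {
    throw new Error("powInt: arguments must be integers");
  }
  if (exponent < 0) {
    throw new Error("powInt: negative exponent");
  }
  let result = 1;
  for (let i = 0; i < exponent; i++) {
    result *= base;
  }
  return result;
}

const kibi = powInt(2, 10); // 1024
```

The alternative to a runtime error would be a type-level fix (for example separate integer and floating-point power operators), which is the backwards-incompatible route mentioned above.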

The number system in Roc expands on what kinds of numerics we can have, so you can have U8, I32, F64 and so on. I think it generates runtime errors on math errors such as divide by zero, rather than the (incomplete) Elm way of giving a wrong answer instead of a runtime error.

Not sure if a proliferation of number types like Roc is a good idea or not. Perhaps Elm might be improved by making Int an arbitrary precision integer implementation, and keeping Float standardized on F64 (IEEE except that different browsers handle that differently).

What are your thoughts on a number system for an ideal Elm? And how can we get there with minimal disruption?


Having a consistently shaped object for JS can be a performance gain. Some of the “fixes” in elm-optimize-level-2 are just adding consistent fields to the compiled JS, even if the value is null.


@rupert,

To jump to the end of the comments on this chain of thought: what Elm has now, currently called “extensible records”, should probably really be called “flexible records”, since the full “extensible records” feature has been removed; I agree that “flexible records” can be quite a useful feature, even if they have been de-emphasized in the documentation.

Roc’s “open” and “closed” types seem to be an attempt to unify this idea across both what we in Elm would call “records” and custom “Types”, and their merits are probably worth discussing as to whether these ideas should be included in a new Elm-like language.

@rupert,

Yes, as I said, it is very similar to a specification of “Capabilities” I was working on, and while neither specification may be perfect, I think some merged and improved version may be just what is needed. However, unlike Roc, I desire that the new Elm-like language be backward compatible with Elm, so the specification needs to fit in with Elm’s current pseudo type families of number, comparable, appendable, and compappend, but I think that the Roc model (and mine) can support that.

@rupert,

As per your examples and many more, yes, these things should have been fixed many versions ago and the pain of the breaking changes undergone then. This is at least one area where any new language SHOULDN’T be backwards compatible with current Elm.

the bit shifting operators only acting on the first 32 (bits - GBG)

I think this was an error in specification on Evan’s part: the only true Ints in current JavaScript engines are the values treated as 32-bit integers (taking four bytes), and Evan’s treating JavaScript 64-bit floating point numbers sometimes as Floats and sometimes as Ints is inconsistent with JavaScript’s bitwise operations, which only work on 32 bits.
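The inconsistency is easy to see in plain JavaScript semantics. The TypeScript sketch below only demonstrates standard language behaviour; the helper `addInt32` is a hypothetical illustration of what consistent 32-bit Int arithmetic could look like, not something Elm does today.

```typescript
// JavaScript bitwise operators first convert their operands to 32-bit integers.
const wrapsToInt32 = (Math.pow(2, 32) + 5) | 0; // high bits dropped: 5
const signBit = Math.pow(2, 31) | 0; // reinterpreted as signed: -2147483648

// The shift count is masked to its low five bits, so shifting by 32 shifts by 0.
const shiftBy: number = 32;
const shiftCountMasked = 1 << shiftBy; // 1

// >>> gives the unsigned 32-bit view of the same bits.
const unsignedView = -1 >>> 0; // 4294967295

// Ordinary arithmetic is 64-bit floating point, so it does NOT wrap:
const noWrap = 2147483647 + 1; // 2147483648

// Hypothetical helper: addition with consistent 32-bit wraparound.
function addInt32(a: number, b: number): number {
  return (a + b) | 0;
}
const wrapped = addInt32(2147483647, 1); // -2147483648
```

So a value that arithmetic has pushed past 32 bits is silently truncated the moment a bitwise operator touches it, which is the mismatch described above.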

F#/Fable, which supports the whole gamut of integer and float types, gets it right, although it has no type classes and only somewhat overridable operators, in that it matches on all the available types, BUT it requires that all numeric literals other than default float64 numbers with a decimal point and int32 numbers without a decimal point have a number suffix to match the required operator type. With this, refactoring to a different number type is tedious, as one has to change all the suffixes. Fable also supports 64-bit integers (and maybe 128-bit ones to come) by emulation in JavaScript, and also has a BitNum integer type that can sometimes be useful; Roc doesn’t have the latter, at least not yet.

Haskell also supports all the number types and type classes, but whenever a required type doesn’t match the inferred type, one has to make the types match through frequent use of numeric casting functions such as fromIntegral.

Roc is more similar to Haskell than not in this regard, with the full range of numeric types that are generally automatically inferred by preference, but there are casting functions available to be able to mix types AND also numeric type suffixes to be able to force a type when desired (not recommended in the documentation). I think this is a workable solution.

Not sure if a proliferation of number types like Roc is a good idea or not. Perhaps Elm might be improved by making Int an arbitrary precision integer implementation, and keeping Float standardized on F64 (IEEE except that different browsers handle that differently).

If one wants a general purpose language, then I don’t think there is any choice but to support the proliferation of types, and by making float64 the default floating point and int32 the default integer, it is as compatible as possible. This works for F#/Fable.

how we can get there with minimal disruption.

There will be some breaking changes adopting the above, but not too many in production code as THE EXISTING ELM SPEC IS NOT CORRECT. A new Elm-like language must be mathematically correct and type sound, which current Elm is not in this regard.

Oh I see. So given all concrete records that are used where an extensible record is allowed, you can take the superset of all fields involved, and then pad that out with nulls, so that the same shaped record is used in each case.

Seems like if you did that, potentially the number of fields could grow pretty large, especially if someone deliberately created a nasty corner case, which is probably not really likely in practice. Often when we put an extensible record in a type sig, we might end up only calling it with a single record kind.
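A small sketch of that padding idea, in TypeScript for illustration: compute the superset of fields over all concrete records that reach a call site, then rebuild each record with every field present in a fixed order, using null for the missing ones. This is only my reading of the idea, not how elm-optimize-level-2 is implemented, and the names `AnyRecord`, `allFields` and `padToShape` are made up.

```typescript
type AnyRecord = { [field: string]: unknown };

// Collect the superset of field names used by any of the concrete records.
function allFields(records: AnyRecord[]): string[] {
  const seen: { [field: string]: boolean } = Object.create(null);
  const fields: string[] = [];
  for (let i = 0; i < records.length; i++) {
    const keys = Object.keys(records[i]);
    for (let k = 0; k < keys.length; k++) {
      if (!seen[keys[k]]) {
        seen[keys[k]] = true;
        fields.push(keys[k]);
      }
    }
  }
  return fields.sort();
}

// Rebuild a record with every field present, in one fixed order,
// using null for the fields it did not originally have.
function padToShape(record: AnyRecord, fields: string[]): AnyRecord {
  const padded: AnyRecord = {};
  for (let i = 0; i < fields.length; i++) {
    const field = fields[i];
    padded[field] = Object.prototype.hasOwnProperty.call(record, field)
      ? record[field]
      : null;
  }
  return padded;
}

const point2d: AnyRecord = { x: 1, y: 2 };
const labelled: AnyRecord = { label: "origin", x: 0 };
const sharedFields = allFields([point2d, labelled]); // label, x, y
const paddedPoint = padToShape(point2d, sharedFields);
const paddedLabelled = padToShape(labelled, sharedFields);
```

Both padded objects now have the same keys in the same order, which is what lets a JS engine treat them as one shape; the cost is the wasted null slots, which is the growth concern raised above.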

I have been thinking about this particular issue - how to compile extensible records into efficient code. Was thinking monomorphisation. Another idea was to pass in a record of pointer offsets that the code uses to find the fields in whatever concrete record arg you give it - sort of like how a virtual method table works in OO languages. Means 2 memory accesses to get to a field though - monomorphizing would flatten that to 1.
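The two options can be sketched side by side. In the TypeScript below, records are modelled as flat arrays of slots to stand in for a native memory layout; the layouts, the `XYOffsets` table and all function names are hypothetical, and this is only an illustration of the trade-off, not a proposed implementation.

```typescript
// Records laid out as flat slot arrays; the layout differs per concrete type.
// Hypothetical concrete layouts:
//   Point    = { x, y }        -> [x, y]
//   Particle = { mass, x, y }  -> [mass, x, y]
type Slots = ReadonlyArray<number>;

// Offset table for the extensible type { r | x : Float, y : Float }.
interface XYOffsets {
  x: number;
  y: number;
}

const pointOffsets: XYOffsets = { x: 0, y: 1 };
const particleOffsets: XYOffsets = { x: 1, y: 2 };

// Offset-table version: one compiled function for all layouts, but two
// memory accesses per field (read the offset, then read the slot).
function normSquared(offsets: XYOffsets, rec: Slots): number {
  const x = rec[offsets.x];
  const y = rec[offsets.y];
  return x * x + y * y;
}

// Monomorphized versions: one copy per concrete layout, offsets baked in,
// so each field costs a single memory access.
function normSquaredPoint(rec: Slots): number {
  return rec[0] * rec[0] + rec[1] * rec[1];
}
function normSquaredParticle(rec: Slots): number {
  return rec[1] * rec[1] + rec[2] * rec[2];
}

const examplePoint: Slots = [3, 4];
const exampleParticle: Slots = [10, 3, 4];
const viaTable = normSquared(particleOffsets, exampleParticle); // 25
const viaMono = normSquaredParticle(exampleParticle); // 25
```

Monomorphization trades code size for speed, while the offset table keeps one copy of the code at the cost of the extra indirection; a compiler could also mix the two, specializing only the hot call sites.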
