I was playing with PureScript lately and discovered that its package manager allows SemVer with 0.x.y versioning. That’s really great when you’re still working on your library, because you’re more likely to change your API.
For instance, when working on elm-bodybuilder/elegant, we changed our API very often, but we still needed it in our projects, so we pushed it to elm-package many times. With 0.x versioning, we could be at 0.12 and then release 1.0.0 once stable. That would be great!
Would you use that feature if it was implemented in elm-package?
My assumption is that part of your suggestion is that 0.x releases would not be under the standard versioning policy constraints (major bump for breaking change, minor bump for addition). In such a scenario, you could remove an exposed function or change its parameters and not have to do a major version bump while your package is pre-1.0.
I do not think this would be a good idea. I can easily imagine a scenario where someone creates and releases an “alpha” version of their package with “NOT FOR PRODUCTION USE” and all such normal disclaimers.
Except that people do not listen (see: all the posts about native-modules if you want examples of people building production software out of not-for-production features). After the author shares the package, people like it and it gets a lot of downloads. Some people use it in personal projects. Others use it in production. The author gets feedback and wants to update the package with a breaking API change. Now what?
If the author doesn’t bump the version to at least 1.0.0, the update may break numerous applications out there, because elm-package cannot guarantee that the 0.x API is still the same as before. Now there is a problem: the author wants to change the API, while consumers using the package as a dependency do not want broken software. The benefit of elm-package is lost.
Worse yet, some package authors could just abuse the system to keep their packages in 0.x forever, never having to face the versioning policy constraints. Again, production software would get built on top of these packages, and again, the benefit of elm-package would be lost.
I’m not very familiar with elm-package, but I am wondering if this is more about the version ranges used in elm-package.json files. If I add version 2.1 of a package to my app, elm-package automatically adds an entry with range 2.1.0 <= v < 3.0.0, right? Wouldn’t it be enough if elm-package automatically added an entry with range 0.17.0 <= v < 0.18.0 for a package with version 0.17? Basically that’s the way of working with SemVer I know from Ruby Gems.
Or another way to put it: Is there something stopping me from manually creating an elm-package.json entry that spans multiple (probably future) major versions, which might break my app once they are released?
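To make that concrete, here is a sketch of what such a dependencies section in an elm-package.json might look like (the second package name is hypothetical, and the wide range spanning multiple future major versions is exactly the manually-authored entry described above):

```json
{
    "dependencies": {
        "elm-lang/core": "2.1.0 <= v < 3.0.0",
        "someone/some-package": "2.0.0 <= v < 5.0.0"
    }
}
```

The first entry is the kind of range elm-package writes automatically; the second would knowingly admit not-yet-released major versions into the build.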
Whether it is desirable to have a lot of 0.x packages in the ecosystem is of course a totally separate question. From my experience, using real major versions can also be very liberating for the package maintainer as it provides a clear way to communicate the type of changes in a new release that is not available when working with 0.x versions.
The approach I’ve taken is to do experimental development in a separate package from the production package. This has worked very nicely and requires no changes to semantic versioning.
I’m doing that now with the stylish-elephants alpha package for style-elements. It’s meant to not be as discoverable and has huge warnings that I’m going to be experimenting in it. It also means that I’m not shy about bumping a major version.
@mdgriffith and @rtfeldman: those are slightly different problems, as neither of those cases would really have been solved by allowing a 0.X version in the package repo. That’s more to do with a pre-release tag, which AFAIK the package manager also doesn’t support. (Although from a relatively cursory reading of the spec, it seems that 0.X is somewhat redundant to 1.0.0-alpha.X.)
@christian I think many of the issues you point out are easily solvable. The API diffing capability is still there, so it is relatively straightforward to ensure builds don’t break. Also, for pre-1.0.0 releases you should probably pin the dependency to a concrete version rather than a range; then, when upgrading, you can API-diff the two versions and see whether the API is compatible.
I think the reason here is more a cultural one. NPM is full of 0.X packages that never reach 1.0 but are used in production all over the place. However, there is no indication whether the package is even attempting to follow SemVer, and so breakage can slip in at any moment if you update your dependencies without paying very close attention to the project.
Furthermore, IMO in statically typed functional languages, small single-purpose libraries are often written and then just keep working for decades without any modification. In those cases, starting at 1.0 is much better than 0.1. However, relatively recently a number of highly complex packages have been showing up, where there is a need for much more exploratory work and also a lot of keeping up with change.
Whether that cultural problem has other solutions, is another question. Perhaps some features like private registries or even allowing pre 1.0 versions into the package registry but not showing it on the website would solve some of these issues.
People think “info about API changes” and “info about how stable I think my library is” should both be reflected in the version number. I don’t think this makes sense. I’d rather have a version number that gives me strong guarantees, and one of the following:
The author can tell me in the README if they are changing their API a lot.
The author can think about it more and share it on a smaller scale before publishing. Rather than spending the time of strangers on an API that may not work, just use it yourself for longer to gain confidence that it solves your problem. Get feedback from friends or at meetups. Etc.
The author will indirectly inform me how stable the package is based on major version / age of package which will indicate how often they produce major changes. If they create packages that are on version 30 in one year, that tells me a lot about their design process and how they think about their users.
Here’s my point. It sounds like the question is “how do we explore an API?” and I think the answer might be that there is no law that you have to publish everything you are exploring. So the productive way to take this conversation may be for OP to figure out “what is the thing I really want?” and we probably would not end up doing the exact same things as npm from that starting point.
Really interesting discussion. I face a related problem in that a package I have authored relies on very many union types each with many type constructors to create what is effectively a DSL. This DSL maps on to an external package under active development. If they add a new option to some function call, this legitimately counts as a minor change with their semver but in Elm if I represent it with an additional type constructor, this becomes a major version change.
In practice, for users of my DSL this is not a breaking change, because there is little need to pattern match against all type constructors. But the result is that either I have to slow down my release cycle and so drift behind changes made to the external package, or I risk major version inflation.
Not sure how I can avoid this problem, nor indeed the extent to which it is a real problem for users. But I’d hate to think this would be sending the wrong message to users, as perhaps Evan hints.
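A minimal sketch of why this happens (the type and names here are invented for illustration, not from the actual package): if a DSL exposes its constructors, adding one breaks any exhaustive pattern match downstream.

```elm
module Mark exposing (Mark(..))

-- Hypothetical DSL fragment: the constructors are exposed directly.
type Mark
    = Bar
    | Line

-- Somewhere in a user's code, an exhaustive pattern match:
--
--     markName : Mark -> String
--     markName mark =
--         case mark of
--             Bar ->
--                 "bar"
--
--             Line ->
--                 "line"
--
-- If the package later adds an `Area` constructor to track the
-- upstream library, this case expression becomes non-exhaustive
-- and no longer compiles, so the Elm versioning rules correctly
-- require a major version bump.
```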
Elm-vega. At the moment I’m working on the vega branch that will have many hundreds of type constructors but am reluctant to release incrementally for fear of version inflation.
And an example of responding to a request for one minor addition forcing a major version bump:
If you have unstable pre-release code, but still need to share it amongst your own projects, why not use elm-github-install which does cover this use case?
@jwoLondon, I am often pretty conservative about the union types that I expose.
I wrote about these “opaque types” a bit here, but here is the idea. Say you have a type like:
```elm
type Color = Red | Blue
```
And you are pretty sure there are other colors that you will add eventually. Rather than exposing the Red and Blue constructors directly, I would do this:
```elm
module Color exposing (Color, red, blue)

type Color = Red | Blue

red : Color
red =
    Red

blue : Color
blue =
    Blue
```
And now, I can add more stuff as minor changes!
Now the only difference here is that you cannot do pattern matches on Color anymore. In most cases, that is actually better. The particular way type Color works may change to something better or more efficient. Perhaps it becomes type Color = RGB Int Int Int once I realize that there are lots of colors people want. Well, I can implement red and blue with the new internal RGB constructor, and the public API stays exactly the same. Great!
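As a sketch of that hypothetical RGB refactor: the internal representation changes completely, yet `red` and `blue` keep the same signatures, so the published API is untouched and adding `rgb` is only a minor change.

```elm
module Color exposing (Color, red, blue, rgb)

-- The representation changed, but it stays hidden behind the
-- opaque type, so existing users are unaffected.
type Color
    = RGB Int Int Int

red : Color
red =
    RGB 255 0 0

blue : Color
blue =
    RGB 0 0 255

-- New capability, exposed as a minor version change.
rgb : Int -> Int -> Int -> Color
rgb =
    RGB
```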
But maybe you really want people to be able to pattern match. Why? Is that important to what your library is about? Is it the only way? Say that you answer all those questions and say that pattern matching is needed. Well, it should be a major version bump for your users because it is going to cause all of their pattern matches to fail.
So I only glanced at the link you shared, but my instinct is that there is not a strong reason to make the type constructors public for configuration kinds of things. (As an aside, I think doing configs in records, rather than lists that can have clashing settings, is a nice path if it is possible. I wonder if it is viable in some cases if the goal is to give nicer config than raw strings.)
(mods, feel free to split this into a separate topic if you think this is hijacking the OP)
Thanks Evan - it is helpful to hear your view on best practice here and has got me thinking more deeply about my own approach to the problem.
I had considered functions and opaque types to represent the ‘configs’ of a Vega specification. Counting in their favour: pattern matching against exhaustive lists of options is not important for the API, and, as you say, they give the flexibility to extend or change the underlying hidden representation.
However, it doesn’t feel like this scales particularly well. Vega/Vega-Lite is comprehensive and highly configurable; I haven’t counted, but I estimate my Elm package maps two or three thousand properties as type constructors. Representing each of those as a separate function, each with the required overhead of a type annotation and a public API doc comment, seems like overkill. There are some savings to be made because functions can be reused between types in a way that type constructors cannot, due to name clashes, which would help. But I am not sure this outweighs the cost of what would appear to be an unwieldy API for users. Nevertheless, I will try some experiments with this approach to see what it yields.
I had considered records too, but because many properties are nested, it does create quite a complex looking API for users when constructing a typical specification. And composition of parts of a specification would be much more challenging. If the representation I was trying to model with Elm was simpler, records would have been my first choice.
One way of simplifying things would be to replace specific instances of configuration options with more generic String and Float parameters, but that undoes the very advantage of Elm’s type checking that motivated me to use Elm in the first place.
I feel a design tension being pulled by a triad of conflicting demands:
- Type checking to guide users when building specifications
- API simplicity
- Flexibility to adapt the underlying representation
It feels a little like “pick any two of the above”.
Maybe you are claiming that having docs on things makes the API more complicated though. Or that having the API be a union type is necessarily simpler, though that would imply that elm-lang/html would be simpler if everything in Html.Attributes was a single union type. So I don’t really understand that claim, but maybe I am not understanding the issue.
Do you mind giving it a shot and seeing how it goes in practice? I think the act of working through it may clarify the root concern.
Thanks for your thoughts on this, which are very helpful (and hopefully for others too).
I should have been clearer on what I meant about API simplicity. I meant the documentation of the API (for both author and reader), not the surface of the API itself. In the docs, type constructors are grouped by type and shown as a compact list of options with a single block of doc text. With sensible self-documenting names for type constructors, this is both easy to read and author. Using the function / opaque type approach, each function must be individually documented and relies on careful manual ordering of @docs comments to group by type to keep things readable.
However, I think on reflection this may be a small price to pay given the advantages of your recommended approach. So I will give it a shot, at least partially for the areas of the API where that flexibility is most needed.
As someone (possibly you) said, in Elm, more time is spent thinking carefully about API design, and less time spent by users debugging.