Evolving the API of elm-bigint

In the process of updating elm-bigint for 0.19, there seem to be obvious areas to improve the current API. I’d like to tap into the community for some feedback and ideas before I go on with these changes.

In the current API, there are two main functions which can produce a BigInt:

fromInt : Int -> BigInt

fromString : String -> Maybe BigInt

fromString accepts stringy ints as well as stringy hex, e.g.

fromString "39456780218509202943875"
fromString "-1234"
fromString "0x1ef8deba7"
fromString "-0x452dcba3"

There are also two functions to get you out of BigInt land:

toString : BigInt -> String

toHexString : BigInt -> String

The main problem is the disparity between how hex strings are consumed and how they are produced.

BigInt.fromString "0xff" |> Maybe.map BigInt.toHexString == Just "ff"

The two questions I’d like to pose the community are:

  1. How do you feel about the "0x"? We should make fromString and toHexString consistent, but which way do we go?
  2. Does it feel cleaner to remove fromString and split it into fromIntString and fromHexString?

Personal thoughts:

  1. I like the "0x" on the front of hex strings, much like the "0b" which denotes binary. It makes it explicit what you’re dealing with.
    On the other hand, it isn’t consistent with other Elm libraries that deal with hex, like rtfeldman/elm-hex, where "0x" is invalid because that library was aimed at producing hex for elm-css. Lastly, if we remove "0x", then a fromHexString will be absolutely necessary, as fromString "123" is both a valid hex string and a valid int string.
  2. Regardless of the choice made above, splitting fromString seems like the right thing to do. It separates concerns, makes the API more explicit, and by extension makes documentation easier. (I didn’t even know elm-bigint could handle hex strings till I looked at the source code several months into using it.)
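
To make the ambiguity in point 1 concrete, here is a sketch using the fromIntString/fromHexString names from question 2 (neither exists in the current API): without a prefix, the same string is valid in both bases but denotes different numbers.

-- Hypothetical split names; neither is in the published package.
fromIntString "123" |> Maybe.map BigInt.toString
--> Just "123"

fromHexString "123" |> Maybe.map BigInt.toString
--> Just "291"   -- 0x123 == 1*256 + 2*16 + 3 == 291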

Appreciate any thoughts!

Intuition/First Idea:

  • Use fromStringBase16 to convert from a string with base 16.
  • Use fromStringBase10 to convert from a string with base 10.

As a consumer of such a library, I would start with the assumption of symmetry between the functions to convert from string and to string. Therefore I would go with (at least also) offering symmetric conversion functions.

There is a function to convert from String to Int in core. Because of this context, when I look at a symbol like fromString, I am tempted to expect it to behave consistently with the core conversion function. Testing String.toInt in Ellie, it seems to return Nothing for a string like "0x1ef8deba7".

There could be a function for those cases:

fromStringWithPopularPrefix : String -> Maybe BigInt
fromStringWithPopularPrefix string =
    if string |> String.startsWith "0x" then
        string |> String.dropLeft 2 |> fromStringBase16
    else if string |> String.startsWith "0b" then
        string |> String.dropLeft 2 |> fromStringBase2
    else
        string |> fromStringBase10

This is just first intuition, before the first coffee, there might be better symbols to discover.

Here is my opinion on your questions:

  1. It makes more sense to not return it from toHexString since it’s easier to tuck it on if you want it than to remove it (or at least more obvious what you do). Not having “0x” is more general, in that way. So I’d say skip the prefix, and have the user of the package handle it. They will have to pre- and post-process the strings anyway to match with their exact use case.
  2. Looking at my answer to 1., yes there will have to be multiple functions.

I would like fromString to continue to be flexible, accepting "0xfacebead" for hex.

There should be fromHexString as a mirror of toHexString. fromHexString would NOT accept any prefix, so fromString "0xfacebead" would return the same number as fromHexString "facebead", and toHexString applied to that number would yield "facebead".
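
A minimal sketch of how such a mirror could be built on top of the existing prefix-accepting fromString (this helper is a proposal, not part of the current API):

fromHexString : String -> Maybe BigInt
fromHexString s =
    if String.startsWith "-" s then
        BigInt.fromString ("-0x" ++ String.dropLeft 1 s)

    else
        BigInt.fromString ("0x" ++ s)

With that definition, fromHexString "facebead" |> Maybe.map toHexString should round-trip to Just "facebead".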

But I don’t feel strongly about it.

Interesting idea. A little redundancy in the API, but a nice flexibility as well.

The question is, what is the use case for a function that accepts multiple forms of strings? Is it so that we can handle unfiltered user input?

In an actual program, surely the programmer knows (or should know) which form their strings are?


@norpan, that’s more or less my opinion on the matter as well, and what I am leaning towards with this API:

fromIntString "1234"
fromHexString "4bed"
fromBinaryString "110101"

toIntString ...
toHexString ...
toBinaryString ...


fromStringBase10 "1234"
fromStringBase16 "4bed"
fromStringBase2 "110101"

toStringBase10 ...
toStringBase16 ...
toStringBase2 ...

I think the first API is more pleasant and casual, but the second API is more precise and unambiguous. I’m leaning towards the first, as it feels more “elmish”. Thoughts?

Edit - Also considering moving from Maybe to Result

What about changing fromIntString to fromDecimalString? That’s as precise as the numerical one, but also kinda nicer.

Hmm. Though “decimal” might suggest that fractions are allowed, while something like BigInt.fromString "12.42" is invalid, since only integers are accepted.

In Elm 0.18, String.toInt returned Result String Int. In 0.19, it returns Maybe Int. I think it makes sense for the elm-bigint package to follow suit.


You could provide both the mnemonic names like {to/from} + {Hex/Binary/Decimal} + String, as well as functions like

toStringBase : Int -> BigNum -> String
fromStringBase : Int -> String -> BigNum

Where the caller can supply any base (from 2-36). I imagine that there would be at least some use for en/decoding octal. The other bases are not likely to come up very often, but if someone needs them then this API caters to them. fromHexString and friends could be implemented simply as fromHexString = fromStringBase 16.
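
A rough sketch of what such a base-parameterised parser could look like, folding over the digits left to right (the name, the Maybe return, and writing BigInt for the quoted BigNum are all assumptions here; an out-of-range base has to be rejected somehow):

fromStringBase : Int -> String -> Maybe BigInt
fromStringBase base str =
    if base < 2 || base > 36 || String.isEmpty str then
        Nothing

    else
        let
            -- Map '0'..'9' to 0..9 and 'a'..'z' (case-insensitive) to 10..35.
            digitValue c =
                let
                    code =
                        Char.toCode (Char.toLower c)
                in
                if 0x30 <= code && code <= 0x39 then
                    Just (code - 0x30)

                else if 0x61 <= code && code <= 0x7A then
                    Just (code - 0x61 + 10)

                else
                    Nothing

            -- acc = Nothing once any digit has been invalid for this base.
            step c acc =
                Maybe.andThen
                    (\n ->
                        digitValue c
                            |> Maybe.andThen
                                (\d ->
                                    if d < base then
                                        Just (BigInt.add (BigInt.mul n (BigInt.fromInt base)) (BigInt.fromInt d))

                                    else
                                        Nothing
                                )
                    )
                    acc
        in
        String.foldl step (Just (BigInt.fromInt 0)) str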

Tricky part here is how you handle fromStringBase 777. You’d have to throw some Maybe's into the return :-\

You could use a custom type instead to avoid the problem above, though I would question the need to implement everything from base3 to base36, at least until it’s become apparent that people desire this functionality, as it would require a fair bit more work on the library to support all those bases.

One way to make this API safer is to let the consumer specify the list of digit values instead. To model the constraint that a character can only be used once as a digit value, this list could be made a set and the order be inferred by convention in form of a popular character encoding (such as ASCII):

fromBinaryString : String -> Maybe BigNum
fromBinaryString = fromStringWithCustomBase (['0', '1'] |> List.map Char.toCode |> Set.fromList)

fromStringWithCustomBase : Set.Set Int -> String -> Maybe BigNum

Quite confused about this approach. It seems to have the same exhaustiveness issues as the Int version, e.g. fromStringWithCustomBase Set.empty "what happens now?"

Taking all the suggestions into consideration, I feel the API below is quite nice. It includes commonly used functions, but allows extensibility and “low level access”.

-- Commonly used

fromIntString : String -> Maybe BigInt
fromIntString = fromStringWithBase Base10

fromHexString : String -> Maybe BigInt
fromHexString = fromStringWithBase Base16

-- Low Level

type Base
    = Base2
    | Base10
    | Base16

fromStringWithBase : Base -> String -> Maybe BigInt
fromStringWithBase base str =
    case base of
        .... ->
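
The elided case expression could be filled in by mapping each constructor to a radix and sharing one parser; parseInRadix here is a hypothetical helper along the lines of the fromStringBase idea above:

fromStringWithBase : Base -> String -> Maybe BigInt
fromStringWithBase base str =
    let
        radix =
            case base of
                Base2 ->
                    2

                Base10 ->
                    10

                Base16 ->
                    16
    in
    -- parseInRadix is a hypothetical shared digit parser, e.g. fromStringBase radix
    parseInRadix radix str

A nice property of the custom type is that an invalid radix like 777 is unrepresentable, so the Maybe only has to account for bad input strings.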

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.