Write CLI scripts in Elm (IO Monad)

The API is really good, clean and simple. Two questions:

Does callJs allow for (almost) arbitrary code execution?

Do you have any plans for file I/O or http servers?

I think the module shouldn't be named "Posix" anymore. Only IO or something similar. And then "IO.Posix" could have convenience functions for working with Posix APIs.

I also think this pattern is useful in the browser. It reminds me of elm-porter or elm-procedure. The only thing I'm missing from them is some kind of running process management. But that might be a topic for another post.

I think the absence of a meaningful CLI is a crucial missing feature of Elm. Imagine you teach a class using Elm (well, I tried a few months ago; it was not much fun, even though the students had prior Haskell exposure, not least because of the absence of a CLI, so the learning curve was too steep for many).

Given it is a FE language, I think any lessons should be HTML focused, not cli focused :man_shrugging:

But I miss the opportunity to write simple scripts with Elm as well, so I'm really looking forward to this. Great idea @albertdahlin :+1::heart:

I have been using Elm for teaching FP (and visualization) for a few years now, and use litvis for literate Elm without having to confuse students with TEA or html.

One thing I've missed though is the ability for direct file IO and it looks like elm-posix does exactly the job, so really helpful.

I taught computer graphics (and will do next year as well), and it's not only FE. Sometimes I'd like to compute, say, a vector in 3d - FE has nothing to do with it, it just gets in the way.

This is really cool. I think Elm, while being frontend focused, should still have support for CLI scripts so that the community can make frontend tooling in Elm itself. That would really drive the tooling ecosystem forward.

I find the andThen pattern quite readable, and it avoids those wildcard parameters:

module HelloFile exposing (program)

import Posix.IO as IO exposing (IO, Process)
import Posix.IO.File as File
import Posix.IO.Process as Proc


program : Process -> IO ()
program process =
    case process.argv of
        [ _, inName ] ->
            -- Print to stdout if no output file provided
            File.contentsOf inName
                |> IO.exitOnError identity
                |> IO.andThen (processContent >> Proc.print)

        [ _, inName, outName ] ->
            File.contentsOf inName
                |> IO.exitOnError identity
                |> IO.andThen (processContent >> File.writeContentsTo outName)

        _ ->
            Proc.logErr "Error: Provide the names of the input file and optionally an output file.\n"


processContent : String -> String
processContent =
    -- Do something with the content of the input file here.
    String.toUpper

I have been working on something that could use this. Currently I'm using ports and Platform.worker called from a NodeJS script.

My idea for this package is to make it more ergonomic to write simple CLI scripts, for example dev tools and build pipelines. I also want to make it as open as possible to allow others to implement their own I/O in nodejs. Supporting HTTP-servers is however out of scope for now.

I'm currently working on specifying the API for the following I/O modules:

  • Read / write files.
  • File system, (move, copy, delete, list etc).
  • Http client, make http requests.
  • Spawn sub processes and run shell commands.

My plan for callJs is to simplify user specific I/O implementations and allow people to experiment.

CallJs Example

CallJs is just a wrapper for communicating with the "outside" using two ports.
To create your own project specific I/O implementation my current thinking goes along these lines:

Javascript

Create a standard nodejs module that exports an Object with all your I/O functions, e.g

js/my-functions.js

module.exports = {
    addOne: function(num) {
        this.send(num + 1); // this.send is a normal Elm port
    },
}

Elm

Then create an Elm module where you supply the Encoder / Decoder for communicating with Javascript, e.g src/MyModule.elm

addOne : Int -> IO x Int
addOne n =
    IO.callJs
        "addOne"
        [ Encode.int n
        ]
        Decode.int

program : IO.Process -> IO String ()
program _ =
    ...

Put it together

When you run or compile you can add arguments to supply your I/O implementation, something like this:

elm.cli run --extend-io js/my-functions.js src/MyModule.elm

My idea for this package is to make it more ergonomic to write simple CLI scripts, for example dev tools and build pipelines. I also want to make it as open as possible to allow others to implement their own I/O in nodejs.

Cool!

Supporting HTTP-servers is however out of scope for now.

Fair

Files API

While the API for reading and writing a whole file at once can work for simple cases, a streaming API is definitely something one eventually wants. On the other hand, the 1.0 API you published has the obvious problem (that I guess you're aware of) with respect to writing to closed files.

Ideas:
One could imagine a withFile : Filename -> (FD x -> IO e a) -> IO e a API that automatically closes the file at the end, BUT the problem is that you would be able to "smuggle out" the file descriptor via return.
What if we don't expose the file descriptor, but instead withFile : Filename -> ({read : Int -> IO e String, write : String -> IO e ()} -> IO e a) -> IO e a? Again, you can just return the read and andThen it out, so doesn't work.
We could force a to be () and then one wouldn't be able to smuggle out anything, but you could only use that for writing, not for reading... not great.
We could have an opaque type FileIO e a = FileIO (IO e a). Then withFile : Filename -> FileIO e a -> IO e a and then read : FileIO Err String, write : String -> FileIO Err () + monad operations for FileIO but then you can't allow nesting withFile calls because we either get back to the smuggling problem or you have an API where only the inner file can be read/written to inside the nested call. Not flexible enough I think.
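
The smuggling problem from the first idea is not specific to Elm. Here is a minimal Node.js sketch, assuming a hypothetical `withFile` helper (not part of elm-posix) that opens a file, runs a callback and always closes the descriptor afterwards. The callback simply returns the descriptor, and a later write through it fails because the file is already closed:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: opens `filename`, runs `use(fd)` and always closes the fd.
function withFile(filename, use) {
    const fd = fs.openSync(filename, 'w');
    try {
        return use(fd);
    } finally {
        fs.closeSync(fd);
    }
}

// Writing inside the callback works, but the callback "smuggles out" the fd.
function smuggleDescriptor(filename) {
    return withFile(filename, (fd) => {
        fs.writeSync(fd, 'inside\n');
        return fd;
    });
}

// Returns the error code raised when writing through the escaped descriptor,
// or null if the write unexpectedly succeeds.
function writeAfterClose(filename) {
    const escapedFd = smuggleDescriptor(filename);
    try {
        fs.writeSync(escapedFd, 'outside\n');
        return null;
    } catch (err) {
        return err.code;
    }
}

const tmpFile = path.join(os.tmpdir(), `with-file-sketch-${process.pid}.txt`);
const code = writeAfterClose(tmpFile);
console.log(code);
fs.unlinkSync(tmpFile);
```

Nothing in the signature of `withFile` stops the callback from returning the descriptor, which is exactly the gap a type-level solution would have to close.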

Uhm. You could restrict the return to be an IO Value where Value is the Json.Encode one but... meh, it's unclean.

I'll think a bit more about this.

CallJs

Javascript

Why not have the functions return the value (instead of calling send)? You avoid both double-send and no-send that way
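
A minimal sketch of that suggestion, assuming a hypothetical `dispatch` helper on the JavaScript side (the actual elm-posix runtime may look quite different). The runtime calls the function and forwards its return value itself, so every call produces exactly one reply:

```javascript
// User-supplied I/O functions simply return their result.
const myFunctions = {
    addOne: function (num) {
        return num + 1;
    },
};

// Hypothetical runtime helper: looks up the function, calls it and
// forwards the result through `send` exactly once.
function dispatch(fns, name, args, send) {
    const fn = fns[name];
    if (typeof fn !== 'function') {
        throw new Error(`IO Function "${name}" not implemented.`);
    }
    send(fn.apply(null, args));
}

// Stand-in for the Elm port: collects everything that is sent.
const sent = [];
dispatch(myFunctions, 'addOne', [41], (value) => sent.push(value));
console.log(sent);
```

Since the user code never touches `send`, it can neither call it twice nor forget to call it.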

Elm

Great!

Put it together

Simple, I like it!

Files API

I think the ideas you are writing about are interesting, thank you.

I want to be pragmatic in the design of the API. The "gut feelings" I have been basing my decisions on are something like this:

  • 90% of the use cases for this package would be satisfied with just reading / writing whole files at once.
  • Writing to closed file descriptors is just one of the many problems you can have and need to watch out for. I am afraid that solving this without something like linear types would make the API feel very complex. But of course I am open to suggestions and I will try to think along the lines of the ideas you are describing.

Call JS

I had that in the beginning but then I wanted to do async stuff, like

sleep: function(delay) {
    setTimeout(this.send, delay);
}

so I went with the more flexible approach of using callbacks. My thinking was that increased flexibility outweighs the increased risk. I'm open to other opinions or suggestions here though, maybe it is better to revert to returning values.
Is it maybe a better idea to have the functions return either a value or a Promise to allow both styles?
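
One way to support both styles, sketched here with a hypothetical `dispatchAsync` helper (an assumption for illustration, not the package's actual runtime): passing the result through `Promise.resolve` treats plain values and Promises uniformly, and the runtime still sends exactly once.

```javascript
const myFunctions = {
    // Synchronous style: return the value directly.
    addOne: function (num) {
        return num + 1;
    },
    // Asynchronous style: return a Promise that resolves after `delay` ms.
    sleep: function (delay) {
        return new Promise((resolve) => setTimeout(() => resolve(null), delay));
    },
};

// Hypothetical runtime helper. Promise.resolve passes a Promise through
// and wraps a plain value in an already resolved Promise.
function dispatchAsync(fns, name, args, send) {
    return Promise.resolve(fns[name].apply(null, args)).then(send);
}

// Stand-in for the Elm port: collects everything that is sent.
const replies = [];
const done = dispatchAsync(myFunctions, 'addOne', [1], (v) => replies.push(v))
    .then(() => dispatchAsync(myFunctions, 'sleep', [10], (v) => replies.push(v)))
    .then(() => console.log(replies));
```

The `sleep` example from above then becomes a function that returns a Promise instead of calling `this.send` from a timer.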

Files API

Yupp

I mean, most errors are not avoidable at the API level: write errors, permission errors, full filesystem, etc etc etc, can and need to be managed via Result/IO err, as you already do.

Writing to a closed fd feels like one of those problems that are fixable with a Strict Enough API but:

  1. not sure it actually exists in the Elm typesystem
  2. not sure, if it exists, if it is nice enough to be worth using over a trivial withFile : (FD -> IO e a) -> IO e a
  3. not sure, if it exists, if it's flexible enough for all use cases

Something something records keeping tracks of open files via type level shenanigans, elm-css-ish style.

I think allowing either a value or a Promise should be flexible enough yet clean and avoid the issue.

Hey @albertdahlin,

I finally got to try out your elm-posix. I tried the HelloUser example and ran into two problems. The first was that when I tried to run it, I got:

You found a bug!
Please report at https://github.com:albertdahlin/elm-posix/issues

Copy the information below into the issue:

IO Function "sleep" not implemented.

and the second one was when I tried to make and run the compiled version, which gave me:

test.js:3518
    fn.apply(app.ports.recv, msg.args);
       ^

TypeError: Cannot read property 'apply' of undefined
    at Array.<anonymous> (/Users/tomas.latall/test.js:3518:8)
    at Function.f (/Users/tomas.latall/test.js:2228:19)
    at A3 (/Users/tomas.latall/test.js:68:28)
    at Object.b (/Users/tomas.latall/test.js:1987:7)

Any hints on what I might have missed?

It looks like the Elm package and the npm package are different versions. Try updating both; the latest version is 1.0.2.

Thanks to everyone who has provided feedback and ideas so far. This inspired me to work on an improved version of this package. It now includes:

  • Read / write file contents.
  • File system operations like copy, mkdir etc.
  • Exec shell commands and spawn child processes.
  • Fetch data using HTTP with elm/http tasks.
  • Stream API with composable pipes.
  • API to simplify implementing your own I/O in javascript.

This is more or less the feature set I have planned to include for now. Is there anything crucial I have missed?

The next step is to implement everything, polish the API and improve the documentation. Feedback and ideas to improve the API are very welcome and appreciated.

An idea rather than a proposal: why not move everything to IO instead of Posix.IO?

  • The Task interop API is good.
  • I guess loops in IO are outside the scope of this package? (One can andThen recursively, but I guess it would run out of stack relatively quickly. Not even sure what an API for a loop would look like though to be honest).
  • The WriteMode is So Much Better than the 'a/w/w+/zomg' API
  • I'm not in love with write_ being able to return a ReadError, API-wise, but I think I can see why it makes sense to avoid complicating types
  • ToManyFilesOpen → TooManyFilesOpen
  • The example for openWriteStream has maybe the wrong type?
  • exec: I can see why you called it this, but exec has a specific meaning in Posix (replace currently running code with given application), this is more like a system. You could call it spawn?
  • kill: maybe also allow sending signals other than SIGTERM/SIGKILL?
  • send: is this a "write into program stdin" function? If so, the doc is not completely clear
  • bytes: please specify the encoding used
  • gzip/gunzip: strongly consider Stream Bytes Bytes instead, it could be compressed binary data
  • read: "size represents different things"... uhm... I'm not in love with this. I can see it making sense, but I don't love it. I guess adding the "size type" to Stream would make all the types longer and more annoying but still... :thinking:
  • write: the documentation is incomplete, it can also write bytes
  • run "is" → "are"
  • pipeTo: I really like this API. I really love the whole Streams API in general tbh.
  • I think you could also expose some toStream : (a -> b) -> Stream a b?
  • It would be nice to have read_ and write_ with typed errors I think. In particular, what if I want to read until EOF, but not use run?

Thank you very much for the feedback. This was really helpful.

Fixed above

I have added read_ and write_ to the stream module. I will continue to work through the Error types.

I tested a recursive forever loop with andThen in examples/src/Forever.elm and it does not blow up the stack. There are surely cases where the stack will blow up, but I would say this is good enough for now.

Maybe this is better?

type ReadError
    = CouldNotOpenForRead OpenError
    | ...

type WriteError
    = CouldNotOpenForWrite OpenError
    | ...

The current Error type made more sense when I had openReadWrite.

  • The names exec, execFile and spawn are taken directly from the nodejs child process module
  • I have added all signals as a union type. I guess we don't need all of them?
  • I have removed the send and receive functions and have spawn returning streams for stdIO instead.

I agree, I'm not thrilled about this solution either. Maybe we can do better?
How about something like this?


read : Stream x output -> IO String output
chunkBytes : Int -> Stream Bytes Bytes
chunkString : Int -> Stream String String

{-| Will read at most 10 bytes from stdIn each time. Might read less
if source stream is exhausted (EOF is reached). -}
read10BytesFromStdIn : IO String Bytes
read10BytesFromStdIn =
    stdIn
        |> pipeTo (chunkBytes 10)
        |> read

{-| Since no "chunker" is applied this will read until EOF. -}
readEverythingAtOnce : IO String Bytes
readEverythingAtOnce =
    stdIn
        |> read
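
To make the "at most 10 bytes, possibly less at EOF" behaviour concrete, here is a small sketch in plain JavaScript. The `chunkBytes` generator only mirrors the name from the proposal above; it is a hypothetical stand-in working on an in-memory Buffer, not the package's implementation.

```javascript
// Yields consecutive slices of at most `size` bytes from `source` (a Buffer).
// The final slice may be shorter if the source is exhausted first.
function* chunkBytes(size, source) {
    if (!Number.isInteger(size) || size <= 0) {
        throw new RangeError('size must be a positive integer');
    }
    for (let offset = 0; offset < source.length; offset += size) {
        yield source.subarray(offset, offset + size);
    }
}

const input = Buffer.from('abcdefghijklmnopqrstuvwxy'); // 25 bytes
const lengths = Array.from(chunkBytes(10, input), (chunk) => chunk.length);
console.log(lengths);
```

With 25 bytes of input and a chunk size of 10, the last chunk holds only the 5 remaining bytes.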

I think this is a better API

I think this is a bit more involved since it would require handling buffer or generator states in most cases so I decided to skip it for now.
Consider these examples:

naturalNumbers : Stream Never Int
tuple : Stream a ( a, a )
split : Stream ( a, a ) a
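
The state these examples need can be sketched with JavaScript generators. The names mirror the signatures above, but the code is only an illustration under that assumption: `naturalNumbers` carries a counter (starting at 0 here), `tuple` has to buffer one element until its partner arrives, and `split` emits two elements for every input.

```javascript
// Source with generator state: remembers the next number to emit.
function* naturalNumbers() {
    let n = 0;
    while (true) {
        yield n++;
    }
}

// Buffers one element until a second arrives, then emits the pair.
// A trailing unpaired element is dropped in this sketch.
function* tuple(source) {
    let pending;
    let hasPending = false;
    for (const item of source) {
        if (hasPending) {
            yield [pending, item];
            hasPending = false;
        } else {
            pending = item;
            hasPending = true;
        }
    }
}

// Emits both halves of each incoming pair, one element at a time.
function* split(source) {
    for (const [first, second] of source) {
        yield first;
        yield second;
    }
}

// Demo helper: collect the first `count` items of an iterable.
function take(count, source) {
    const result = [];
    if (count <= 0) return result;
    for (const item of source) {
        result.push(item);
        if (result.length >= count) break;
    }
    return result;
}

const pairs = take(2, tuple(naturalNumbers()));
const flat = take(4, split(tuple(naturalNumbers())));
console.log(pairs, flat);
```

A plain `(a -> b)` function has no place to keep the counter or the pending element, which is why a general `toStream` would not cover these cases.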

I will consider this when I'm done specifying the API.
