Text Editing Part 3/3 Keyboard Input and Select/Copy/Paste

This is the final part of this 3-part mini-series on text editing in Elm.

Now I want to figure out the best way of capturing input. So far, I have captured keyboard events with onKeydown and processed those in Elm. That actually seems to work very well for my standard layout GB-English keyboard, and as an English-only speaker, I must admit a gross level of ignorance about IMEs, non-English keyboards and so on. Full Unicode support, for example, is probably too big a topic for this spike - getting the basics working is really what I am aiming for here.

I think there are 3 ways. One is to handle all keyboard events in Elm. Two is to have a hidden textarea and capture the user input via the browser. Three is to use a contenteditable and capture the user input via the browser. Since I am borrowing lessons learned from CodeMirror (https://codemirror.net/doc/internals.html), I will try approach two first.

That said, if anyone has some thoughts or opinions on the best way to go about this, I would very much appreciate hearing them.

Aims of this spike are:

  • Capture and display all keyboard input to the editing area.
  • Allow text selection with the mouse by click and drag.
  • Allow text selection with Shift+Arrow keys.
  • Copy, Cut and Paste via the clipboard.
  • Try to do it in pure Elm if possible, so that the complete code can be published as a package. Elm ports or custom elements only if it is really necessary.

The code will be continuously published to:

Code here:

I’d suggest turning the pitfalls listed in this brilliant Medium article into a checklist – make sure that you avoid the ones you care about.

Of specific note, if you are pursuing the hidden textarea approach, you risk:

How Do I Type Using My iPad? – When I tap the editor, the keyboard does not appear.

Alt+Left/Right Arrow Should Jump Over One Word – This text ‘ພາສາຈີນແມ່ນພາສາໜຶ່ງທີ່ເວົ້າໃນປະເທດຈີນ.’ contains many words, but your editor handles it as one word.

Undoing by Shaking Does Not Work on iPhone

Keyboard Is Hidden When I Do a Selection on iPhone/iPad… Focus moves from the hidden textarea to the document…

Spell Checking Doesn’t Work

I Can’t Use Your Editor with a Keyboard and a Screen Reader – accessibility fail

I recently tested figma.com, which uses the hidden textarea, and they hit every pitfall on the list.


So, that’s about your choice of input mechanism. The next hell is what you’re using to understand inputs.

Keypresses rapidly break down once IMEs come on the scene – on a Mac, type Option-backtick and then e and you get è, which will never pass through the keypress world. The beforeinput and input events are better, but the sequence of events is pretty wild for the Mac sequence I gave above. beforeinput is nice because it’s cancelable, so you can see the user intent and then replace it with your own operation. But MDN says Firefox doesn’t support beforeinput, although quick hackery suggests maybe it does now? There’s also the question of how old a browser you want to support. Etc.

The final fallback is mutation observers, which you may need to make sense of something that isn’t accurately captured by input events alone.

Thanks for the helpful feedback. Unlike the previous spikes on scrolling and buffer implementation, I think I will be spending more time researching this one before getting stuck into the code, so it’s super helpful to get some opinions.

Are you referring specifically to option 1, handling keyboard events in Elm here? Or do you think this also applies to 2 and 3?

I think we discussed briefly on Slack, and you were in favour of using contenteditable? CodeMirror does also use contenteditable and it’s a config option (https://codemirror.net/doc/manual.html - Search for the ‘inputStyle’ option).

I think contenteditable would need to be wrapped as a custom element to work with Elm? Since it changes the DOM, it would not play nicely with virtual-dom otherwise. But that isn’t such a huge problem, https://package.elm-lang.org/packages/mweiss/elm-rte-toolkit/latest/, is for contenteditable and it just tells you in the README where to get the npm package for the implementation.

Are you referring specifically to option 1, handling keyboard events in Elm here? Or do you think this also applies to 2 and 3?

I think this applies regardless – if you want to get the benefits of using contenteditable, you want to get away from keypress events as much as you can, for all the reasons outlined in the Medium post. Capturing user intent is what beforeinput and input get you – check out https://rawgit.com/w3c/input-events/v1/index.html#interface-InputEvent-Attributes

I think we discussed briefly on Slack, and you were in favour of using contenteditable? CodeMirror does also use contenteditable and it’s a config option (https://codemirror.net/doc/manual.html - Search for the ‘inputStyle’ option).

I think using contenteditable directly lets you leverage the most native capability from the browser. If you’re at Google writing the next Google Docs, do whatever you want, you have ~infinite resources, otherwise use contenteditable :slight_smile:

I think contenteditable would need to be wrapped as a custom element to work with Elm? Since it changes the DOM, it would not play nicely with virtual-dom otherwise. But that isn’t such a huge problem, https://package.elm-lang.org/packages/mweiss/elm-rte-toolkit/latest/, is for contenteditable and it just tells you in the README where to get the npm package for the implementation.

Yep, that is the drawback of contenteditable from Elm.

I think if I am going the contenteditable route, it might be easiest to build on top of mweiss/elm-rte-toolkit, since that has already solved a lot of things.

I did start out by trying to use it, but then I wanted a gap buffer editor model, and that did not align with how its content model works. With the elm-rte-toolkit you don’t write elm/html directly, you build ElementNodes and write pairs of functions to implement an isomorphic mapping between the html and the document model that it uses. That is the reason for elm-rte-toolkit having its own ElementNode way of writing html - Html.Html is opaque so you can build it but not query it.

So I will need 2 levels of indirection to build the view: my gap buffer, and the elm-rte-toolkit model (which itself is 2 levels of indirection, so really 3 overall). I was worried that would be inefficient, but with the virtual scrolling in place, perhaps that will not be an issue.

Seems worth a try.


Just keeping this link here, it’s a CodePen for CodeMirror. Useful for trying out contenteditable mode to get some insights into how that works.

So I have played around a bit with trying to do this through the elm-rte-toolkit but I cannot see how to get it to work.

With the toolkit, you define an editor model as a RichText.Model.Node.Block which may contain further blocks or inlines to form a structured document. You then define functions to translate this document to and from HTML described using RichText.Model.HtmlNode.HtmlNode. So there is a relationship like this:

Block <--> HtmlNode

The problem is that I have my GapBuffer model, which does not fit into this model. I can write a function to turn a GapBuffer into a Block, but there is no way to hook into the update cycle and feed changes in the Block back into the GapBuffer. It appears the Block model is intended to capture the entire document, not just a window into the GapBuffer, and the way it does this does not leave room to do so in an efficient way like I did with the GapBuffer.

I have:

GapBuffer --> Block <--> HtmlNode

But what I need is something more like:

GapBuffer <--> HtmlNode

So there may still be some code I can borrow from the elm-rte-toolkit. Specifically the HtmlNode rendering, and the code it uses to update the HtmlNode model from the HTML as the user edits it. But it is not going to be as simple as just implementing the editor on top of the toolkit.

Or perhaps I need to aim for a tighter integration between the GapBuffer model and the contenteditable without the intermediate HtmlNode level.

I’m not totally convinced this is true or the reason a custom element is needed. I see with the elm-rte-toolkit that although a custom element is used, the elements inside it are rendered by an ordinary Elm view function. I conclude they are managed by the virtual DOM, but also editable. Perhaps it is just that the edits update the model, and generate a matching view to the edits, so the virtual DOM diffing works out correctly anyway?

The custom element in elm-rte-toolkit really just exists to hook up some event handlers in order to expose them to the Elm program. Looking at its constructor gives a good overview of what it is setting up:

    constructor() {
        // A custom element constructor must call super() before touching `this`.
        super();
        // Bind the callbacks so `this` is the element when they fire.
        this.mutationObserverCallback = this.mutationObserverCallback.bind(this);
        this.pasteCallback = this.pasteCallback.bind(this);
        // Watch for DOM edits made by the browser, plus paste and IME composition.
        this._observer = new MutationObserver(this.mutationObserverCallback);
        this.addEventListener("paste", this.pasteCallback);
        this.addEventListener("compositionstart", this.compositionStart.bind(this));
        this.addEventListener("compositionend", this.compositionEnd.bind(this));
        this.dispatchInit = this.dispatchInit.bind(this);
    }


This is a big help actually. I can likely borrow this custom element but re-write the Elm side logic to better fit my aims.

Interesting. Does elm-rte-toolkit intercept inputs (by canceling beforeinput events) before they happen and apply them internally? That would solve the problem of contenteditable DOM changing out from under Elm, although it would break with composition events from IMEs which aren’t cancelable.

It would appear to be able to, yes:

But I also know it listens to MutationObserver events:

I think the strategy for managing the DOM is to listen for mutations, then update the internal model to match how the DOM is being edited. Then when the virtual DOM diffing runs, it should find that the view and the DOM are already aligned.

I’m not totally clear on the details yet, it’s a big package.

But it only does preventDefault when the beforeinput event is actually processed by it, which by default it does not.

You can build a command map to customize behaviour, here is an example from the docs:

    |> set
        [ inputEvent "insertLineBreak", key [ shift, enter ], key [ shift, return ] ]
        [ ( "insertLineBreak", transform insertLineBreak ) ]

So this will intercept Shift+Enter, but all other beforeinput events will fall through to the browser handling.


For my text editor, perhaps all I need to do is to intercept the input events that can lead to structural DOM changes, namely Enter, Delete and Backspace, since those can add or remove lines. Everything else is just character data modification on the line.

The full set of input types includes a number of other things that can modify the DOM – lots of format* operations, for instance, which you get by hitting, e.g., Cmd-I to italicize text. But you could intercept and cancel most of those in favor of your own modifications, I expect.

For a code editor, I would want to just ignore formatting - since there will be code highlighting doing that automatically.
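To make that split concrete, here is a minimal sketch of how a code editor might sort beforeinput events by their `inputType`. The helper name `classifyInputType` and the three outcome labels are made up for illustration; only the `inputType` strings come from the Input Events spec. It is an assumption that these four types cover every line-adding or line-removing edit you care about.

```javascript
// Hypothetical helper, not part of elm-rte-toolkit or any library.
// The inputType strings are the ones defined by the W3C Input Events spec.
const STRUCTURAL_INPUT_TYPES = new Set([
  "insertParagraph",       // Enter
  "insertLineBreak",       // Shift+Enter
  "deleteContentBackward", // Backspace
  "deleteContentForward",  // Delete
]);

function classifyInputType(inputType) {
  // Rich text formatting (formatBold, formatItalic, ...) has no place in a
  // code editor, so it is simply cancelled.
  if (inputType.startsWith("format")) return "cancel";
  // These can add or remove lines, so the editor applies them to its own model.
  if (STRUCTURAL_INPUT_TYPES.has(inputType)) return "handle";
  // Everything else is treated as a character edit within a line.
  return "passthrough";
}

// Browser wiring, shown as a comment because it needs a real DOM:
// editor.addEventListener("beforeinput", (e) => {
//   if (classifyInputType(e.inputType) !== "passthrough") e.preventDefault();
// });
```

Bear in mind that beforeinput events fired during IME composition are not cancelable, so a classifier like this can only ever be part of the answer.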

This seems to be a list of all the different kinds of input events:

That makes sense. You probably care about paste, too.

Indeed. The custom element code from elm-rte-toolkit is doing:

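  pasteCallback(e) {
    // Prefer the event's clipboard data, falling back to the legacy global.
    const clipboardData = e.clipboardData || window.clipboardData;
    const text = clipboardData.getData('text') || "";
    const html = clipboardData.getData('text/html') || "";
    // Re-package both flavours of the paste in a custom event.
    const newEvent = new CustomEvent("pastewithdata", {
      detail: {
        text: text,
        html: html
      }
    });
    // The excerpt was cut off above; dispatching the event is an assumption.
    this.dispatchEvent(newEvent);
  }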

Which will give me the value to paste as text, and I am only interested in pasting text. That seems not hard to handle.
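As a rough sketch of the text-only handling, the pasted string could be split into editor lines before it is inserted into the buffer. The function name `pastedTextToLines` is hypothetical, and it assumes the buffer wants one string per line with no line terminators.

```javascript
// Hypothetical helper: turn the plain-text half of a paste into editor lines.
// In the browser the input would be something like `event.detail.text` from
// the "pastewithdata" custom event.
function pastedTextToLines(text) {
  // Normalise Windows (\r\n) and bare carriage return (\r) line endings to \n,
  // then split into one string per line.
  return text.replace(/\r\n?/g, "\n").split("\n");
}
```

A single-line paste comes back as a one-element array, so the caller can treat "insert within the current line" and "insert several lines" uniformly.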

I think what is going to be harder is selection. If the user selects some text, then pulls the mouse off the bottom of the editor area, it will scroll down, expanding the selection over a region that is bigger than what is visible. But I have virtual scrolling, so I don’t think the browser’s understanding of the selection will align with what I want. No idea yet how to handle this; perhaps it might even be better to not use the browser selection at all.

Right, found what I couldn’t nail down earlier. Firefox doesn’t support beforeinput yet. You can watch it fail on https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/beforeinput_event

This Slate bug has a nice set of pointers to underlying Firefox issues. Looks like it’s implemented but hidden behind a flag still. https://github.com/ianstormtaylor/slate/issues/3185


At the moment, I am calling preventDefault on certain key presses in the “keydown” handler. That seems to stop them from becoming input events. So long as I can rely on this for keyboard inputs like Ctrl+I, which would otherwise produce formatting events, I won’t actually need to intercept “beforeinput” for those.

So the consequence of running it on Firefox should just be that IME doesn’t work - but that’s the browser’s fault and not something I can fix anyway.

So now I have run into the classic problem of making changes to a contenteditable from code resulting in the caret not knowing where it should be, so jumping back to the start of the line.

This can be solved with JavaScript that is specifically wrapped around the updates to the element’s text content:

The problem in Elm is that the updates to the text happen in the virtual DOM code, so I cannot modify its behaviour to save and restore the caret position like this.

To work around this I had to do something a bit icky. Essentially, I capture the mutation event, figure out from that what the user’s editing intention was, and insert the text in the right place in the Elm-side Model. When the view is re-rendered, the line is replaced entirely with the correct text. The contenteditable caret is made invisible and my own custom caret is rendered in Elm. It does not really matter where on the line the browser’s caret is, so long as it is on the right line.
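One way to recover the editing intention from a mutation is to compare the line text before and after the change. This is only a sketch of that idea, not the code from my editor: `inferEdit` is a made-up name, and it assumes the mutation touched a single contiguous span of one line.

```javascript
// Hypothetical helper: given a line before and after a browser edit, work out
// where the edit happened, what was removed and what was inserted.
function inferEdit(oldLine, newLine) {
  // Length of the common prefix.
  const maxPrefix = Math.min(oldLine.length, newLine.length);
  let start = 0;
  while (start < maxPrefix && oldLine[start] === newLine[start]) {
    start++;
  }

  // Trim the common suffix, without running back past the prefix.
  let oldEnd = oldLine.length;
  let newEnd = newLine.length;
  while (oldEnd > start && newEnd > start && oldLine[oldEnd - 1] === newLine[newEnd - 1]) {
    oldEnd--;
    newEnd--;
  }

  return {
    start: start,                              // offset of the edit in UTF-16 code units
    removed: oldLine.slice(start, oldEnd),     // text the edit deleted
    inserted: newLine.slice(start, newEnd),    // text the edit added
  };
}
```

The result is ambiguous when the inserted text repeats its neighbours (typing another "a" into "aa" could have happened at any of three positions), so the reported offset is only a best guess unless it is cross-checked against the selection.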

I also ensure that when edits happen on a line, the whole line is forced to re-render. This is done with Html.Keyed by bumping a counter on each edit, so that a unique key can be set against the edited line. Since the key changes on each edit, the virtual DOM forces a redraw of the line. This helps when syntax highlighting is triggered, since that makes structural changes to the DOM in the view code.

I’m definitely getting into brittle code territory with this, so I don’t feel great about it. Am I relying on specifics of the virtual DOM implementation? Is this going to work nicely on all browsers? Can’t really see another way though.


In that Stack Overflow question, the code to maintain the caret position whilst making changes to the text looks like:

    // 1. Remember where the caret is.
    var position = getCaretCharacterOffsetWithin(input.get(0));
    // 2. Rewrite the element's contents with the highlighted markup.
    var text = input.text();
    text = text.replace(new RegExp('\\btest\\b', 'ig'), '<span style="background-color: yellow">test</span>');
    input.html(text);
    // 3. Put the caret back where it was.
    setCaretPosition(input.get(0), position);

A pseudo-code outline is:

  1. Get the caret position.
  2. Update the text/html in the element.
  3. Set the caret back to the correct position.

The Elm virtual DOM just does step 2.
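The ordering is the whole point, so here is a small sketch that makes it explicit. `updatePreservingCaret` and the `caret` object with its `get` and `set` functions are hypothetical names; in a browser they would be backed by the Selection and Range APIs. Clamping the offset to the new text length is my own assumption about what to do when the update shortens the line.

```javascript
// Hypothetical wrapper that runs the three steps in the required order.
// `caret` is any object with get(el) -> offset and set(el, offset).
function updatePreservingCaret(el, caret, applyUpdate) {
  const offset = caret.get(el);   // 1. Get the caret position.
  applyUpdate(el);                // 2. Update the text/html in the element.
  // The new content may be shorter, so keep the offset inside it.
  const clamped = Math.min(offset, el.textContent.length);
  caret.set(el, clamped);         // 3. Set the caret back.
  return clamped;
}
```

Whatever performs step 2 has to be called from inside a wrapper like this, which is exactly what the virtual DOM does not allow.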

I also tried to complete the algorithm by storing the current caret position in my Model (step 1), and then passing that as an attribute into the elm-editor custom element and have it do step 3.

The problem with that is that the attribute seems to get updated by the virtual-dom first, and then the contents. So it gets steps 2 and 3 the wrong way around. The result is that the caret ends up at the start of the line.

I also tried using an animation frame event to do step 3. It kind of works, but you can see the caret ever so briefly appear at the start of the line before jumping back to the correct place. The result is, if you type fast or fat finger the keyboard, you often get characters appearing at the start of the line, instead of where they are meant to go. I wasn’t really expecting this approach to work…


Time to step back a bit I think and work out what a more robust solution is going to look like.


Ok - The plan…

I think having the virtual-dom render HTML that is contenteditable is never going to play nice. It does work in limited scenarios, but when you start making changes on the Elm side that need to be merged into the browser managed editor the selection gets destroyed, meaning that the caret ends up in the wrong place.

So I am going to try and keep them separate, but have some custom rendering code on the JS side to merge in view changes from Elm, whilst preserving the selection. This is needed so that the pseudo-code steps 1 to 3 above can be performed in the right order, and immediately during the DOM construction.

Edits made through the browser will also need to be passed up to Elm, through a custom event triggered by a mutation observer.

Here is how it will work:

  • The view rendered by Elm will be hidden.
  • This view will be rendered inside a custom element for the editor.
  • When the view changes, it will be copied/merged inside this custom element. First pass implementation, I will just copy it - this will evolve into a custom DOM diffing algorithm to make it more efficient once the basics are working.
  • The copied view will be contenteditable.
  • When copying/merging the view, the selection will be preserved, to keep the caret in the correct place.
  • Edits to the copied view will be made available to Elm as custom events. The event will contain the entire line as a String. Formatting on this String may result in changes to the view that get merged in to the copied view.
  • Can make use of data-* attributes to mark the content root, and line numbers and so on, to help make it easier for the merging algorithm to find its way around the text buffer. Basically, so it can easily figure out where the start/end of a line is, and count characters on a line.

  Custom Element                                Elm

    Line Edits  ------------------------->  LineEditEvt String

    Merge View  <-------------------------  Update view with formatting


This relies on the Elm virtual DOM diffing algorithm not minding extra elements being inserted in the DOM in addition to the view it is rendering. In that sense it is brittle to changes in this algorithm.

I considered copying the view inside a shadow DOM. But then I googled for browser support or bugs for contenteditable in the shadow DOM… :astonished: Still an option on some browsers though.


This is effectively a double buffered virtual DOM diffing algorithm. In the second diffing, I copy real DOM to real DOM, which could be a bit slow. On the other hand, much of the time, there will be no changes as browser edits remain in sync with the model. Less often, there will be changes to merge due to code highlighting, resulting in changes to 1 line. Sometimes there will be changes to many lines, due to scrolling, or rippling code highlights.
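The common cases described above (no change, one line, many lines) suggest a line-by-line comparison as a cheap second diff. Here is a minimal sketch, assuming each line can be represented as a string, for instance the innerHTML of an element carrying a data-* line attribute. The name `changedLines` is made up.

```javascript
// Hypothetical helper: compare the lines currently shown in the editable copy
// with the lines Elm has just rendered, and return the indices that differ.
function changedLines(shown, next) {
  const changed = [];
  const count = Math.max(shown.length, next.length);
  for (let i = 0; i < count; i++) {
    // A missing entry reads as undefined, so added and removed lines also
    // show up as differences.
    if (shown[i] !== next[i]) {
      changed.push(i);
    }
  }
  return changed;
}
```

Only the lines at the returned indices would need copying across, and the selection would only need saving and restoring when the caret's own line is among them.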

At any rate, I experimented with copying the entire text buffer view and it was not slow at all. Already I have better performance than needed, so there is plenty of headroom for doing the DOM copying.

Here is the code I used to experiment with copying the entire buffer:

  animationCallback() {
    var element = this;

    requestAnimationFrame(function() {

      var clone = element.firstChild.cloneNode(true);

      if (element.childElementCount > 1)
