Working with folders

Hey everyone,
I was wondering if anyone here has experience with, or ideas about, uploading entire folders into an Elm app. As far as I know, this is not supported anywhere in the File package. It is possible to select a folder with an HTML input element, but if I understand correctly the files then stay on the DOM element itself (the input with the given id) rather than reaching Elm. I would love to be able to:

  1. Get a JSON schema of the uploaded folder structure
  2. Have all the files stored in Elm be it as a List File or List String (I’m mainly interested in their content)

Would love to hear your suggestions. Thank you!

It looks like you’d need to use HTMLInputElement.webkitdirectory (documented on MDN) and its associated property for retrieving the file list. It would likely be easiest to make a tiny custom element that relays the entries as an event.

Word of caution though, this doesn’t work with all browsers.
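
For reference, a minimal sketch of such a custom element might look like the following. The element name `folder-input`, the event name `folder-selected`, and the helper `toEntries` are made up for illustration; `webkitdirectory` and `webkitRelativePath` are non-standard browser properties, so treat this as an assumption-laden starting point rather than a tested implementation.

```javascript
// Hypothetical helper: turn a FileList-like collection into a plain array
// of { path, file } records. Falls back to the bare file name when the
// browser does not provide webkitRelativePath.
function toEntries(fileList) {
    return Array.from(fileList, file => ({
        path: file.webkitRelativePath || file.name,
        file: file
    }));
}

// Only register the element where the DOM APIs exist (i.e. in a browser).
if (typeof customElements !== "undefined") {
    customElements.define("folder-input", class extends HTMLElement {
        connectedCallback() {
            if (this.firstChild) return; // already set up

            const input = document.createElement("input");
            input.type = "file";
            input.webkitdirectory = true; // non-standard, not supported everywhere

            // Relay the selected files to whoever listens on the element.
            input.addEventListener("change", () => {
                this.dispatchEvent(new CustomEvent("folder-selected", {
                    detail: { entries: toEntries(input.files) }
                }));
            });
            this.appendChild(input);
        }
    });
}
```

On the Elm side you would then listen with Html.Events.on "folder-selected" and decode detail.entries, using File.decoder for each file field and a string decoder for path.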


As wolfadex said, your best bet is to read it as a list of files. Here is a very very rough example of what we’re doing.


I’ll definitely try something along these lines, thank you both for your help :)

Here’s a chunk of code I use for dropping files and whole folders.

The JavaScript part that scans the folder has to cope with large folders, because browsers apparently return only about 100 entries per readEntries() call. The code also skips entries whose names start with a dot and ignores zero-length files.

var activeAsyncCalls = 0
var filesRemaining = 0

function scanDirectory(directory, path, onComplete) {
    let dirReader = directory.createReader();
    let container = { name: path, files: [], dirs: [] }
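    let errorHandler = error => {
        activeAsyncCalls--;
        // Still report completion if the failed call was the last one pending
        if (activeAsyncCalls == 0) {
            onComplete()
        }
    }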

    var readEntries = () => {
        activeAsyncCalls++

        dirReader.readEntries(entries => {
            if (entries.length > 0 && filesRemaining > 0) {
                for (let entry of entries) {
                    if (entry.name.substring(0, 1) != '.') {
                        if (entry.isFile && filesRemaining > 0) {
                            activeAsyncCalls++
                            entry.file(file => {
                                if (filesRemaining > 0 && file.size > 0) {
                                    container.files.push(file);
                                    filesRemaining--
                                }
                                activeAsyncCalls--
                                // A file callback can be the last pending call
                                if (activeAsyncCalls == 0) {
                                    onComplete()
                                }
                            });
                        } else if (entry.isDirectory) {
                            container.dirs.push(scanDirectory(entry, `${path}/${entry.name}`, onComplete));
                        }
                    }
                }

                // Recursively call readEntries() again, since browsers only handle
                // the first 100 entries.
                // See: https://developer.mozilla.org/en-US/docs/Web/API/DirectoryReader#readEntries
                readEntries();
            }
            activeAsyncCalls--
            if (activeAsyncCalls == 0) {
                onComplete()
            }
        }, errorHandler);
    };
    readEntries();
    return container;
}

function scanDropped(folderId, items, onComplete) {
    var container = { name: folderId, files: [], dirs: [] };
    for (let item of items) {

        var entry;
        if ((item.webkitGetAsEntry != null) && (entry = item.webkitGetAsEntry())) {
            if (entry.isFile && filesRemaining > 0) {
                container.files.push(item.getAsFile());
                filesRemaining--
            } else if (entry.isDirectory) {
                container.dirs.push(scanDirectory(entry, entry.name, onComplete));
            }
        } else if (item.getAsFile != null) {
            if ((item.kind == null) || (item.kind === "file")) {
                container.files.push(item.getAsFile());
                filesRemaining--
            }
        }
        if (filesRemaining <= 0) break
    }
    return container;
}

function readChunk(file, start, end, callback) {
    var blob = file.slice(start, end);
    var reader = new FileReader();
    reader.onloadend = function () {
        callback(reader.error, reader.result);
    }
    reader.readAsArrayBuffer(blob);
}
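
The batching behaviour handled above (readEntries() handing back only part of a large directory per call) can be isolated into a small helper. This is only a sketch: `readAllEntries` is a hypothetical name, not part of the code above, and it assumes a reader object with the callback-style readEntries(success, error) method that directory.createReader() returns.

```javascript
// Collect every entry from a directory reader by calling readEntries()
// repeatedly until it reports an empty batch, then hand the full list
// to onDone. Errors are forwarded to onError unchanged.
function readAllEntries(reader, onDone, onError) {
    const all = [];
    const next = () => {
        reader.readEntries(batch => {
            if (batch.length === 0) {
                // An empty batch means the directory is exhausted.
                onDone(all);
            } else {
                all.push(...batch);
                next();
            }
        }, onError);
    };
    next();
}
```

Usage would be along the lines of readAllEntries(directory.createReader(), entries => { /* split into files and dirs */ }, errorHandler), which keeps the counting logic separate from the paging logic.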


You also need to set up ports. The drop event is passed in from the Elm side (scanTree), and the JS then returns the result of the scan through the fileTree port.

        if (elm.ports.scanTree && elm.ports.fileTree) {
            elm.ports.scanTree.subscribe(function ({ e, maxFiles, folderId }) {
                if (e && e.dataTransfer) {
                    // I forgot what this limit of 50 is for. I think the application stipulates the ability to limit the 
                    // number of files that can be uploaded at once for certain users, and 50 is just an arbitrary default
                    filesRemaining = maxFiles || 50
                    activeAsyncCalls = 0

                    let items = e.dataTransfer.items;
                    let sent = false
                    var container

                    let onComplete = () => {
                        if (!sent && elm.ports.fileTree) {
                            elm.ports.fileTree.send(container)
                            sent = true
                        }
                    }

                    container = scanDropped(folderId, items, onComplete);
                    if (activeAsyncCalls == 0 || filesRemaining <= 0) {
                        onComplete()
                    }

                    // Backup in case we had a bug and undercounted activeAsyncCalls;
                    // also, send a temporary result if the scan is taking too long
                    setTimeout(() => {
                        if (!sent && elm.ports.fileTree) {
                            elm.ports.fileTree.send(container)
                        }
                    }, 400)
                } else {
                    if (console && console.error) {
                        if (!e)
                            console.error("e is null!");
                        else
                            console.error("e.dataTransfer is null!");
                    }
                }
            })
        }

Finally, relevant parts on Elm side:

port scanTree : { e : D.Value, maxFiles : Int, folderId : String } -> Cmd msg

port fileTree : (D.Value -> msg) -> Sub msg

...
-- update

        FilesDropped v ->
            case model.folderId of
                Just f ->
                    ( { model | hover = False }, scanTree { e = v, maxFiles = model.maxFiles, folderId = f.id }, NoAction)

        GotDroppedFiles ((Dir folderId _ _) as dir) ->
            let
                unroll (Dir _ files dirs) =
                    files ++ List.concatMap unroll dirs

                newBatch =
                    unroll dir |> dedupe folderId

                newlist =
                    model.files ++ newBatch

                newState =
                    case ( model.queueState, List.length newlist ) of
                        ( _, 0 ) ->
                            model.queueState

                        ( Finished, _ ) ->
                            NotStarted

                        _ ->
                            model.queueState

                -- start new queue only if not already busy
                cmd =
                    case model.activeFile of
                        Nothing ->
                            if List.length newBatch > 0 then
                                startQueue model

                            else
                                Cmd.none

                        _ ->
                            Cmd.none

                newAction =
                    case List.length newBatch of
                        0 ->
                            NoEvent

                        _ ->
                            NewFilesInQueue

                totalSize =
                    (List.map .size newlist |> List.sum) + Maybe.withDefault 0 (Maybe.map .size model.activeFile) + (List.map .size model.processedFiles |> List.sum)
            in
            ( { model | files = numberizeQueue model newlist, queueState = newState, queueSize = totalSize }
            , cmd
            , newAction
            )

-- subscriptions

subscriptions : Model -> Sub Msg
subscriptions _ =
    fileTree
        (\v ->
            case D.decodeValue directoryDecoder v of
                Ok d ->
                    GotDroppedFiles d

                Err e ->
                    DropError (D.errorToString e)
        )



-- DECODERS


directoryDecoder : D.Decoder FileTree
directoryDecoder =
    D.map3 Dir
        (D.field "name" D.string)
        (D.field "files" (D.list fileInfoWithValue))
        (D.field "dirs" (D.list (D.lazy (\_ -> directoryDecoder))))


fileInfoWithValue : D.Decoder ( File, D.Value )
fileInfoWithValue =
    -- We need the File value (for FileInfo), but still keep the raw value (for chunked decoding)
    D.map2 Tuple.pair File.decoder D.value

Apologies for the verbosity. As you can see, it’s cut from a larger chunk of code that also handles the upload queue, etc. I’m sure you’ll be able to simplify it to get what you want.


I appreciate the detailed response. I’m wondering if it’s possible to obtain information about the folder structure (the entire hierarchy of the directory) with an Elm-only approach, without resorting to ports. I know that the relative path is stored in the webkitRelativePath property of each file, but I’m not really sure how to extract it (elm/file supports extracting size, name, etc., but not the path). Is there an easy way to solve this that I’m missing? Again, thank you all for your help thus far :)

It’s more robust to recursively read your directory structure with the directoryReader and construct your paths from the directory names than to bother with reading and parsing the path off the file object.

To read the directory structure you have to scan with the directoryReader and distinguish between folders and files (.isDirectory/.isFile). The Elm implementation just wraps the file object, and as far as I can tell it checks that the value really is a File before wrapping it, so you can only hand it real File objects, e.g. ones obtained with .getAsFile() (or entry.file() for directory entries), not the entry objects themselves.
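
To make the "construct your paths from the directory names" idea concrete, here is a rough sketch that flattens the { name, files, dirs } container produced by the scan code earlier in the thread into a list of full paths. `listPaths` is a hypothetical helper; it assumes each container's name already holds the full directory path, which is how scanDirectory builds it.

```javascript
// Walk a { name, files, dirs } container and return one path string per
// file, built from the directory name and the file name.
function listPaths(container) {
    const own = container.files.map(file => `${container.name}/${file.name}`);
    // Sub-directories carry their own full path in .name, so recursing is enough.
    const nested = container.dirs.flatMap(dir => listPaths(dir));
    return own.concat(nested);
}
```

Note that the top-level container returned by scanDropped is named after folderId, so loose files dropped next to a folder would show up under that id rather than under a real directory name.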
