Similar to Google’s ProtoBuf, the OpenAPI Specification (OAS) defines a standard, programming language-agnostic interface for data interchange. ProtoBuf uses `.proto` files, OAS uses YAML files. There is an OpenAPI Generator that can be used to generate server stubs, client libraries or documentation for several languages. As the creator of the OpenAPI Generator’s Elm template (see example), I’ve always been intrigued by the ability to generate `Model`s and `Request`s automatically. This both saves me from writing encoders/decoders manually and makes it impossible to break the contract between server and client.
As I got inspired by Evan’s vision for data interchange and Elm now supports processing bytes, I started playing around and trying to parse ProtoBuf messages in Elm. My current API is mostly inspired by the existing Elm JSON encoder and decoder packages. I am running into an issue concerning decoding.
In contrast to decoding JSON, my data is not stored in a clear (JSON) tree structure:
- The order of the fields of the record I am decoding to may differ from the ordering of the field numbers (i.e. the ordering of data in the `Bytes`) as defined in the `.proto`;
- When decoding a `repeated int32`, I do not know in advance how many `int32`s I will find in the byte sequence. They are not nicely grouped together like in JSON but are stored sequentially.
Therefore, I start my decoding process by first creating a `Dict Int (List Bytes)`. The `Int` key is the field number and the list contains all the matching chunks of `Bytes` I run into while processing the whole range of `Bytes`. For this I use `Bytes.Decode.loop`, as I keep filling the `Dict` until I run out of `Bytes`. And that’s currently my main issue: I do not know when to stop the loop. Once I have (with some workarounds) generated this `Dict`, I use it to run field decoders on the stored chunk(s) based on each field number.
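A minimal sketch of such a chunk-collecting loop, to make the stopping problem concrete. It assumes the total width is handed in as an argument, which is exactly the value that is not available from inside a decoder. The names `varint`, `varintStep`, `chunks` and `chunkStep` are hypothetical, only length-delimited fields (wire type 2) are handled, and varints are assumed to fit in 32 bits:

```elm
import Bitwise
import Bytes exposing (Bytes)
import Bytes.Decode as Decode exposing (Decoder, Step(..))
import Dict exposing (Dict)


type alias FieldNumber =
    Int


type alias Chunks =
    Dict FieldNumber (List Bytes)


{-| Decode a base-128 varint, returning ( bytes consumed, value ). -}
varint : Decoder ( Int, Int )
varint =
    Decode.loop ( 0, 0 ) varintStep


varintStep : ( Int, Int ) -> Decoder (Step ( Int, Int ) ( Int, Int ))
varintStep ( count, value ) =
    Decode.unsignedInt8
        |> Decode.map
            (\byte ->
                let
                    -- the low 7 bits carry data, least significant group first
                    next =
                        ( count + 1
                        , Bitwise.or value (Bitwise.shiftLeftBy (7 * count) (Bitwise.and 0x7F byte))
                        )
                in
                -- the high bit signals that another byte follows
                if Bitwise.and 0x80 byte == 0 then
                    Done next

                else
                    Loop next
            )


{-| Collect chunks until `width` bytes have been consumed. -}
chunks : Int -> Decoder Chunks
chunks width =
    Decode.loop ( width, Dict.empty ) chunkStep


chunkStep : ( Int, Chunks ) -> Decoder (Step ( Int, Chunks ) Chunks)
chunkStep ( remaining, acc ) =
    if remaining <= 0 then
        Decode.succeed (Done acc)

    else
        varint
            |> Decode.andThen
                (\( keyWidth, key ) ->
                    -- key = (field number << 3) | wire type
                    case Bitwise.and 0x07 key of
                        2 ->
                            -- length-delimited: a varint length, then the payload
                            varint
                                |> Decode.andThen
                                    (\( lenWidth, len ) ->
                                        Decode.bytes len
                                            |> Decode.map
                                                (\chunk ->
                                                    Loop
                                                        ( remaining - keyWidth - lenWidth - len
                                                        , Dict.update (Bitwise.shiftRightZfBy 3 key)
                                                            (\old -> Just (Maybe.withDefault [] old ++ [ chunk ]))
                                                            acc
                                                        )
                                                )
                                    )

                        _ ->
                            -- other wire types are left out of this sketch
                            Decode.fail
                )
```

Without the `remaining` counter in the loop state there is no condition on which to return `Done`: reading past the end of the input makes the whole decoder fail instead of signalling "no more data".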
Here are some core functions to summarize the approach a bit:
- All `Bytes` are cut into pieces and collected in `type alias Chunks = Dict FieldNumber (List Bytes)`, where `type alias FieldNumber = Int`;
- `message : a -> Decoder (Chunks -> a)` is used to start decoding into type `a`;
- `field : FieldNumber -> FieldDecoder a -> Decoder (Chunks -> a -> b) -> Decoder (Chunks -> b)` is used for decoding additional fields using any of the `FieldDecoder`s:
  - `float : FieldDecoder Float`
  - `string : FieldDecoder String`
  - `int32 : FieldDecoder Int`
  - `repeated : FieldDecoder a -> FieldDecoder (List a)`
  - etc.

[edit: outdated]
These could be used to take a `.proto` file

    syntax = "proto3";

    message SearchRequest {
      string query = 1;
      int32 page_number = 2;
      int32 result_per_page = 3;
    }

and to generate

    import Bytes.Decode exposing (Decoder)
    import ProtoBuf.Decoder exposing (field, int32, message, string, toDecoder)


    type alias SearchRequest =
        { query : Maybe String
        , page_number : Maybe Int
        , result_per_page : Maybe Int
        }


    searchRequestDecoder : Decoder SearchRequest
    searchRequestDecoder =
        message SearchRequest
            |> field 1 string
            |> field 2 int32
            |> field 3 int32
            |> toDecoder

However, to make this work I need to map the last `field`'s `Decoder (Chunks -> b)` output to a normal `Decoder b` that can be passed to e.g. `Http.expectBytes`. That is what `toDecoder : Decoder (Chunks -> a) -> Decoder a` should do. To be able to generate the `Chunks`, I need to know when to end the `Bytes.Decode.loop`. This would be easy if I knew the number of `Bytes` I initially receive. As far as I know, there is no way to do this with the current `elm/bytes`. I could of course rewrite `ProtoBuf.Decoder` to only process `Bytes -> a` (and use e.g. `Http.expectBytesResponse`), but that doesn’t feel like a clean solution to me. It would also be inconsistent with other libraries. Am I missing some obvious alternative solution here, or should I aim for requesting an additional `Bytes.Decode.width : Bytes.Decode.Decoder Int`? Thanks for sharing your thoughts!
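For reference, the `Bytes -> a` workaround mentioned above could look roughly like the sketch below. It assumes the decoder is written as a function of the total width (`Int -> Decoder a`) and uses `Bytes.width` on the complete response body. The names `decodeWithWidth` and `expectProtoBuf` are hypothetical, and the error mapping simply mirrors the usual `Http.Error` cases:

```elm
import Bytes exposing (Bytes)
import Bytes.Decode as Decode exposing (Decoder)
import Http


{-| Run a width-dependent decoder on a complete `Bytes` value. -}
decodeWithWidth : (Int -> Decoder a) -> Bytes -> Maybe a
decodeWithWidth decoderFor bytes =
    Decode.decode (decoderFor (Bytes.width bytes)) bytes


{-| Like `Http.expectBytes`, but for decoders that need the body width. -}
expectProtoBuf : (Result Http.Error a -> msg) -> (Int -> Decoder a) -> Http.Expect msg
expectProtoBuf toMsg decoderFor =
    Http.expectBytesResponse toMsg
        (\response ->
            case response of
                Http.BadUrl_ url ->
                    Err (Http.BadUrl url)

                Http.Timeout_ ->
                    Err Http.Timeout

                Http.NetworkError_ ->
                    Err Http.NetworkError

                Http.BadStatus_ metadata _ ->
                    Err (Http.BadStatus metadata.statusCode)

                Http.GoodStatus_ _ body ->
                    -- the full body is available here, so its width is known
                    decodeWithWidth decoderFor body
                        |> Result.fromMaybe (Http.BadBody "ProtoBuf decoding failed")
        )
```

The trade-off is the one described above: the result is no longer a plain `Decoder a`, so it cannot be passed to `Http.expectBytes` or composed with other `elm/bytes` decoders.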
To clarify: I am currently working on getting the encoding/decoding to work. Eventually these encoders/decoders should be generated automatically based on the provided `.proto` file(s).