Difficulty wrapping my head around elm/parser

Hey,

I’m having trouble writing my own simplified Markdown parser. I want it to be able to parse the following:

* or _ for emphasis/italics
** or __ for strong/bold
**_ or _** for the combined strong emphasis
# header1
## header 2
...
###### header 6
[and linking is also](www.something.com/i-want-to-implement)

That’s all I want my simple, toned-down parser to do; nothing special, really.

The issue

I don’t really understand how to parse multiple blocks out of a single string. Let’s say we have the following input:
Hey there, *you* must be a *really*, **really** interesting person!
My parser so far parses only one block, then stops and ignores the rest of the string. I thought of using a loop, but I couldn’t get it to work. I did get it to split on newlines successfully, giving me paragraphs.

My “parser” so far:

module Markdown exposing (..)

import Html.Styled exposing
  ( Html
  , p
  , styled
  , text
  , span
  )
import Parser exposing (..)


-- MARKDOWN PARSER

type NodeType
  = Text String
  | Bold String
  | Italic String


testString : String
testString =
  (   "In other words, I have been thinking for a while now "
  ++  "that it can be done differently."
  ++  "\n"
  ++  "**hey** how are *things*, my boy?"
  ++  "\n"
  ++  "Namely, I wanted to talk about the following."
  )


splitNewlines : Parser ( List String )
splitNewlines = Parser.sequence
  { start = ""
  , separator = "\n"
  , end = ""
  , spaces = chompWhile ( \c -> c == ' ' )
  , item = someString
  , trailing = Optional
  }


someString : Parser String
someString = getChompedString <| succeed ()
  |. chompUntilEndOr "\n"


-- Split up the input string into blocks, separated by new lines, giving us a list of paragraphs.
preProcess : String -> List String
preProcess input = case run splitNewlines input of
  Ok result ->
    List.filter ( \ x -> String.length x >= 1 ) result

  Err err ->
    []


-- My failed loop attempt (which isn't much of an attempt, looking like this...)
-- parseMarkdown : Parser ( List NodeType )
-- parseMarkdown = succeed identity
--   |= loop [] someNode


parseMarkdown : Parser NodeType
parseMarkdown = oneOf
  [ parseBold
  , parseItalic
  , parsePlain
  ]


parseBold : Parser NodeType
parseBold = succeed Bold
  |. symbol "**"
  |= ( getChompedString <| chompWhile isText )
  |. symbol "**"


parseItalic : Parser NodeType
parseItalic = succeed Italic
  |. symbol "*"
  |= ( getChompedString <| chompWhile isText )
  |. symbol "*"


parsePlain : Parser NodeType
parsePlain = succeed Text
  |= ( getChompedString <| chompWhile ( \ _ -> True ) )


isText : Char -> Bool
isText c =
  Char.isAlphaNum c || c == ' ' || c == '\n' || c == '\r' || c == '\t'

process : String -> NodeType
process input = case run parseMarkdown input of
  Ok result ->
    result

  Err err ->
    let
      _ = Debug.log "err" err
    in
    Text "Something went wrong..."


parse : String -> List NodeType
parse str =
  case preProcess str of
    [] ->
      []

    list ->
      List.map process list

Thanks in advance!

Reason I don’t want to use a package: I want to practice my elm/parser skills, and the available parser libraries are too extensive for my liking.

The elm-radio episode on elm/parser has a lot of useful links.

I highly recommend @terezka’s wonderful talk, Demystifying Parsers; as I vaguely remember, it clarified how loop works. I haven’t used that knowledge myself and can’t offer more concrete help, but maybe those resources will help you.
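
For reference, here is a minimal sketch of how Parser.loop from elm/parser might be applied to the inline case in the question. It is not taken from the talk or the podcast. The module name and the helpers inline, step, node, delimited, plain and parseInline are hypothetical; NodeType mirrors the type in the question. The sketch assumes markers are never nested and that every opening * or ** has a matching close, otherwise the parser reports an error instead of falling back to plain text.

```elm
module InlineSketch exposing (NodeType(..), parseInline)

import Parser exposing ((|.), (|=), Parser, Step(..))


type NodeType
    = Text String
    | Bold String
    | Italic String


-- Hypothetical entry point: run the looping parser on one paragraph.
parseInline : String -> Result (List Parser.DeadEnd) (List NodeType)
parseInline input =
    Parser.run inline input


-- Keep parsing nodes until the end of input, collecting them in a list.
inline : Parser (List NodeType)
inline =
    Parser.loop [] step


-- One iteration: stop at end of input, otherwise parse one node and loop.
-- Nodes are accumulated in reverse and flipped once when done.
step : List NodeType -> Parser (Step (List NodeType) (List NodeType))
step revNodes =
    Parser.oneOf
        [ Parser.succeed (Done (List.reverse revNodes))
            |. Parser.end
        , Parser.map (\n -> Loop (n :: revNodes)) node
        ]


-- "**" must be tried before "*", since "*" is a prefix of "**".
node : Parser NodeType
node =
    Parser.oneOf
        [ delimited "**" Bold
        , delimited "*" Italic
        , plain
        ]


-- Assumption: the content between two markers contains no '*' at all.
delimited : String -> (String -> NodeType) -> Parser NodeType
delimited marker toNode =
    Parser.succeed toNode
        |. Parser.symbol marker
        |= Parser.getChompedString (Parser.chompWhile (\c -> c /= '*'))
        |. Parser.symbol marker


-- Plain text: at least one non-'*' character, so every iteration of the
-- loop consumes input and the loop cannot spin forever.
plain : Parser NodeType
plain =
    Parser.succeed Text
        |= Parser.getChompedString
            (Parser.chompIf (\c -> c /= '*')
                |. Parser.chompWhile (\c -> c /= '*')
            )
```

With the example sentence from the question, parseInline "Hey there, *you* must be a *really*, **really** interesting person!" should give Ok [ Text "Hey there, ", Italic "you", Text " must be a ", Italic "really", Text ", ", Bold "really", Text " interesting person!" ]. The key difference from a parsePlain that chomps everything is that plain here stops at the next *, which gives the loop a chance to try the bold and italic parsers again.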

This definitely got me a bit further! Thank you very much :)
