Hey,
I’m having trouble writing my own simplified Markdown parser. I want to be able to parse the following:
* or _ for emphasis/italics
** or __ for strong/bold
**_ or _** for the combined strong emphasis
# header 1
## header 2
...
###### header 6
[and linking is also](www.something.com/i-want-to-implement)
That’s all I want my simple, toned-down parser to do, nothing special really.
The issue
I don’t really understand how to parse multiple blocks out of a single string. Let’s say we have the following input:
Hey there, *you* must be a *really*, **really** interesting person!
My parser so far parses only one block, then stops and forgets about the rest of the string. I thought of using a loop, but I couldn’t get it to work. I did get it to break on newlines successfully, which gives me paragraphs.
My “parser” so far:
module Markdown exposing (..)

import Html.Styled exposing
    ( Html
    , p
    , styled
    , text
    , span
    )
import Parser exposing (..)


-- MARKDOWN PARSER


type NodeType
    = Text String
    | Bold String
    | Italic String

testString : String
testString =
    ( "In other words, I have been thinking for a while now "
        ++ "that things can be done differently."
        ++ "\n"
        ++ "**hey** how are *you* doing, my boy?"
        ++ "\n"
        ++ "I would like to talk about the following, namely."
    )

splitNewlines : Parser (List String)
splitNewlines =
    Parser.sequence
        { start = ""
        , separator = "\n"
        , end = ""
        , spaces = chompWhile (\c -> c == ' ')
        , item = someString
        , trailing = Optional
        }


someString : Parser String
someString =
    getChompedString <|
        succeed ()
            |. chompUntilEndOr "\n"

-- Split up the input string into blocks, separated by newlines, giving us a list of paragraphs.
preProcess : String -> List String
preProcess input =
    case run splitNewlines input of
        Ok result ->
            List.filter (\x -> String.length x >= 1) result

        Err _ ->
            []

-- My failed loop attempt (which isn't much of an attempt, looking like this...)
-- parseMarkdown : Parser ( List NodeType )
-- parseMarkdown = succeed identity
--     |= loop [] someNode


-- Parses a single node: oneOf picks one branch, and the parser returns after it.
parseMarkdown : Parser NodeType
parseMarkdown =
    oneOf
        [ parseBold
        , parseItalic
        , parsePlain
        ]

parseBold : Parser NodeType
parseBold =
    succeed Bold
        |. symbol "**"
        |= (getChompedString <| chompWhile isText)
        |. symbol "**"


parseItalic : Parser NodeType
parseItalic =
    succeed Italic
        |. symbol "*"
        |= (getChompedString <| chompWhile isText)
        |. symbol "*"


-- Fallback: chomps all of the remaining input as plain text.
parsePlain : Parser NodeType
parsePlain =
    succeed Text
        |= (getChompedString <| chompWhile (\_ -> True))


isText : Char -> Bool
isText c =
    Char.isAlphaNum c || c == ' ' || c == '\n' || c == '\r' || c == '\t'

process : String -> NodeType
process input =
    case run parseMarkdown input of
        Ok result ->
            result

        Err err ->
            let
                _ =
                    Debug.log "err" err
            in
            Text "Something went wrong..."


parse : String -> List NodeType
parse str =
    case preProcess str of
        [] ->
            []

        list ->
            List.map process list
Thanks in advance!
Reasons I don’t want to use a package: I want to practice my Elm parser skills, and the available parser libraries are too extensive for my liking.
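For reference, the loop idea mentioned above might look roughly like the sketch below. It assumes `Parser.loop` and `Step` from elm/parser, and it is not a drop-in fix for the module in this post. All names (`Inline`, `inlines`, `inlinesHelp`, `delimited`, `plain`, `parseLine`) are hypothetical. It handles only `*` and `**`, and it assumes the plain-text parser has to stop at the next `*` and consume at least one character, so that each pass through the loop makes progress.

```elm
module MarkdownSketch exposing (..)

import Parser exposing ((|.), (|=), Parser, Step(..))


-- Hypothetical node type, mirroring NodeType from the post.
type Inline
    = Plain String
    | Strong String
    | Emphasis String


-- Keep parsing nodes until the end of the input is reached.
inlines : Parser (List Inline)
inlines =
    Parser.loop [] inlinesHelp


-- The state is the list of nodes parsed so far, in reverse order.
inlinesHelp : List Inline -> Parser (Step (List Inline) (List Inline))
inlinesHelp revNodes =
    Parser.oneOf
        [ Parser.succeed (Done (List.reverse revNodes))
            |. Parser.end
        , Parser.map (\node -> Loop (node :: revNodes)) inline
        ]


-- "**" is tried before "*" so that bold is not mistaken for italic.
inline : Parser Inline
inline =
    Parser.oneOf
        [ delimited "**" Strong
        , delimited "*" Emphasis
        , plain
        ]


-- Text wrapped in the same marker on both sides, e.g. **bold**.
delimited : String -> (String -> Inline) -> Parser Inline
delimited marker toNode =
    Parser.succeed toNode
        |. Parser.symbol marker
        |= Parser.getChompedString (Parser.chompWhile (\c -> c /= '*'))
        |. Parser.symbol marker


-- One or more characters up to, but not including, the next '*'.
plain : Parser Inline
plain =
    Parser.succeed Plain
        |= Parser.getChompedString
            (Parser.chompIf (\c -> c /= '*')
                |. Parser.chompWhile (\c -> c /= '*')
            )


parseLine : String -> Result (List Parser.DeadEnd) (List Inline)
parseLine input =
    Parser.run inlines input
```

With this sketch, `parseLine` on the example sentence above should give `Plain "Hey there, "`, `Emphasis "you"`, `Plain " must be a "`, `Emphasis "really"`, `Plain ", "`, `Strong "really"`, `Plain " interesting person!"`. An unmatched `*` makes the whole line fail with an `Err`, because `Parser.symbol` commits once the opening marker has been consumed. Falling back to plain text in that case would need extra handling, for example with `Parser.backtrackable`.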