Url parsing just for path, query, and fragment

harfangk · April 17, 2019, 11:52am

I need to parse and validate a redirection URL. For example, in https://google.com?redirect_to=%2Ftarget%2Fdestination%3Fsource%3Dgoogle the redirection URL is the %2Ftarget%2Fdestination%3Fsource%3Dgoogle part (which decodes to /target/destination?source=google). Because it arrives as a query parameter, anyone can put a malicious value in it (for example redirect_to=javascript:BOOM!), so I need to make sure such values are rejected.

I’ve been thinking about this issue for some time, but I couldn’t find a satisfactory solution.

The simplest hack would be to just check whether the parameter value starts with javascript, but I wanted something more robust.

The problem is that, although I have valid routes defined in a Route type and have implemented parsers for all of them, I can’t use those parsers because Elm’s Url.Parser module only parses a complete Url value. Because my partial URL lacks a scheme and host, I can’t use it.

I tried to implement an alternative version of the Url.Parser.parse function that takes three arguments (path, query, and fragment) instead of a single Url value. But I couldn’t fork the Url library because it contains Kernel code, and I didn’t want to fork the compiler just for that.

The next option I looked into was the standalone Parser library, but its Parser type is not compatible with the rest of the functions in the Url library or with the Url.Parser parsers that I’ve already defined.

There’s also an option to include origin information in my Route type:

type alias Origin = String

type Route 
    = SignIn Origin
    | Landing Origin
    | MyPage Origin MyPageRoute

type MyPageRoute
    = Account
    | Orders

Unfortunately this approach is quite inelegant and leads to a lot of boilerplate code and unnecessary pattern matching. I’d rather just check for the javascript string than take this approach.

I’m still trying to find a good solution, but I can’t think of one. If anyone has come up with a good way to handle a case like this, I’d love to hear about it!

mthiems · April 17, 2019, 12:50pm

If you’ve gotten it to a plain String like "/target/destination?source=google", could you prepend a fake scheme and host to the String and then just use Url.Parser as-is?

The fake host String may need to end with a trailing slash.
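A minimal sketch of that suggestion might look like the following. The Route constructors, the route paths, the module name, and the placeholder origin are all hypothetical and stand in for whatever the real application defines. The leading-slash check is an extra assumption that is not part of the suggestion above: it rejects anything that is not a site-relative path, including protocol-relative values such as //host. Because the accepted target always starts with "/", the fake host here does not need a trailing slash of its own.

```elm
module RedirectTarget exposing (Route(..), parseRedirect)

import Url
import Url.Parser as Parser exposing (Parser)


-- Hypothetical routes for illustration; substitute the app's own Route type.
type Route
    = SignIn
    | Landing
    | MyPage


-- Hypothetical route parser; the path segments are made up for this sketch.
routeParser : Parser (Route -> a) a
routeParser =
    Parser.oneOf
        [ Parser.map Landing Parser.top
        , Parser.map SignIn (Parser.s "sign-in")
        , Parser.map MyPage (Parser.s "my-page")
        ]


-- Placeholder origin. It is never navigated to; it only exists so that
-- Url.fromString receives a complete URL.
fakeOrigin : String
fakeOrigin =
    "https://placeholder.invalid"


-- Takes the already-decoded redirect target, e.g. "/sign-in?source=google",
-- and returns a Route only when the existing route parser recognises it.
parseRedirect : String -> Maybe Route
parseRedirect target =
    if String.startsWith "/" target && not (String.startsWith "//" target) then
        Url.fromString (fakeOrigin ++ target)
            |> Maybe.andThen (Parser.parse routeParser)

    else
        Nothing
```

With these made-up routes, parseRedirect "/sign-in?source=google" should yield Just SignIn, while parseRedirect "javascript:BOOM!" yields Nothing because it does not start with "/". The redirect is then driven by the returned Route rather than by the raw string, so an unrecognised target never reaches the browser.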

harfangk · April 18, 2019, 6:29am

That’s still hacky, but sounds much better than checking if the string starts with javascript://!
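A possible variant that avoids string concatenation: Url.Url is a plain record alias in elm/url, so the three-argument function described in the original post can be sketched by filling in placeholder values for the fields the partial URL lacks. The name parsePartial, the module name, and the placeholder host are hypothetical, and the caller is assumed to have already split the target into path, query, and fragment.

```elm
module PartialUrl exposing (parsePartial)

import Url
import Url.Parser as Parser exposing (Parser)


-- Hypothetical helper: wraps path, query, and fragment in a Url record
-- whose protocol and host are placeholders that are never navigated to,
-- then runs an existing Url.Parser parser against that record.
parsePartial : Parser (a -> a) a -> String -> Maybe String -> Maybe String -> Maybe a
parsePartial parser path query fragment =
    Parser.parse parser
        { protocol = Url.Https
        , host = "placeholder.invalid"
        , port_ = Nothing
        , path = path
        , query = query
        , fragment = fragment
        }
```

The query and fragment are passed without their leading "?" and "#", matching how the Url record stores them. Compared with prepending a fake origin, this version leaves the splitting of the raw string to the caller, so it fits best when those three parts are already available separately.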
