I have a requirement to parse and validate a redirection URL. For example, in https://google.com?redirect_to=%2Ftarget%2Fdestination%3Fsource%3Dgoogle it would be the %2Ftarget%2Fdestination%3Fsource%3Dgoogle part (decoded to /target/destination?source=google). Because that’s a query parameter, a malicious value (for example redirect_to=javascript:BOOM!) can be put in there, so I need to ensure that doesn’t happen.
I’ve been thinking about this issue for some time, but I couldn’t find a satisfactory solution.
The simplest hack would be to just check whether the parameter value starts with javascript, but I wanted something more robust.
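For reference, the prefix check mentioned above could look roughly like the sketch below. The module and function names are hypothetical, and matching on "javascript:" after trimming and lowercasing is my own assumption about how the check would be written.

```elm
module RedirectCheck exposing (looksLikeJavascriptUrl)

-- Hypothetical helper: True when the decoded redirect target starts with
-- the javascript scheme. Leading whitespace is trimmed and the value is
-- lowercased first, so " JavaScript:..." is caught as well.


looksLikeJavascriptUrl : String -> Bool
looksLikeJavascriptUrl decodedTarget =
    decodedTarget
        |> String.trimLeft
        |> String.toLower
        |> String.startsWith "javascript:"
```

With this, looksLikeJavascriptUrl "javascript:BOOM!" is True and looksLikeJavascriptUrl "/target/destination?source=google" is False. A check like this only rejects one scheme, so an absolute URL pointing at another host would still pass.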
The problem is that, although I have valid routes defined in my Route type and have implemented parsers for all of those routes, I can’t use them because Elm’s Url.Parser library only parses a full URL. Because my partial URL lacks a scheme and a host, I can’t use it.
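To illustrate the limitation: Url.Parser.parse expects a Url value, and Url.fromString from elm/url only produces one when the string starts with http:// or https://. A minimal sketch, with hypothetical module and value names:

```elm
module PartialUrl exposing (fullUrlResult, partialUrlResult)

import Url exposing (Url)


-- A full URL has a scheme and a host, so Url.fromString returns a Just
-- wrapping a Url record.


fullUrlResult : Maybe Url
fullUrlResult =
    Url.fromString "https://google.com/target/destination?source=google"


-- The decoded redirect target has neither, so this evaluates to Nothing
-- and there is no Url value to pass on to Url.Parser.parse.


partialUrlResult : Maybe Url
partialUrlResult =
    Url.fromString "/target/destination?source=google"
```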
I tried to implement an alternative version of the Url.Parser.parse function that takes three arguments (path, query, and fragment) instead of a single Url value. But I couldn’t fork the Url library because it has Kernel code inside it, and I didn’t want to fork the compiler just for that.
The next option I looked into was the standalone Parser library, but its parser type is not compatible with the rest of the functions from the Url library or with the Url.Parser parsers that I’ve defined.
There’s also the option of including origin information in my Route type:
type alias Origin = String

type Route
    = SignIn Origin
    | Landing Origin
    | MyPage Origin MyPageRoute

type MyPageRoute
    = Account
    | Orders
Unfortunately, this approach is quite inelegant and leads to tons of boilerplate code and unnecessary pattern matching. I’d rather just check for the javascript string than take this approach.
I’m still trying to find a good solution, but I can’t think of one. If anyone has come up with a good way to handle a case like this, I’d love to hear about it!