Thomas’ Developer Blog

January 17, 2010

Whitelist HTML Tags (Advanced Methods for Prevention against JavaScript Injections)

NOTE: This method is no longer preferred. Please see Microsoft Anti-Cross Site Scripting Library V4.2
http://www.microsoft.com/download/en/details.aspx?id=28589 

Long time no update! I’m shocked to see I’m still getting over 100 posts a day considering I haven’t updated in months.

Well I wrote a little script to help everyone out who is using the HTMLeditor that ships with asp.net’s AJAX Control Toolkit. Hope you enjoy!

Imports System.Text.RegularExpressions

' Removes non-whitelisted tags, quoted on* event handlers, and unsafe hrefs from InputValue.
Function HTMLStream(ByVal InputValue As String, Optional ByVal WhiteList As String = "p|span|ol|li|ul|hr|div|i|b|h1|h2|h3|h4|a|br|img|font") As String
    Dim ReturnValue As String
    ' Part 2: remove tags that are not in WhiteList, along with everything inside them.
    ReturnValue = Regex.Replace(InputValue, "<(?!(" & WhiteList & ")\b)[^>]+>([^.]|[.])*(<(?!/?(" & WhiteList & ")\b)[^>]+>)", "", RegexOptions.IgnoreCase)
    ' Part 3: strip quoted on* event-handler attributes, repeating until none are left.
    While (Regex.IsMatch(ReturnValue, "(<[\s\S]*?) on.*?\=(['""])[\s\S]*?\2([\s\S]*?>)", RegexOptions.Compiled Or RegexOptions.IgnoreCase))
        ReturnValue = Regex.Replace(ReturnValue, "(<[\s\S]*?) on.*?\=(['""])[\s\S]*?\2([\s\S]*?>)", _
            Function(match As Match) [String].Concat(match.Groups(1).Value, match.Groups(3).Value), RegexOptions.Compiled Or RegexOptions.IgnoreCase)
    End While
    ' Part 4: remove href attributes that do not start with http:// or www.
    ReturnValue = Regex.Replace(ReturnValue, "(?<=<.*)href=""(?!http://|www\.)[^""]*""", "", RegexOptions.IgnoreCase)
    Return ReturnValue
End Function

If you want to know how this script works, read on. Be warned that I will assume you know regex and intermediate VB.NET. (If you want C#, there are plenty of conversion tools online.)

Part 1
The function takes two parameters: InputValue, which is self-explanatory, and the optional WhiteList. WhiteList is a pipe-separated list of the HTML tag names that will be accepted. By default it’s pretty generous.
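Since WhiteList ends up inside a regex alternation, it can be handy to build it from a plain list of tag names. The sketch below is only an illustration: BuildWhiteList is a made-up helper, not part of the original function, and it assumes you keep your allowed tags in a string array.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WhiteListBuilder
    ' Hypothetical helper (not in the original script): joins tag names into
    ' the pipe-separated form that the WhiteList parameter expects.
    Public Function BuildWhiteList(ByVal tags As String()) As String
        Dim escaped(tags.Length - 1) As String
        For i As Integer = 0 To tags.Length - 1
            ' Regex.Escape keeps a stray character from being read as regex syntax.
            escaped(i) = Regex.Escape(tags(i).Trim())
        Next
        Return String.Join("|", escaped)
    End Function

    Sub Main()
        ' A stricter list than the default: basic text formatting and links only.
        Console.WriteLine(BuildWhiteList(New String() {"p", "b", "i", "a", "br"}))
        ' prints: p|b|i|a|br
    End Sub
End Module
```

The result can be passed straight in as the second argument of HTMLStream.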

Part 2
ReturnValue = Regex.Replace(InputValue, "<(?!(" & WhiteList & ")\b)[^>]+>([^.]|[.])*(<(?!/?(" & WhiteList & ")\b)[^>]+>)", "", RegexOptions.IgnoreCase)

This line looks for an HTML tag that does not match any of the values in the WhiteList group. When it finds one, it clears out the tag and ALL of its contents. This is set up to be greedy! Why greedy? Because it’s for security! I don’t want to remove just the tag, I want to remove EVERYTHING inside of the tag. Greedy here means the match runs from the first non-whitelisted tag to the last one in the input, so anything in between goes too. So be WARNED, altering the WhiteList tags may result in loss of user input.
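To see the greedy behaviour for yourself, here is a small sketch that runs the same pattern on two sample inputs. The StripDisallowed wrapper, the shortened whitelist and the sample strings are my own assumptions for the demo, not part of the original function.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GreedyStripDemo
    ' Same pattern as Part 2, wrapped in a helper (the name is made up for this demo).
    Public Function StripDisallowed(ByVal input As String, ByVal whiteList As String) As String
        Dim pattern As String = "<(?!(" & whiteList & ")\b)[^>]+>([^.]|[.])*(<(?!/?(" & whiteList & ")\b)[^>]+>)"
        Return Regex.Replace(input, pattern, "", RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Dim whiteList As String = "p|b|i"

        ' One disallowed element: the tag and its contents are removed.
        Console.WriteLine(StripDisallowed("<p>Hi <script>alert(1)</script>there</p>", whiteList))
        ' prints: <p>Hi there</p>

        ' Two disallowed elements: the allowed <b> between them is removed as well.
        Console.WriteLine(StripDisallowed("<p><em>a</em><b>keep?</b><em>c</em></p>", whiteList))
        ' prints: <p></p>
    End Sub
End Module
```

One thing worth knowing: the opening lookahead has no /? in it, so as far as I can tell a whitelisted closing tag such as </p> that sits in front of a disallowed tag can also start the match and get removed with it.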

Part 3

While (Regex.IsMatch(ReturnValue, "(<[\s\S]*?) on.*?\=(['""])[\s\S]*?\2([\s\S]*?>)", RegexOptions.Compiled Or RegexOptions.IgnoreCase))
ReturnValue = Regex.Replace(ReturnValue, "(<[\s\S]*?) on.*?\=(['""])[\s\S]*?\2([\s\S]*?>)", _
Function(match As Match) [String].Concat(match.Groups(1).Value, match.Groups(3).Value), RegexOptions.Compiled Or RegexOptions.IgnoreCase)
End While

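This next part is a bit confusing. It goes the extra step most scripts don’t bother with, which is a shame, since those scripts fail to remove those pesky JavaScript event handlers (onclick, onload and so on). The regex captures everything before the handler (group 1) and everything after it up to the closing > (group 3), then glues the two back together. Each match removes only one handler, so the While loop keeps going until no quoted handler is left.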
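Here is a minimal sketch of that loop on its own, so you can watch it work on a tag with two handlers. StripHandlers is a hypothetical wrapper and the sample tag is made up; the pattern is the one from Part 3.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EventHandlerDemo
    ' The Part 3 pattern: group 1 = text before the handler, group 3 = text after it.
    Private Const Pattern As String = "(<[\s\S]*?) on.*?\=(['""])[\s\S]*?\2([\s\S]*?>)"

    ' Hypothetical wrapper around the Part 3 loop so it can be called on its own.
    Public Function StripHandlers(ByVal html As String) As String
        While Regex.IsMatch(html, Pattern, RegexOptions.IgnoreCase)
            ' Keep groups 1 and 3, which drops the handler that sat between them.
            html = Regex.Replace(html, Pattern, _
                Function(m As Match) String.Concat(m.Groups(1).Value, m.Groups(3).Value), _
                RegexOptions.IgnoreCase)
        End While
        Return html
    End Function

    Sub Main()
        ' The first pass removes onerror, the second pass removes onload.
        Console.WriteLine(StripHandlers("<img src=""a.png"" onerror=""x()"" onload='y()'>"))
        ' prints: <img src="a.png">
    End Sub
End Module
```

Note that the pattern expects a quote right after the equals sign, so an unquoted handler such as onclick=alert(1) is not matched by it.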

Part 4
ReturnValue = Regex.Replace(ReturnValue, "(?<=<.*)href=""(?!http://|www\.)[^""]*""", "", RegexOptions.IgnoreCase)

The final part removes JavaScript injected through the href attribute of anchor tags. It only allows links starting with “www.” or “http://”. You can modify this if you want to allow others, such as ftp. Obviously this is to prevent those href="javascript:..." injections.
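If you do want to allow other prefixes, one way is to pass them in rather than hard-coding them. The sketch below assumes a made-up helper, StripHrefs, whose allowedPrefixes parameter is a regex alternation; everything else is the Part 4 pattern unchanged.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module HrefFilterDemo
    ' Hypothetical helper: the Part 4 pattern with the allowed prefixes passed in.
    ' allowedPrefixes is a regex alternation, for example "http://|www\."
    Public Function StripHrefs(ByVal html As String, ByVal allowedPrefixes As String) As String
        Dim pattern As String = "(?<=<.*)href=""(?!" & allowedPrefixes & ")[^""]*"""
        Return Regex.Replace(html, pattern, "", RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Dim html As String = "<a href=""javascript:alert(1)"">bad</a> <a href=""https://example.com"">ok</a>"

        ' Original prefixes: both hrefs are removed, because https:// is not on the list.
        Console.WriteLine(StripHrefs(html, "http://|www\."))
        ' prints: <a >bad</a> <a >ok</a>

        ' Extended prefixes: only the javascript: href is removed.
        Console.WriteLine(StripHrefs(html, "http://|https://|ftp://|www\."))
        ' prints: <a >bad</a> <a href="https://example.com">ok</a>
    End Sub
End Module
```

The anchor tag itself stays in place; only the href attribute is cut out, which is why a stray space is left behind in the first tag.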

So now that you’ve got the basics, you can go through and figure out the nitty gritty! Remember, as one developer wrote in a blurb, DO NOT ever let the attacker know whether they failed or succeeded. Otherwise you’re basically inviting them to try to figure out your code. You don’t want to do that!

Please read:
While I put a great deal of effort into this script, I did not write everything from scratch. A lot of people around the web have helped write the code you see above. I simply tweaked what they had and combined it into a far more secure function. So thanks to everyone who posted the original code that helped me write this. Sadly there are too many to name offhand.

Comments

  1. So I didn’t know anything about JavaScript injection until recently, when someone used this technique against our site. Now I’m doing research on this topic and came across your posting. I’m familiar with regex and understand what’s happening in your function above. What I don’t understand is how you use this function. When does it get called? etc. Thx

    Comment by Nick A — February 2, 2010 @ 6:06 pm

    • Nvm, I just noticed that you initially wrote “a little script to help everyone out who is using the HTMLeditor that ships with asp.net’s AJAX Control Toolkit.”
      – my bad

      Comment by Nick A — February 2, 2010 @ 6:08 pm

  2. It does not really matter what control the value is coming from: the function accepts a string, attempts to sanitize it, and returns the clean string.

    In its current form, you would use this in a code-behind on a VB.NET site.

    Comment by Anonymous — April 8, 2010 @ 2:18 am
