Trimming white spaces and linebreaks from strings
I’m sure most of you already know the Trim() function in VB. It is a highly useful function for removing spaces from the edges of a string, but the problem is that it fails to remove linebreaks as well.
Trimming linebreaks too can be very useful for cleaning up spam comments on a website, such as:
“
OMGZ lookit as thi AWOMZ LINKZ:
ILIKESPAMZCUZITS1337z.com
LOOK AT IT TODAY!
“
I don’t have to tell you how annoying that is! We would love to get rid of it entirely, but in this post I’m only going into how to make it look somewhat more presentable…
The solution to this problem is actually really simple using a small regular expression in VB. In this case I created two functions, since I added them to my class library and it was more practical to keep them separate.
The first function is fairly simple: it uses two loops and a small regular expression:
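' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Public Function TrimEmptySpace(ByVal mySrc As String) As String
    ' Nothing to trim in an empty (or Nothing) string.
    If mySrc = "" Then
        Return Nothing
    End If
    Dim myStr As String = mySrc
    ' Strip whitespace (including linebreaks) from the left edge.
    Do While Regex.IsMatch(Left(myStr, 1), "\s")
        myStr = Mid(myStr, 2)
    Loop
    ' Strip whitespace (including linebreaks) from the right edge.
    Do While Regex.IsMatch(Right(myStr, 1), "\s")
        myStr = Mid(myStr, 1, Len(myStr) - 1)
    Loop
    Return myStr
End Function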
As you can tell, it’s a very simple function that loops through and looks for any space or linebreak using the \s character class in regular expressions. Whenever the character at the edge matches, we go ahead and trim it! We loop through the left edge, then the right edge, and return the formatted string.
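For comparison, the same edge trimming can be sketched without loops. This is not the function above but an alternative, assuming VB.NET; the names TrimSketch and TrimWithRegex are made up for illustration, and unlike TrimEmptySpace it returns an empty input unchanged instead of Nothing. It uses a single Regex.Replace call anchored at the start (\A) and end (\z) of the string.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module TrimSketch
    ' Hypothetical helper: strips whitespace (spaces, tabs, linebreaks)
    ' from both ends of the string in a single regex pass.
    Public Function TrimWithRegex(ByVal mySrc As String) As String
        If String.IsNullOrEmpty(mySrc) Then
            Return mySrc
        End If
        Return Regex.Replace(mySrc, "\A\s+|\s+\z", "")
    End Function

    Sub Main()
        Dim raw As String = vbCrLf & "  LOOK AT IT TODAY!  " & vbCrLf & vbCrLf
        Console.WriteLine("[" & TrimWithRegex(raw) & "]")  ' [LOOK AT IT TODAY!]
        ' The String.Trim() method (not the VB Trim() function) also
        ' removes linebreaks from the edges.
        Console.WriteLine("[" & raw.Trim() & "]")          ' [LOOK AT IT TODAY!]
    End Sub
End Module
```

Whitespace inside the string is left alone; only the leading and trailing runs are removed.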
The second function is also fairly simple, but takes a bit of an extra road trip to get there.
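' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Public Function ParagraphSetter(ByVal mySrc As String) As String
    If mySrc = "" Then
        Return Nothing
    End If
    Dim myStr As String = mySrc
    Dim myChar As String = ChrW(10) & ChrW(10)
    ' Remove carriage returns so only line feeds remain.
    myStr = Replace(myStr, ChrW(13), "")
    ' Collapse runs of line feeds down to a single line feed.
    Do While Regex.IsMatch(myStr, "\n\n")
        myStr = Replace(myStr, myChar, ChrW(10))
    Loop
    ' Turn every single line feed into a double line feed.
    myStr = Replace(myStr, ChrW(10), myChar)
    Return myStr
End Function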
ParagraphSetter turns every line break into a double line break and collapses any run of more than two linebreaks. This is useful for areas where you need the post to be formal and clear of clutter. In my case it’s perfect for the comment section of news posts and for new database entries for video synopses, where I don’t want people getting “creative” in their posts.
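That behaviour can also be sketched in one pass. The following is a hedged alternative rather than the function above, assuming VB.NET; ParagraphSketch and ParagraphSetterRegex are hypothetical names, and it returns an empty input unchanged instead of Nothing. It drops the carriage returns and then replaces every run of one or more line feeds with exactly two.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module ParagraphSketch
    ' Hypothetical one-pass variant: drop carriage returns, then turn every
    ' run of one or more line feeds into exactly two line feeds.
    Public Function ParagraphSetterRegex(ByVal mySrc As String) As String
        If String.IsNullOrEmpty(mySrc) Then
            Return mySrc
        End If
        Dim myStr As String = mySrc.Replace(vbCr, "")
        Return Regex.Replace(myStr, "\n+", vbLf & vbLf)
    End Function

    Sub Main()
        Dim post As String = "Today in the news:" & vbCrLf & "A man named..." &
                             vbLf & vbLf & vbLf & vbLf & "Was trapped..."
        ' Show line feeds as a visible marker so the result is easy to check.
        Console.WriteLine(ParagraphSetterRegex(post).Replace(vbLf, "<LF>"))
        ' Today in the news:<LF><LF>A man named...<LF><LF>Was trapped...
    End Sub
End Module
```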
The first step of the function is to remove any carriage return characters, which are chrW(13). This is needed for IE browsers, since they send chrW(13) chrW(10) whenever the return key is pressed. Firefox and others use only chrW(10). It’s one of the annoyances of IE, but they have been getting a lot better with IE7, and hopefully IE8 is all they make it out to be.
Anyway, after removing that pesky unneeded chrW(13), we use a loop to look for double occurrences of chrW(10), which is your normal line break, and turn all of those doubles into single occurrences. Obviously we can’t leave single linebreaks, or else we’ll have a post like:
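To make the difference between the two line endings concrete, here is a small sketch, assuming VB.NET. NormalizeLineBreaks is a hypothetical helper, not part of the functions in this post, and it goes one step further than simply deleting chrW(13): it also turns a carriage return that appears on its own into a line feed, on the assumption that some input may use it alone as a linebreak.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module LineEndingSketch
    ' Hypothetical helper: convert CR+LF pairs and any lone CR to a single LF.
    Public Function NormalizeLineBreaks(ByVal mySrc As String) As String
        If String.IsNullOrEmpty(mySrc) Then
            Return mySrc
        End If
        Return mySrc.Replace(vbCrLf, vbLf).Replace(vbCr, vbLf)
    End Function

    Sub Main()
        Dim fromForm As String = "line one" & vbCrLf & "line two"
        Dim cleaned As String = NormalizeLineBreaks(fromForm)
        Console.WriteLine(fromForm.Length)  ' 18: the linebreak is two characters
        Console.WriteLine(cleaned.Length)   ' 17: the linebreak is one character
    End Sub
End Module
```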
Today in the news:
A man named…
Was trapped…
So once the string has only single linebreaks, we go back and double them, which makes it look like:
Today in the news:

A man named…

Was trapped…
Which looks a lot cleaner, and that previous post will turn into…
“OMGZ lookit as thi AWOMZ LINKZ:

ILIKESPAMZCUZITS1337z.com

LOOK AT IT TODAY!”
Ok, granted, that is still horrid, but it’s a lot better than it was! You can use spam filters to finish the job by looking for excessive caps and for keywords like 1337, .com, and many other common annoyances. I have yet to work on a spam filter. For security reasons I most likely will not post the solution, but I will give a basic idea of the theory so you can create your own. Never let users know how you filter comments! Just a small bit of advice.
Well hope that helps! Happy coding