Need a regEx for this condition

Creating a routine to strip comments from a line of VB code.

So, I know comments start with ' character

However, I also have to watch out for quoted ' chars.

So here are four different lines/scenarios:

Dim A as String ' This is a Dim statement. 

A = "'" & 500 * 6 & "'" 

If A <> "''" Then   ' We should never get here!!!!!

   Msgbox "See. We Tried!"   ' what does he mean 'we'?


My resulting output should be:

Dim A as String 

A = "'" & 500 * 6 & "'" 

If A <> "''" Then 

   Msgbox "See. We Tried!"

So, this will take place on a line-by-line basis.

At a minimum, I'm looking for something that gives me the location of the comment start, if one exists on the line.

I get thoroughly confused thinking of scenarios involving quoted comment chars.


Any help appreciated!

Eric

  • Edited by EricEric, Friday, August 28, 2009 8:38 PM: Formatted resulting output
EricEric
Use the Find and Replace feature of your VS editor:

Quick replace: --> :b+':b.+

Replace With: --> make sure it is empty

Look In --> Current Document

Use --> Use Regular Expressions
John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove
Here's a piece of VB6 code for your reference,

reg.Pattern = "'[^""]+?(\r\n|$)"
reg.Global = True   ' replace every match, not only the first one
str = reg.Replace(str, "")

If you set ^ and $ to match the beginning and end of each line instead of the whole string,
the pattern can be simplified to '[^"]+?$

Explanation:
In VB code, a comment starts with a single quote and ends at a line break or at the end of the string,
and the single quote that starts it cannot be inside a string literal.

This pattern still has a flaw: if there's a double quote in your comment, it will miss that comment.
But it can strip the comments in your sample code. I have tested it only in my regex tool;
here's the screenshot:
http://www.wonderstudio.cn/soft/grep/exp/vbcomment.gif

To catch comments that contain double quotes, a more complex regex pattern is needed,
such as one using "look behind". I think Xalnix will give a perfect solution.
www.wonderstudio.cn
Eping Wang
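
The limitation described in the post above can be seen in a short sketch. It is written in Python rather than VB6, purely as an illustration; the helper name comment_start_simple and the fifth sample line (taken from later in the thread) are choices made for this example, and the pattern is applied one line at a time so that $ means end of line.

```python
import re

# The simplified pattern from the post above, applied to one line at a time.
SIMPLE = re.compile(r"""'[^"]+?$""")

SAMPLE = '''\
Dim A as String ' This is a Dim statement.
A = "'" & 500 * 6 & "'"
If A <> "''" Then   ' We should never get here!!!!!
   Msgbox "See. We Tried!"   ' what does he mean 'we'?
Dim c As String = "another" ' this "comment" has double quotes
'''

def comment_start_simple(line):
    """Return the index where the simple pattern matches, or None if it finds nothing."""
    m = SIMPLE.search(line)
    return m.start() if m else None

lines = SAMPLE.splitlines()
for line in lines:
    # The last sample line has a comment, but the pattern reports None for it.
    print(comment_start_simple(line), "|", line)
```

The quoted apostrophes on the second and third lines are skipped correctly, because each is followed by a double quote before the end of the line. The same rule is what makes the last line fail: the comment itself contains double quotes, so no match is found.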
I believe, if I interpreted it right, that since this is code the user wants the comments removed from, they can use the VS Find and Replace. I used his example to generate the expression I came up with, and it removed all the comments.
John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove

EricEric, do you want a Regex that can be used from the VS Find/Replace feature, or are you trying to write a program to do this? Also, do you only want to remove comments at the end of program lines (in other words, do you want to retain comments that are alone on a line or preceded only by whitespace)? As for the location where the comment starts, that can only be obtained with a program-based Regex.

If you want a VS Find/Replace Regex that can find all comments in VB, then John's pattern is close, but I would change it to this...

(^|:b+)'.+

... otherwise, it misses several lines (those lines that begin with a comment, and comments like '----- and '<summary> and even an empty comment).
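
For readers without the VS editor at hand, here is a rough Python sketch of what the Find/Replace pattern above does. It assumes that :b in the VS Find/Replace syntax means a space or a tab, and the function name strip_with_vs_like is made up for this illustration; treat it as an approximation of the editor's behaviour, not a reproduction of it.

```python
import re

# Assumed Python equivalent of (^|:b+)'.+ with :b read as "space or tab".
VS_LIKE = re.compile(r"(^|[ \t]+)'.+", re.MULTILINE)

def strip_with_vs_like(text):
    """Delete every match of the pattern, as Replace With: <empty> would."""
    return VS_LIKE.sub("", text)

print(strip_with_vs_like("Dim A as String ' This is a Dim statement."))
print(strip_with_vs_like('A = "\'" & 500 * 6 & "\'"'))
# A quoted apostrophe preceded by a space is also treated as a comment start:
print(strip_with_vs_like('Msgbox "don\'t say \'we\'"'))
```

The pattern relies on whitespace (or the start of the line) appearing before the apostrophe, so it leaves "'" alone but would cut a string literal that contains a space followed by an apostrophe. That is fine for a supervised Find/Replace session, and it is the case the quote-aware pattern later in this post is meant to handle.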

But, if you want to use a program and obtain the indexes to the start of the comment, you will need Eping Wang's pattern. You would use...

Dim mx As Match = reg.Match(vbString)

... to get the comment match. Then mx.Index and mx.Length can be used for working with the match in your string.

Lastly, Eping Wang correctly points out that his pattern does not properly match comments that contain double quotes. To avoid the problem, one must only match comments when not inside of a double-quoted string. Here is an example in C# using Groups and pairing to only select a single quote to the end of line when the single quote appears outside of a string...

            string pattern = @"
                (
                    ""(?(quote)(?<-quote>)|(?<quote>))
                    | [^""']
                    | (?(quote)('))
                )*
                (?(quote)(?!))(?<comment>'.*)
                ";
            string test = @"
                Dim A As String ' This is a Dim statement. 
                A = ""'"" & 500 * 6 & ""'""
                If A <> ""''"" Then   ' We should never get here!!!!!
                    MsgBox(""See. We Tried!"")   ' what does he mean 'we'?
                End If
                Dim b As String = ""How About this"" 'here's a comment
                Dim c as string = ""another"" ' this ""comment"" has double quotes...
                ";

            foreach (Match m in Regex.Matches(test, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline))
            {
                Group mx = m.Groups["comment"];
                Console.WriteLine("{0}, {1}: {2}", mx.Index, mx.Length, mx.Value);
            }


Les Potter, Xalnix Corporation, Yet Another C# Blog
xalnix
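
Since the original question asks for a line-by-line routine that reports where a comment starts, the same "only outside a double-quoted string" rule can also be sketched without a regex. The following is a minimal Python illustration, not code from the thread; the names find_comment_start and strip_comment are hypothetical, and trimming the whitespace left in front of the comment is an assumption about the desired output.

```python
def find_comment_start(line):
    """Return the index of the apostrophe that starts a comment, or -1 if there is none."""
    in_string = False
    for i, ch in enumerate(line):
        if ch == '"':
            # Each double quote toggles the state; an escaped "" inside a
            # string toggles twice, so the state ends up unchanged.
            in_string = not in_string
        elif ch == "'" and not in_string:
            return i
    return -1

def strip_comment(line):
    """Return the line without its trailing comment (assumption: trailing spaces are trimmed too)."""
    i = find_comment_start(line)
    return line if i < 0 else line[:i].rstrip()

for sample in [
    "Dim A as String ' This is a Dim statement.",
    'A = "\'" & 500 * 6 & "\'"',
    'If A <> "\'\'" Then   \' We should never get here!!!!!',
    '   Msgbox "See. We Tried!"   \' what does he mean \'we\'?',
]:
    print(find_comment_start(sample), "|", strip_comment(sample))
```

Because the scan only tracks whether it is inside a string, double quotes that appear inside the comment itself do not matter: the function returns at the first apostrophe found outside a string and never looks at the rest of the line.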
