Regex filter does not find first instance of end of string

I have a string variable with HTML similar to this:

<H3>Oahu</H3>
<TABLE width="100%"> <TBODY>
<TR> <TD>XXXXXXXX</TD> <TD>XXXXXXXX</TD> </TR> <TR> <TD>XXXXXXXX</TD> <TD>XXXXXXXX</TD></TR></TBODY></TABLE>
<H3>Maui</H3> <TABLE> <TBODY>
<TR> <TD>XXXXXXXX</TD> <TD>XXXXXXXX</TD> </TR> <TR> <TD>XXXXXXXX</TD> <TD>XXXXXXXX</TD></TR></TBODY></TABLE>

I actually have 12 or so tables, none of which are nested. They follow one after the other, with an <H3> tag or sometimes an extra <H1> tag between the tables. The trouble I'm having is with my code:
Regex tableRegex = new Regex(@"(<TABLE (.*)>(.*)</TABLE>)", RegexOptions.Singleline);
locations = tableRegex.Match(divCompChart).Value;

It returns all 12 tables rather than just the first table. It seems as though the Regex engine is not matching the </TABLE> with the very first end table tag but skips all of them until it finds the last instance of the end table tag.
spyguyhi
Make your quantifiers non-greedy (lazy). This way they will consume only what's needed to satisfy the pattern. A greedy .* grabs as much as it can and gives characters back only when the rest of the pattern fails, so with RegexOptions.Singleline it runs on to the last </TABLE> in the string instead of stopping at the first one.

"(<TABLE (.*?)>(.*?)</TABLE>)"
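
A minimal sketch of the difference, assuming a shortened two-table sample string. The class name GreedyVsLazyDemo, the Html constant, and the FirstMatch helper are made up for illustration and are not from the original post:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Shortened stand-in for the poster's HTML; assumed for illustration only.
    public const string Html =
        "<H3>Oahu</H3>\n<TABLE width=\"100%\"> <TBODY>\n<TR> <TD>A</TD> </TR></TBODY></TABLE>\n" +
        "<H3>Maui</H3> <TABLE> <TBODY>\n<TR> <TD>B</TD> </TR></TBODY></TABLE>";

    // Returns the first match of the pattern; Singleline lets '.' match newlines too.
    public static string FirstMatch(string pattern)
    {
        return Regex.Match(Html, pattern, RegexOptions.Singleline).Value;
    }

    public static void Main()
    {
        string greedy = FirstMatch(@"<TABLE (.*)>(.*)</TABLE>");
        string lazy = FirstMatch(@"<TABLE (.*?)>(.*?)</TABLE>");

        // Greedy: runs on to the last </TABLE>, so the Maui heading is swallowed too.
        Console.WriteLine(greedy.Contains("Maui"));   // True
        // Lazy: stops at the first </TABLE>, so only the Oahu table is returned.
        Console.WriteLine(lazy.Contains("Maui"));     // False
    }
}
```

Note that Regex.Match only ever returns one match; the greedy version looks like "all the tables" because that single match stretches from the first opening tag to the last closing tag.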

  • Marked As Answer by spyguyhi Thursday, September 17, 2009 8:26 PM
  • Unmarked As Answer by spyguyhi Thursday, September 17, 2009 8:27 PM
  • Marked As Answer by spyguyhi Thursday, September 17, 2009 8:27 PM
syntaxeater

Awesome, that worked perfectly...

spyguyhi
Not sure where you got that pattern:

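// Note: .+ is greedy, so with Singleline it captures through the last </TABLE>; use .+? to stop at the first one.
String pattern = @"<TABLE[^>]*>(?<Data>.+)<\/TABLE>";
Regex rx = new Regex(pattern, RegexOptions.Singleline);
Match m = rx.Match(divCompChart);
while (m.Success)
{
    // Print the text captured by the named group "Data"
    Console.WriteLine(m.Groups["Data"].Value);
    m = m.NextMatch();
}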

John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove

Wow. O.K., I'm a beginner with regular expressions and only discovered them in the last two or three days. So, John, your example is probably closer to what I need, but it is confusing for me, which is why I used the more simplified code above. It would be helpful if you could explain what each line does, or at least break down what each part of your string pattern does. I read through all the MSDN Library explanations of the regex language elements, but the explanations are not extensive and refer mostly to single characters rather than whole lines of code. For instance:

String pattern = @"<TABLE
[^>]* # what does this portion do?
>
(?<Data>.+) # what does this do as well?
<\/TABLE>"; # do I need to escape the / since it is working for me now without doing so?

In the while loop are you writing to the screen the string value that is between the TABLE tags?

spyguyhi
[^>]* means any character other than >, repeated zero or more times
> means a literal >
(?<Data>.+) is a "named" capture group called Data: any character, one or more repetitions

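You can escape the /, but in .NET it is not required, since / is not a special character in .NET patterns. That is why it works for you without the escape.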

Just as a demonstration, I was showing how you could start from that single match and iterate through multiple matches with NextMatch() if necessary.
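
To tie the pieces of the pattern together, here is a hedged sketch of the same kind of loop with the capture made lazy (.+?) so that each table becomes its own match. The sample string, the class name TableDataDemo, and the ExtractTables helper are hypothetical and only there for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TableDataDemo
{
    // Hypothetical helper: returns the text between each <TABLE ...> tag and its closing </TABLE>.
    public static List<string> ExtractTables(string html)
    {
        // <TABLE        literal text
        // [^>]*         any characters other than '>', zero or more (the tag's attributes, if any)
        // >             literal '>'
        // (?<Data>.+?)  named group "Data": one or more characters, as few as possible
        // </TABLE>      literal closing tag ('/' works unescaped in .NET; \/ also works)
        Regex rx = new Regex(@"<TABLE[^>]*>(?<Data>.+?)</TABLE>", RegexOptions.Singleline);

        var results = new List<string>();
        Match m = rx.Match(html);
        while (m.Success)
        {
            results.Add(m.Groups["Data"].Value);
            m = m.NextMatch();   // continue searching after the previous match
        }
        return results;
    }

    public static void Main()
    {
        // Assumed sample input, not the poster's real data.
        string html = "<H3>Oahu</H3><TABLE width=\"100%\"><TR><TD>A</TD></TR></TABLE>" +
                      "<H3>Maui</H3><TABLE><TR><TD>B</TD></TR></TABLE>";

        foreach (string data in ExtractTables(html))
        {
            Console.WriteLine(data);
        }
        // Prints:
        // <TR><TD>A</TD></TR>
        // <TR><TD>B</TD></TR>
    }
}
```

Because [^>]* can match zero characters, this also picks up a bare <TABLE> tag with no attributes. If you prefer a foreach loop, rx.Matches(html) should give you the same matches as repeatedly calling NextMatch().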
JohnGrove
Download and install Expresso, which is a free regex tester.

You have to register it, but it is free.
JohnGrove
Thanks for the link to Expresso; I'll give that a try. I've been fiddling with your code, since I noticed it lets me iterate through multiple occurrences of what I am looking for.
spyguyhi
