 

Regular Expression problem

Hi, I need help with the code below. It extracts the title of a web page. It works fine for some web sites, but for others it does not work and freezes Visual Web Developer. I do not understand why.


strRegex = "<title> *((\b.*\b)*|\b.*) *</title>"
Regex = New System.Text.RegularExpressions.Regex(strRegex)
Dim matcha As Match = Regex.Match(resultslink, strRegex)
If matcha.Success Then
    objCommand.Parameters("@productname").Value = matcha.Groups(1).Value()
End If

It does not work for this web page:
http://www.teknosepet.com/?urun-detay_39070_Samsung_PL_50_10_2Mp_2_7_inc_LCD_Dijital_Fotograf_Makinesi_Turkce_Menu__45TL_Degerinde_2_GB_Hafiza_Karti___Tripod_Hediyeli_____Stoktan_Ayni_Gun_Kargo___Anadolu_Grup_Garantisinde__.html

resultslink is a string that contains the HTML of the above web page.

Erdem ISBILEN

Dim pattern As String = "<title>(?<Title>[^<]*)<\/title>"

Dim rx As New Regex(pattern, RegexOptions.IgnoreCase)
Dim m As Match = rx.Match(resultslink)
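If (m.Success) Then
    ' Use the indexer, as in the original question: Parameters.Add("@productname") has no .Value to assign.
    objCommand.Parameters("@productname").Value = m.Groups("Title").Value
End If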
John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove
Erdem,
Your pattern "works" for some pages but "freezes" on others because of backtracking. Regular expressions that combine alternation (|) with the repetition symbols '*' and '+', especially nested repetition such as (\b.*\b)*, can take exponential time on long strings that match or nearly match. The engine is not technically frozen, but on a very long string it might as well be. If you want to prove it to yourself, create several test titles and try your pattern on them, increasing their length one character at a time until you see the execution time climb. You can usually avoid the problem with a simpler pattern.

John's pattern should work for you unless you are trying to eliminate leading and trailing spaces from your capture. In that case, call the Trim() method on the captured value.
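
The timing test suggested above could be sketched roughly as follows. This is only an illustration, not code from the thread: the test input (an opening tag followed by repeated words and no closing tag) is made up, and the two-second match timeout is an added safety net that assumes .NET Framework 4.5 or later. The run time is expected to climb steeply as words are added, but the exact numbers depend on the machine and runtime.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Linq
Imports System.Text.RegularExpressions

Module BacktrackingDemo
    Sub Main()
        ' The pattern from the original question, with repetition nested inside a repeated group.
        Dim slowPattern As String = "<title> *((\b.*\b)*|\b.*) *</title>"

        ' Assumption: a match timeout (available from .NET 4.5) so the test cannot hang.
        Dim rx As New Regex(slowPattern, RegexOptions.None, TimeSpan.FromSeconds(2))

        For words As Integer = 2 To 30 Step 2
            ' Hypothetical "nearly matching" input: the closing tag is missing,
            ' so the engine has to exhaust its alternatives before giving up.
            Dim html As String = "<title>" & String.Join(" ", Enumerable.Repeat("word", words))

            Dim sw As Stopwatch = Stopwatch.StartNew()
            Try
                Dim found As Boolean = rx.IsMatch(html)
                sw.Stop()
                Console.WriteLine("{0,2} words: {1} ms, match = {2}", words, sw.ElapsedMilliseconds, found)
            Catch ex As RegexMatchTimeoutException
                ' Thrown by the regex engine once the configured timeout is exceeded.
                Console.WriteLine("{0,2} words: gave up after the 2 second timeout", words)
                Exit For
            End Try
        Next
    End Sub
End Module
```

Swapping slowPattern for the simpler "<title>(?<Title>[^<]*)<\/title>" in the same loop is a quick way to compare the two patterns on identical input.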
Les Potter, Xalnix Corporation, Yet Another C# Blog
xalnix

Thank you so much, your code works perfectly.
Erdem ISBILEN
Thanks for the detailed information and clarification.
Erdem ISBILEN
Good point, Les:

Dim pattern As String = "<title>(?<Title>[^<]*)<\/title>"

Dim rx As New Regex(pattern, RegexOptions.IgnoreCase)
Dim m As Match = rx.Match(resultslink)
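If (m.Success) Then
    ' Use the indexer, as in the original question: Parameters.Add("@productname") has no .Value to assign.
    objCommand.Parameters("@productname").Value = m.Groups("Title").Value.Trim()
End If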
John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove
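
For anyone who wants to try the final pattern outside a database call, here is a minimal self-contained sketch. The ExtractTitle helper and the sample HTML are hypothetical, invented for illustration. The pattern stays fast because [^<]* can match each character in only one way, so there is nothing for the engine to re-try. It also matches across line breaks, which is why Trim() is useful.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TitleDemo
    ' Hypothetical helper (not from the thread): returns the trimmed title text,
    ' or Nothing when no <title> element is found.
    Function ExtractTitle(html As String) As String
        Dim rx As New Regex("<title>(?<Title>[^<]*)</title>", RegexOptions.IgnoreCase)
        Dim m As Match = rx.Match(html)
        If m.Success Then
            ' Trim() removes leading and trailing whitespace, including line breaks.
            Return m.Groups("Title").Value.Trim()
        End If
        Return Nothing
    End Function

    Sub Main()
        ' Made-up sample: upper-case tags and a title padded with spaces and newlines.
        Dim sample As String = "<html><head><TITLE>" & Environment.NewLine &
                               "   Samsung PL 50   " & Environment.NewLine &
                               "</TITLE></head></html>"
        Console.WriteLine("[" & ExtractTitle(sample) & "]")   ' prints [Samsung PL 50]
    End Sub
End Module
```

Note that this pattern assumes a bare <title> tag with no attributes and no '<' character inside the title text.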
