|
I am trying to capture money amounts that can have a leading minus sign or parentheses around the amount. The regular amount would be matched by \b(?<amt>\d{1,3}(,\d\d\d){0,4}\.\d\d)\b
To capture the leading minus sign I tried (?<amt>((-|\b)(\d{1,3}(,\d\d\d){0,4})[.]\d\d))\b and \b(?=(-|\d))(?<amt>((-|\b)(\d{1,3}(,\d\d\d){0,4})[.]\d\d))\b but I just don't seem to get the leading minus sign. | | fs - new to w7 | Please provide samples of the data you want to catch. John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com | | JohnGrove | Thank you.
Here are some samples of the data:
not the following: 13.999 the following are the valid ones: 4,123.95 7,654,321.72 987,999.11 should skip the following one: -13.999 Want to capture the next 3: -4,123.95 -7,654,321.72 -987,999.11 should never capture these: 12093847 .12
Eventually I will also try to capture amounts like these: (12,789.27) (43.28) | | fs - new to w7 | You are saying that 12093847.12 should not be captured, and also that -13.999 should be skipped. Do you mean that amounts without commas and amounts with more than 2 decimal places should not be captured?
Anyway, to capture a leading - sign you can add "\-?" at the beginning of the regex pattern.
-Paras | | paras kumar | Here's a C# example...
string pattern = @"(?=\b|\(|-)((?<paren>\()|(?<dash>-)|(?<none>()))\d{1,3}(,\d{3}){0,4}[.]\d{2}(?(paren)(\))|(\b))";
string test = @"
not the following: 13.999 the following are the valid ones 4,123.95 7,654,321.72 987,999.11
should skip the folowing one: -13.999 Want to capture the next 3 -4,123.95 -7,654,321.72 -987,999.11
should never capture these 12093847 .12
eventually I will try to capture also amouts like these (12,789.27) (43.28)";
foreach (Match mx in Regex.Matches(test, pattern))
Console.WriteLine("{0}", mx.Value);
Les Potter, Xalnix Corporation, Yet Another C# Blog | | xalnix | Thank you. I am trying to use your suggested pattern with named groups and explicit capture, as I will be using the final expression along with others to extract information from some text files. Actually, the final money amount regex will be a named regex definition to be called from other regex expressions.
(?=\b|\(|-)(?<amt>((?<paren>\()|(?<dash>-)|(?<none>()))\d{1,3}(,\d{3}){0,4}[.]\d{2}(?(paren)(\))|(\b)))
gives only this one from the sample test data: 987,999.11
The same goes for (?<amt>(?=\b|\(|-)((?<paren>\()|(?<dash>-)|(?<none>()))\d{1,3}(,\d{3}){0,4}[.]\d{2}(?(paren)(\))|(\b)))
Finally, as an explicit capture regex: (?<amt>(-\d{1,3}(,\d{3}){0,4}[.]\d{2}\b)|(\(\d{1,3}(,\d{3}){0,4}[.]\d{2}\))|(\b\d{1,3}(,\d{3}){0,4}[.]\d{2}\b)) With this one I got total success, including when I repeated the minus-leading test data with surrounding parentheses:
not the following: 13.999 the following are the valid ones 4,123.95 7,654,321.72 987,999.11 should skip the following one: -13.999 Want to capture the next 3 -4,123.95 -7,654,321.72 -987,999.11 should never capture these 12093847 .12 not the following: 13.999 the following are the valid ones 4,123.95 7,654,321.72 987,999.11 should skip the following one: -13.999 Want to capture the next 3 -4,123.95 -7,654,321.72 -987,999.11 should never capture these 12093847 .12 1234,432,16 should skip this: (13.999) these are valid: (4,123.95) (7,654,321.72) (987,999.11) except: (1234,432,16)
Thanks, Les
I would appreciate a shorter regex, and many thanks in advance. | | fs - new to w7 | Yes! It has to be in the proper format, as in US or Canadian currency without the currency sign, but a set of surrounding parentheses should also be allowed for a negative number. | | fs - new to w7 | On the data you provided, this pattern may work:
-?\(?[\d,]+?\.\d\d\b\)?
But there are still problems: it can match or partly match unwanted data such as these: 1-3.99 -(13.99) (13.99 13.99) 13(13.99) If you are sure there is no such data in your source, it will do.
Or you had better match the minus-leading and the parenthesis-surrounded amounts separately. Pattern 1: -?[\d,]+?\.\d\d\b Pattern 2: \([\d,]+?\.\d\d\b\) Or you can combine them with a (?:Pattern 1)|(?:Pattern 2) structure, but it looks too long. To avoid a digit before the minus sign or parenthesis, you can add another restriction before your pattern; it depends on the context of your data: is it in text, in a table or in lines?
Here's a screenshot of my pattern working on the data you listed: http://www.wonderstudio.cn/soft/grep/exp/090909.gif www.wonderstudio.cn | | Eping Wang | BTW, I used [\d,]+?\.\d\d\b to match the currency amount, but that is just a simple way: it also matches commas placed anywhere among the digits, such as 1,2,3 1,23 123, If that is possible in your data, you can use the more accurate pattern \d{1,3}(,\d{3})*\.\d\d\b to match the currency amount. Then my pattern could be
-?\(?\d{1,3}(,\d{3})*\.\d\d\b\)?
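As a rough illustration of the combined (?:Pattern 1)|(?:Pattern 2) structure with a restriction placed in front of it, here is a minimal C# sketch. The negative lookbehind (?<![\w.,(-]) is only one assumed way to express "no digit, minus sign or open parenthesis directly before the amount"; it was not proposed in this thread, and the class and method names (CombinedAmountDemo, FindAmounts) are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CombinedAmountDemo
{
    // The "accurate" number part: 1-3 digits, comma-separated groups of 3, two decimals.
    private const string Number = @"\d{1,3}(?:,\d{3})*\.\d\d";

    // Assumption: the lookbehind rejects a candidate that directly follows a word
    // character, '.', ',', '(' or '-'. After it, either an optionally minus-led
    // amount ending at a word break, or an amount inside a matched pair of parentheses.
    private const string Pattern =
        @"(?<![\w.,(-])(?:-?" + Number + @"\b|\(" + Number + @"\))";

    // Hypothetical helper: returns the text of every match found in the input.
    public static List<string> FindAmounts(string text)
    {
        var amounts = new List<string>();
        foreach (Match mx in Regex.Matches(text, Pattern))
        {
            amounts.Add(mx.Value);
        }
        return amounts;
    }

    public static void Main()
    {
        // Three well-formed amounts followed by the problem cases listed above.
        string test = "4,123.95 -987,999.11 (43.28) 1-3.99 -(13.99) (13.99 13(13.99) 12093847.12";
        foreach (string amount in FindAmounts(test))
        {
            Console.WriteLine(amount);
        }
    }
}
```

On the sample string in Main this should print 4,123.95, -987,999.11 and (43.28) and skip the rest. An amount with only a closing parenthesis, such as 13.99), would still be matched as 13.99, so this sketch does not close every gap.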
I think Xalnix's pattern can solve the problem of a broken parenthesis pair and avoid a minus sign together with a left parenthesis, all in one pattern. www.wonderstudio.cn | | Eping Wang |
That was my intent.
(?=\b|\(|-)((?<paren>\()|(?<dash>-)|(?<none>()))\d{1,3}(,\d{3}){0,4}[.]\d{2}(?(paren)(\))|(\b))
(?=\b|\(|-) # look ahead for the beginning of the number: it can be a word break, an open paren or a minus sign
((?<paren>\()|(?<dash>-)|(?<none>())) # begin capturing the value, expecting an open paren, a minus or nothing; remember what was found
\d{1,3}(,\d{3}){0,4}[.]\d{2} # very similar to your original pattern; this captures the number part according to your format limitations
(?(paren)(\))|(\b)) # test whether you started with an open paren; if so, expect a close paren, otherwise expect a word break
This pattern is designed to pick the numbers out of a larger string which may contain multiple matches. To get at the amount...
string pattern = @"(?=\b|\(|-)(?<amt>((?<paren>\()|(?<dash>-)|(?<none>()))\d{1,3}(,\d{3}){0,4}[.]\d{2}(?(paren)(\))|(\b)))";
Console.WriteLine("{0}: {1}", mx.Value, mx.Groups["amt"].Value);
...works for me. Are you using C# or some other tool? The (?(paren)(\))|(\b)) portion will not work in the MS VBScript version of Regex.
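For reference, the fragments above can be assembled into a complete console program along these lines. This is only a sketch: the class name MoneyRegexDemo and the helper FindAmounts are made up for illustration, and the test string is abbreviated from the sample data posted earlier in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MoneyRegexDemo
{
    // The pattern from this thread, wrapped in the named group "amt".
    private const string Pattern =
        @"(?=\b|\(|-)(?<amt>((?<paren>\()|(?<dash>-)|(?<none>()))\d{1,3}(,\d{3}){0,4}[.]\d{2}(?(paren)(\))|(\b)))";

    // Hypothetical helper: returns the text of the "amt" group for every match.
    public static List<string> FindAmounts(string text)
    {
        var amounts = new List<string>();
        foreach (Match mx in Regex.Matches(text, Pattern))
        {
            amounts.Add(mx.Groups["amt"].Value);
        }
        return amounts;
    }

    public static void Main()
    {
        // Abbreviated sample: valid, minus-led and parenthesized amounts
        // mixed with values that should be skipped.
        string test = "13.999 4,123.95 -13.999 -7,654,321.72 12093847 .12 (12,789.27) (43.28)";
        foreach (string amount in FindAmounts(test))
        {
            Console.WriteLine(amount);
        }
    }
}
```

On that abbreviated input I would expect it to print 4,123.95, -7,654,321.72, (12,789.27) and (43.28), one per line. If a tool reports something different, it is worth checking which regex options and which regex engine it applies.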
Les Potter, Xalnix Corporation, Yet Another C# Blog | | xalnix |
|