How To Remove Unwanted Text

Hi,

Hope someone here can help me.

We are busy writing an application to import delimited files from specific locations into a database table. The data in one column has some extra characters on the end that need to be trimmed off. The string looks like this:

PAPPRD_STOCKI_LOW_151308_1.1.072

The first part of the string (PAPPRD_STOCKI_LOW) needs to stay. We need a way to trim off everything from the third underscore (_) to the end. There are several rows with fields like this, and the "STOCKI" and "LOW" parts are different in some of them.

Is there a way to do this with a regex? Or am I barking up the wrong tree?
TiaanB

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            String testWord = "PAPPRD_STOCKI_LOW_151308_1.1.072";
            // Capture a run of letters and underscores that is immediately
            // followed by an underscore and a digit; the lookahead leaves
            // that suffix out of the match.
            String pattern = @"(?<Word>[a-z_]*(?=_\d))";
            Regex rx = new Regex(pattern, RegexOptions.IgnoreCase);
            String word = String.Empty;
            Match m = rx.Match(testWord);
            if (m.Success)
            {
                word = m.Groups["Word"].Value;
            }
            Console.WriteLine(word); // PAPPRD_STOCKI_LOW
        }
    }
}

John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
  • Marked As Answer by TiaanB Thursday, September 17, 2009 6:48 AM
JohnGrove
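
As a rough illustration of how the pattern in the answer above depends on where the first "underscore followed by a digit" falls, the sketch below runs it against the sample string and against a made-up variant. The second input, the class name LookaheadDemo and the helper Extract are assumptions for illustration only; they do not come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    // Same pattern and options as in the answer above.
    private static readonly Regex Rx =
        new Regex(@"(?<Word>[a-z_]*(?=_\d))", RegexOptions.IgnoreCase);

    // Returns the "Word" group, or an empty string when nothing matches.
    public static string Extract(string input)
    {
        Match m = Rx.Match(input);
        return m.Success ? m.Groups["Word"].Value : String.Empty;
    }

    static void Main()
    {
        // Sample string from the question: the digits start after the third underscore.
        Console.WriteLine(Extract("PAPPRD_STOCKI_LOW_151308_1.1.072")); // PAPPRD_STOCKI_LOW

        // Hypothetical row whose second segment starts with a digit:
        // the match now stops at the first underscore instead of the third.
        Console.WriteLine(Extract("PAPPRD_2STOCK_LOW_151308_1.1.072")); // PAPPRD
    }
}
```

Whether this matters depends on the real data: if no segment before the third underscore ever begins with a digit, the pattern gives the expected prefix.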
John,
Though your pattern handles the sample string here, I don't think it is an exact solution.
TiaanB wants to "trim off everything from the third underscore(_) to the end",
but your pattern just keeps everything before an underscore followed by a digit.
If the first digit appears at another position, it will fail.

Even with your approach, I think simply replacing all matches of _\d.+?$ or (_\d+)+ with an empty string is simpler and faster.
But I think using _\d as the splitter is not reliable.
The key is the third underscore (_).

If you don't improve your pattern, I think the answer for this thread would go to Xalnix. :)


www.wonderstudio.cn
  • Marked As Answer by TiaanB Thursday, September 17, 2009 6:48 AM
Eping Wang
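
One possible way to key on the third underscore itself, as suggested above, is sketched below in two forms: an anchored regex and a plain IndexOf loop. This is a minimal sketch, not code posted in the thread. The class and method names are hypothetical, and returning the input unchanged when it has fewer than three underscores is an assumption about the desired behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

static class ThirdUnderscoreTrimmer
{
    // From the start: two segments each ending in '_', then a third segment
    // that stops just before the next underscore (or at the end of the input).
    private static readonly Regex Prefix = new Regex(@"^(?:[^_]*_){2}[^_]*");

    // Regex version. Assumption: input with fewer than two underscores
    // fails to match and is returned unchanged.
    public static string TrimWithRegex(string input)
    {
        Match m = Prefix.Match(input);
        return m.Success ? m.Value : input;
    }

    // Non-regex version: locate the third underscore and cut there.
    public static string TrimWithIndexOf(string input)
    {
        int index = -1;
        for (int i = 0; i < 3; i++)
        {
            index = input.IndexOf('_', index + 1);
            if (index < 0)
            {
                return input; // fewer than three underscores: nothing to trim
            }
        }
        return input.Substring(0, index);
    }

    static void Main()
    {
        string sample = "PAPPRD_STOCKI_LOW_151308_1.1.072";
        Console.WriteLine(TrimWithRegex(sample));   // PAPPRD_STOCKI_LOW
        Console.WriteLine(TrimWithIndexOf(sample)); // PAPPRD_STOCKI_LOW
    }
}
```

Neither version looks at digits, so the result does not change if a segment such as "STOCKI" or "LOW" happens to contain or start with one.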

Thanks John. I've been scratching my head over this one for days now. Being new to Regexes didn't help much either.

Much appreciated.
  • Unmarked As Answer by TiaanB Thursday, September 17, 2009 6:48 AM
  • Marked As Answer by TiaanB Thursday, September 17, 2009 6:48 AM
TiaanB

If this helped, can we close this thread?


John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove
Thank you Eping. With your & John's help I can now put this to bed.
TiaanB
Thanks Eping for the critique on it. It is always nice to get other expert opinions like yours.
John Grove - TFD Group, Senior Software Engineer, EI Division, http://www.tfdg.com
JohnGrove
