Search through HTML files in a folder and remove a line of text from any HTML file that contains that text, using a script or similar

Search through HTML files in a folder and remove a line of text from any HTML file that contains that text, using a script or similar
Pie Eater
It would help if you specified the "text", but here is a skeleton of a couple of approaches. For very simple (exact-text) replacements, use the non-Regex version. For anything even slightly more complex, use the Regex version; your work is then to define a pattern that works for you. The pattern in my Regex version removes all blank lines, where a blank line is any line containing nothing but zero or more whitespace characters. I am using delegates to switch between the Regex and non-Regex approaches. You can simplify this greatly once you decide which one you want...

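        // Needs: using System; using System.IO; using System.Text.RegularExpressions; using System.Windows.Forms;
        private void button1_Click(object sender, EventArgs e)
        {
            // Let the user pick the folder; do nothing if the dialog is cancelled.
            using (FolderBrowserDialog fbd = new FolderBrowserDialog())
            {
                if (fbd.ShowDialog() == DialogResult.OK)
                {
                    FixHtmlRegex(fbd.SelectedPath);
                }
            }
        }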

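        private void FixHtmlRegex(string dir)
        {
            // Matches a line that holds only horizontal whitespace, together with its
            // line terminator, so the replacement removes the whole line.
            // (With ^\s*?$ the trailing "\n" is left behind and the blank line survives.)
            string pattern = @"^[^\S\r\n]*\r?\n";
            Func<string, string> doReplace = delegate(string val)
            {
                return Regex.Replace(val, pattern, "", RegexOptions.Multiline);
            };
            Predicate<string> isMatch = delegate(string val)
            {
                return Regex.IsMatch(val, pattern, RegexOptions.Multiline);
            };
            FixHtml(dir, isMatch, doReplace);
        }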

        private void FixHtml(string dir)
        {
            // Non-Regex version: exact-text replacement only.
            // Here the "text" is a tab followed by a Windows line break.
            Func<string, string> doReplace = delegate(string val)
            {
                return val.Replace("\t\r\n", "");
            };
            Predicate<string> isMatch = delegate(string val)
            {
                return val.Contains("\t\r\n");
            };
            FixHtml(dir, isMatch, doReplace);
        }

        private void FixHtml(string dir, Predicate<string> isMatch, Func<string,string> doReplace)
        {
            string backupDir = Path.Combine(dir, "BackupFiles");
            // On .NET Framework, a three-character extension pattern such as "*.htm"
            // also returns files whose extension begins with it, e.g. ".html".
            string[] names = Directory.GetFiles(dir, "*.htm");
            foreach (string name in names)
            {
                string html = File.ReadAllText(name);
                if (isMatch(html))
                {
                    string newHtml = doReplace(html);
                    // CreateDirectory does nothing if the folder already exists.
                    Directory.CreateDirectory(backupDir);
                    // Move the original into the backup folder (this throws if a backup with
                    // the same name is already there), then write the fixed text under the old name.
                    File.Move(name, Path.Combine(backupDir, Path.GetFileName(name)));
                    File.WriteAllText(name, newHtml);
                }
            }
        }
    }


Les Potter, Xalnix Corporation, Yet Another C# Blog
xalnix
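
For the question as originally asked (drop a whole line when it contains a given text), the Regex route could look roughly like the sketch below. It is only an illustration: the class name `LineRemover`, the method name `RemoveLinesContaining`, and the sample text "REMOVE ME" are made up here and are not part of the code above. It assumes the unwanted text sits on a single line and that lines end in "\r\n" or "\n".

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineRemover
{
    // Removes every line that contains the given literal text,
    // together with that line's terminator (if it has one).
    public static string RemoveLinesContaining(string html, string text)
    {
        // Regex.Escape turns the search text into a literal, so characters
        // such as '.', '(' or '<' in it have no special meaning.
        // [^\r\n]* keeps the match inside one line; (\r?\n)? swallows the line break.
        string pattern = @"^[^\r\n]*" + Regex.Escape(text) + @"[^\r\n]*(\r?\n)?";

        // Multiline makes ^ match at the start of every line, not only at the start of the string.
        return Regex.Replace(html, pattern, "", RegexOptions.Multiline);
    }
}
```

A result from `LineRemover.RemoveLinesContaining(html, "REMOVE ME")` could be returned from the `doReplace` delegate, with `html.Contains("REMOVE ME")` serving as the `isMatch` check.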
Hi Mate!

Sorry, here are the full details. I have a folder of files in which a software package we use puts our company name at the bottom as the text "Blue Tank Fish", although our name has now changed to "Gone Fishing". When the package was designed, the text at the bottom must have been hard-coded into the software, so I am after a script that would sit in the same directory as all these files, go through all of them, and make the name change.

Hope this makes sense.

Mark
Pie Eater

Have you tried a find-and-replace tool?
Many editors have a global replace-in-files function,
and there are many professional find-and-replace tools for more advanced jobs.
I think this replacement is very simple: just replace "Blue Tank Fish" with "Gone Fishing".
We don't even need a regex here.


www.wonderstudio.cn
Eping Wang
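
If a script is still preferred over an editor, the plain-string replacement could be sketched as below. This is a rough illustration, not a tested tool: the class name `CompanyNameFixer` and the method name `ReplaceInFolder` are hypothetical, the "BackupFiles" folder name is borrowed from the earlier answer, and the files are assumed to be text that `File.ReadAllText` can decode (for example UTF-8).

```csharp
using System;
using System.IO;

public static class CompanyNameFixer
{
    // Replaces oldText with newText in every .htm/.html file directly inside dir.
    // Returns the number of files that were changed.
    public static int ReplaceInFolder(string dir, string oldText, string newText)
    {
        string backupDir = Path.Combine(dir, "BackupFiles");
        int changed = 0;

        // GetFiles(dir) is not recursive, so the backup folder itself is never scanned.
        foreach (string name in Directory.GetFiles(dir))
        {
            string ext = Path.GetExtension(name);
            bool isHtml = ext.Equals(".htm", StringComparison.OrdinalIgnoreCase)
                       || ext.Equals(".html", StringComparison.OrdinalIgnoreCase);
            if (!isHtml) continue;

            string html = File.ReadAllText(name);
            if (!html.Contains(oldText)) continue;

            // Keep a copy of the untouched file; an existing backup is never overwritten,
            // so running the script twice still leaves the first backup in place.
            Directory.CreateDirectory(backupDir);
            string backup = Path.Combine(backupDir, Path.GetFileName(name));
            if (!File.Exists(backup)) File.Copy(name, backup);

            File.WriteAllText(name, html.Replace(oldText, newText));
            changed++;
        }
        return changed;
    }
}
```

To run it from the folder that holds the files, something like `CompanyNameFixer.ReplaceInFolder(Directory.GetCurrentDirectory(), "Blue Tank Fish", "Gone Fishing")` should do. The match is case-sensitive, so "blue tank fish" would be left alone.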
