in reply to Re: regex for search and replace of words in HTML
in thread regex for search and replace of words in HTML
    # strip html tags
    $text =~ s/<[^>]*>//g;
    # strip special chars
    $text =~ s/&[^;]*;//g;
    # shove resulting words into an array
    my @words = $text =~ /(\w+\'*\w+)/g;
My problem of finding the *exact* instance of a particular word still persists. For example, there might be three occurrences of the word "testy" in a document. The first might need to be replaced with "test," while the second and third remain "testy." Therefore, I need to treat each occurrence separately.
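One way to target a single occurrence is to count matches inside the substitution itself. The sketch below is only an illustration: `replace_nth` is a hypothetical helper name, and it assumes occurrences are numbered from 1 and that the word should match on word boundaries.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace only the $n-th (1-based) whole-word
# occurrence of $word in $text and leave every other occurrence alone.
sub replace_nth {
    my ($text, $word, $replacement, $n) = @_;
    my $count = 0;
    # /e evaluates the right-hand side as code for each match, so we can
    # count matches and put the original text ($1) back for the others.
    $text =~ s/\b(\Q$word\E)\b/ ++$count == $n ? $replacement : $1 /ge;
    return $text;
}

my $text = "A testy dog met a testy cat and a testy bird.";
print replace_nth($text, 'testy', 'test', 1), "\n";
```

Passing 2 or 3 as the last argument would change the second or third "testy" instead, so each occurrence can be handled on its own.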
Also, on my resultant global search and replace, what if someone has included the words "img src" in plain text and they want that changed to "image source"? That would blow up all of my <img src=> tags. I know it's a contrived situation, but I know our users.... I am currently working on this test script:
-Justin

    #!/usr/bin/perl
    use strict;
    use warnings;

    # slurp the whole document from STDIN
    my $html = '';
    while (<STDIN>) {
        $html .= $_;
    }

    # record the offsets of every run of text between a '>' and the next '<'
    my $begin = 0;
    my $end = 0;
    my @excerpts = ();
    for (my $i=0;$i<length($html);$i++) {
        if (substr($html,$i,1) eq '>') {
            $begin = $i + 1;
        }
        if ($begin && substr($html,$i,1) eq '<') {
            $end = $i;
        }
        if ($begin && $end) {
            push @excerpts, { begin => $begin, end => $end };
            $begin = 0;
            $end = 0;
        }
    }
    # last snippet
    if ($begin && !$end) {
        push @excerpts, { begin => $begin, end => length($html) };
    }

    foreach my $excerpt (@excerpts) {
        my $begin = $excerpt->{begin} || 0;
        my $end = $excerpt->{end} || 0;
        my $length = $end - $begin;
        my $word_string = substr($html,$begin,$length);
        # ...still working on search and replaces for $word_string...
    }
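The same idea of touching only the text between tags can also be sketched with `split` and a capturing pattern, which keeps the tags in the result list. This is a rough illustration, not a full HTML parser: `replace_outside_tags` is a hypothetical name, and the sketch assumes tags never contain a literal `>` inside an attribute value and ignores comments and script blocks.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run a search and replace only on the text
# between tags, so that something like <img src=...> is never altered.
sub replace_outside_tags {
    my ($html, $search, $replacement) = @_;
    # The capturing group makes split() return the tags too, so the
    # pieces alternate: text, tag, text, tag, ... (index 0 is text,
    # possibly an empty string when the input starts with a tag).
    my @pieces = split /(<[^>]*>)/, $html;
    for my $i (0 .. $#pieces) {
        next if $i % 2;    # odd indexes hold tags: leave them untouched
        $pieces[$i] =~ s/\b\Q$search\E\b/$replacement/g;
    }
    return join '', @pieces;
}

my $html = '<p>Use img src here: <img src="a.png"></p>';
print replace_outside_tags($html, 'img src', 'image source'), "\n";
```

For real-world markup, a parser module such as HTML::Parser from CPAN is likely a safer foundation than any regex over raw HTML.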
holli has replaced pre tags with code tags
Re^3: regex for search and replace of words in HTML
by GrandFather (Saint) on Jun 16, 2005 at 03:05 UTC
by jqcoffey (Novice) on Jun 16, 2005 at 22:13 UTC