Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
I tried messing around with some HTML::Parser code, as well as hstrip, but neither got me where I need to be. I also tried HTML::TagFilter and HTML::TreeBuilder with the same level of success: none. merlyn also has an article on something similar, but his approach removes the tags themselves, leaving only the text values. Close to what I need, but not quite there.
The glitch here is that I need to keep the color="#RRGGBB" attribute in the tag but drop anything else that appears in there, leaving just the font tag with its color attribute and value. The other sticky point is that many people use single quotes around attribute values and some use none at all, so a simple regex would have to be quite smart to figure this out (and would likely be rife with errors).
Doing this exclusively with regexes is going to be prone to failure, especially since tags can be improperly nested, so I can't just yank everything from <font .*?> to </font> and work on the remainder.
Here's an example of what my input could look like, and what I need for final output:
Input:  <font color="#000000" face="Arial,Helvetica" size="1"> Some text </font>
Output: <font color="#000000"> Some text </font>
Can any monk lend a hand?
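For what it's worth, here is a minimal sketch of one way the requirement above could be met with HTML::Parser's version 3 event API. It is not taken from any reply in this thread. The helper name strip_font_attrs is hypothetical, and the sketch assumes HTML::Parser is installed from CPAN. The idea is to rebuild each font start tag from its parsed attributes, keeping only color, and to copy every other piece of the document through unchanged. Because the parser reports each tag as its own event, quoting style and improper nesting should not matter.

```perl
use strict;
use warnings;
use HTML::Parser;

# strip_font_attrs is a hypothetical helper name, not from the thread.
# It returns a copy of $html in which every <font> start tag keeps
# only its color attribute.
sub strip_font_attrs {
    my ($html) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version => 3,

        # Start tags: rebuild <font>, pass all other tags through verbatim.
        start_h => [
            sub {
                my ($tagname, $attr, $text) = @_;
                if ($tagname ne 'font') {
                    $out .= $text;
                    return;
                }
                my $color = $attr->{color};
                if (defined $color) {
                    # Attribute values arrive entity-decoded, so re-escape.
                    $color =~ s/&/&amp;/g;
                    $color =~ s/"/&quot;/g;
                    $out .= qq{<font color="$color">};
                }
                else {
                    $out .= '<font>';
                }
            },
            'tagname, attr, text',
        ],

        # Everything else (text, end tags, comments, declarations):
        # copy the original source text unchanged.
        default_h => [
            sub {
                my ($text) = @_;
                $out .= $text if defined $text;
            },
            'text',
        ],
    );

    $parser->parse($html);
    $parser->eof;
    return $out;
}

print strip_font_attrs(
    q{<font color="#000000" face="Arial,Helvetica" size="1"> Some text </font>}
), "\n";
```

Run against the sample input from the question, this should print the desired output line. HTML::Parser lowercases tag and attribute names before handing them to the handler, so <FONT COLOR=#00ff00 SIZE=1> and <font color='#00ff00'> both come out as <font color="#00ff00">. End tags are copied as written, so a stray or misnested </font> is left alone rather than "fixed".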
Re: Stripping font "face" values from font tags
by Fletch (Bishop) on May 28, 2003 at 16:36 UTC
by Ovid (Cardinal) on May 28, 2003 at 16:41 UTC

Re: Stripping font "face" values from font tags
by CukiMnstr (Deacon) on May 28, 2003 at 16:31 UTC

Re: Stripping font "face" values from font tags
by BrowserUk (Patriarch) on May 28, 2003 at 18:08 UTC