in reply to getting the first n printable words from a string of HTML
You can correlate the words in the list with the HTML as shown below:
my $html = "<h1>Foo</h1><p>Bar</p><p>Some more text here</p>";
my @list = ('Foo','Bar');
# eat up the bits in @list
$html =~ m/$_/gc for @list;
# use \G to match the rest
my ($rest) = $html =~ m/\G(.*)$/;
print $rest;
Here we use /gc and the \G assertion to first eat up the string by matching the words in @list in sequence, and then match the rest of the string starting just past the last match.
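The same idea can be wrapped in a sub that also reports when a word is not found. This is only a sketch: the name rest_after_words is hypothetical, and the \Q...\E quoting and the /s flag are additions that the snippet above does not use.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns whatever follows
# the last word of @words in $html, or nothing if a word cannot be found.
sub rest_after_words {
    my ($html, @words) = @_;
    pos($html) = 0;    # start scanning from the beginning of the copy
    for my $word (@words) {
        # In scalar context /g advances pos($html) past each match, and /c
        # keeps pos() where it was if a match fails.
        # \Q...\E quotes any regex metacharacters in the word.
        return unless $html =~ m/\Q$word\E/gc;
    }
    # Even without /g, \G anchors at pos($html); /s lets . cross newlines.
    my ($rest) = $html =~ m/\G(.*)\z/s;
    return $rest;
}

print rest_after_words(
    "<h1>Foo</h1><p>Bar</p><p>Some more text here</p>", 'Foo', 'Bar'
), "\n";
# prints: </p><p>Some more text here</p>
```

Because the sub works on its own copy of the string, the caller's pos() is left alone.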
Note there are some circumstances where this will fail, such as when a complete element in @list matches an HTML tag or part thereof.
I think it should work in most practical circumstances.
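To make the failure case concrete, here is a small made-up input (not from the original question) where the last word in @list is also a tag name, so the match lands inside the tag instead of on the printable word:

```perl
use strict;
use warnings;

# Illustrative data only: the word 'span' also appears as the tag name <span>.
my $html = "<p>A</p><span>span of text</span>";
my @list = ('A', 'span');

# 'A' matches the printable word, but 'span' then matches inside "<span>".
$html =~ m/\Q$_\E/gc for @list;

my ($rest) = $html =~ m/\G(.*)$/;
print "$rest\n";
# prints: >span of text</span>
```

The wanted remainder would have been " of text</span>". Avoiding this would mean skipping over the tags while matching, which the approach above does not attempt.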
tachyon
Re: Re: getting the first n printable words from a string of HTML
by Vynce (Friar) on May 30, 2001 at 17:37 UTC