russmann has asked for the wisdom of the Perl Monks concerning the following question:
The first thing I want is:
The text between the 3rd <p> and </p> tags (not the 1st or 2nd).
This is to extract the article title.
The 2nd thing I want is:
Everything in the page after this marker:
<b>Notes:</b>
This is to extract the notes.
The 3rd thing I want is:
The text between the 5th <p> and </p> tags, but only if the text begins with "by" (as in by Larry Wall).
This is to extract the author line.
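To make the three requirements concrete, here is a minimal regex-only sketch. It assumes the whole page is already in one string, that every <p> has a matching </p>, and that paragraphs are not nested; real pages often break those assumptions, in which case a proper parser such as HTML::TokeParser is likely the safer route. The names nth_paragraph and extract_fields are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside the Nth <p>...</p> pair
# (1-based), or undef if the page has fewer than N such pairs.
# Assumes well-formed, non-nested paragraphs.
sub nth_paragraph {
    my ($html, $n) = @_;
    # In list context, /g returns every captured paragraph body in order.
    my @paras = $html =~ m{<p\b[^>]*>(.*?)</p>}gis;
    return $n <= @paras ? $paras[$n - 1] : undef;
}

# Hypothetical wrapper: pull out title, notes and author as described above.
sub extract_fields {
    my ($html) = @_;
    my %out;

    # 1. Article title: text of the 3rd paragraph.
    $out{title} = nth_paragraph($html, 3);

    # 2. Notes: everything after the literal marker <b>Notes:</b>.
    #    Stays undef when the marker is absent.
    ($out{notes}) = $html =~ m{<b>Notes:</b>(.*)}is;

    # 3. Author line: text of the 5th paragraph, kept only if it starts
    #    with "by" (allowing leading whitespace is an assumption).
    my $fifth = nth_paragraph($html, 5);
    $out{author} = $fifth if defined $fifth && $fifth =~ /^\s*by\b/;

    return \%out;
}

# Usage with a made-up page:
my $page = join '', map { "<p>$_</p>\n" }
    'one', 'two', 'My Title', 'four', 'by Larry Wall';
$page .= "<b>Notes:</b> some notes";

my $f = extract_fields($page);
print "title:  $f->{title}\n";
print "author: $f->{author}\n" if exists $f->{author};
print "notes:  $f->{notes}\n"  if defined $f->{notes};
```

The author key is left out entirely when the 5th paragraph does not begin with "by", so callers can test for it with exists.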
Re: Very specific HTML parsing question
by ChemBoy (Priest) on Sep 07, 2001 at 20:41 UTC