Re^3: Extracting HTML content between the h tags

Thank you very much!
I just tried both approaches, and they work even if the last h2 tag is missing (which happens in about 10 of more than 400 pages). For those pages I had been using the following workaround:

my @solution_2 = $content->findvalues( './h2[4]/preceding-sibling::*' );
unless ( @solution_2 )
{
   @solution_2 = $content->findvalues( '//hr/preceding-sibling::*' );
}

... with substr as before ...
Fortunately, those pages have only one hr tag :-)
With your approach, the workaround is no longer necessary.
BTW, the content after the fourth h2 is not important.
Thanks again!
