in reply to How do I read the contents of an HTML file between two BODY tags?

Well, I would use a construct I came across recently.
For example:

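# Assumes OUTFILE has already been opened for writing elsewhere.
open(HTML, "<", "test.html") || die "can't open test.html: $!\n";
while (<HTML>) {
    # True from the line with <body ...> through the line with </body>
    if (/<body.*?>/i ... /<\/body.*?>/i) {
        print OUTFILE $_;
    }
}
close HTML;

The if (/regexp1/ ... /regexp2/) test is Perl's range operator in scalar context: it becomes true on the line where the first regexp matches and stays true through the line where the second one matches, so it works across lines. It selects whole lines, so anything else on the same line as either tag gets printed too.
The .*? just matches any attributes inside the body tag, e.g. bgcolor etc.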
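
To see the range test on its own, here is a minimal sketch that runs it over a few in-memory lines instead of a file. The helper name body_lines and the sample lines are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines selected by the range test.
sub body_lines {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # True from the line matching <body ...> through the line matching </body>.
        push @kept, $_ if /<body.*?>/i ... /<\/body.*?>/i;
    }
    return @kept;
}

# Made-up sample document, one element per line.
my @sample = (
    "<html>\n",
    "<body bgcolor=\"white\">\n",
    "<p>Hello</p>\n",
    "</body>\n",
    "</html>\n",
);

# Prints the three lines from the opening body tag through the closing one.
print body_lines(@sample);
```

The html lines on either side are dropped, and both body tag lines are kept because the range includes its end points.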

sweet
Diarmiuid

Re^2: How do I read the contents of an HTML file between two BODY tags?
by FloydATC (Deacon) on Jun 30, 2007 at 12:06 UTC
    I realize this thread is long dead, but I just stumbled across it and found it useful. However, that construct would include the body tags in the output. To exclude them, add another expression:
    # Assumes OUTFILE has already been opened for writing elsewhere.
    open(HTML, "<", "test.html") || die "can't open test.html: $!\n";
    while (<HTML>) {
        if (/<body.*?>/i ... /<\/body.*?>/i) {
            s/<\/?body.*?>//ig;    # Strip body tags from $_
            print OUTFILE $_;
        }
    }
    close HTML;
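
    A small self-contained sketch of the same idea, run over in-memory lines so the result is easy to check. The helper name body_content and the sample lines are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: lines between the body tags, with the tags themselves removed.
sub body_content {
    my @lines = @_;    # copy, so the caller's data is not modified
    my @kept;
    for (@lines) {
        if (/<body.*?>/i ... /<\/body.*?>/i) {
            s/<\/?body.*?>//ig;    # strip opening and closing body tags
            push @kept, $_;
        }
    }
    return @kept;
}

# Made-up sample document, one element per line.
my @sample = (
    "<html>\n",
    "<body bgcolor=\"white\">\n",
    "<p>Hello</p>\n",
    "</body>\n",
    "</html>\n",
);

print body_content(@sample);
```

    Note that a line holding nothing but a body tag is still kept, as an empty line, because only the tag is removed and the newline stays. Skipping lines that are empty after the substitution would be one way to drop them.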