Re: HTML::Parser problem

Maybe I am oversimplifying this, but all those modules and all that code look like a bit of overkill if the HTML file mentioned is the only file. Something quick'n'dirty like this would do too (to start with, of course), right?

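use strict;
use warnings;

$\ = "\n";    # append a newline to every print
print "<Products>";

# Three-argument open with a lexical filehandle; die if the file is missing
open my $fh, '<', 'prodnr.html' or die "Cannot open prodnr.html: $!";
while (<$fh>) {
   # $1 = PDF link, $2 = seven-digit product number, $3 = product name
   if (/HREF="(.*pdf)".*(\d{7}) ([^<]+)/) {
     print "<Product>";
     print "\t<Name>$3</Name>";
     print "\t<PDF>$1</PDF>";
     print "\t<Number>$2</Number>";
     print "</Product>";
   }
}
close $fh;

print "</Products>";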
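
To make the three captures easier to check, here is a minimal sketch that runs the same pattern against a single line. The actual markup of prodnr.html is not shown in this thread, so the sample line is made up, and the helper name extract_product is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (pdf, number, name) for one line of HTML,
# or an empty list when the line does not match.
sub extract_product {
    my ($line) = @_;
    return $line =~ /HREF="(.*pdf)".*(\d{7}) ([^<]+)/ ? ($1, $2, $3) : ();
}

# Made-up sample line; the real layout of prodnr.html is an assumption.
my $sample = '<A HREF="docs/widget.pdf">1234567 Blue Widget</A>';

my ($pdf, $number, $name) = extract_product($sample);
print "Name=$name PDF=$pdf Number=$number\n";
```

If the real lines carry more than one link, or the number and name are not separated by a single space, the pattern would need adjusting; that is the price of quick'n'dirty.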

--
b10m

All code is usually tested, but rarely trusted.


Re: Re: HTML::Parser problem by Peamasii (Sexton) on Mar 29, 2004 at 09:05 UTC
You're not oversimplifying, since to me using a regex is hardly simpler than using HTML::Parser or its brethren ;-) Point well taken though!