Buckaroo Buddha has asked for the wisdom of the Perl Monks concerning the following question:
(Rather than just take what's given to me, I'll also write one off the top of my head. Please critique it or write a better one.)
Anyway, here it is, off the top of my head. I think some criticism would help my coding skills, and this utility (if it works) would help a guy I know. (Actually, it would also help me, because I want to be able to download a copy of his FAQ on one page ;)

# script requires an index file (pagelist.txt) listing each
# html filename to be concatenated, one filename per line
open(EACH_HTML_FILENAME, 'pagelist.txt') || die "can't open pagelist.txt: $!";
open(UNIFIED_FILE, '>>output.html') || die "can't open output.html: $!";

# i could've used 'getopt' or @ARGV but this
# is being written with brevity in mind ...
# another thing i'd like to try to learn is
# automatically getting a directory list, parsing out
# each filename with the extensions in @ARGV and
# concatenating them

$first_pass = 1;
while (<EACH_HTML_FILENAME>) {
    chomp($_);    # strips the newline in place (chomp returns a count, not the string)
    $thisfile = $_;
    open(THIS_FILE, $thisfile) || die "can't open $thisfile: $!";
    if ($first_pass) {
        # first file: keep everything up to (not including) </body>
        while (<THIS_FILE>) {
            if ($_ !~ m{</body>}i) {
                print {UNIFIED_FILE} $_;
            } else {
                $first_pass = 0;
                last;    # exit this while loop
            }
        }
    } else {
        # later files: keep only the lines between <body> and </body>
        $body_start = 0;
        while (<THIS_FILE>) {
            if (!$body_start) {
                if ($_ =~ m{<body}i) {
                    $body_start = 1;
                }
            } else {
                if ($_ !~ m{</body>}i) {
                    print {UNIFIED_FILE} $_;
                } else {
                    last;    # exit this while loop
                }
            } # END -> if (!$body_start)
        } # END -> while (<THIS_FILE>)
    } # END -> if ($first_pass)
    close(THIS_FILE);
} # END -> while (<EACH_HTML_FILENAME>)
print {UNIFIED_FILE} "</body>";
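The post also mentions wanting to build the file list automatically from a directory instead of from pagelist.txt. A minimal sketch of that idea follows, assuming the extensions arrive in @ARGV and that the current directory is the one to scan. The helper name list_files_by_ext is hypothetical, not something from the original script, and the fallback to html and htm is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): return the sorted
# names of plain files in $dir whose extension is one of @exts.
sub list_files_by_ext {
    my ($dir, @exts) = @_;
    return () unless @exts;    # no extensions given, so nothing can match

    opendir(my $dh, $dir) or die "can't open directory $dir: $!";
    my @names = readdir($dh);
    closedir($dh);

    # Build one alternation such as (?:html|htm), quoting each extension
    # so that characters like '.' or '+' are matched literally.
    my $alternation = join '|', map { quotemeta($_) } @exts;
    my $pattern     = qr/\.(?:$alternation)\z/i;

    # Keep names with a matching extension that are plain files
    # (this skips '.', '..' and any subdirectories).
    my @matches = grep { /$pattern/ && -f "$dir/$_" } @names;
    return sort @matches;
}

# Assumed usage: extensions come from @ARGV, falling back to html and htm.
my @wanted = @ARGV ? @ARGV : ('html', 'htm');
print "$_\n" for list_files_by_ext('.', @wanted);
```

The printed names could be redirected into pagelist.txt, or the returned list could replace the read loop over EACH_HTML_FILENAME. Note that the default sort is by character code, so uppercase names come before lowercase ones; a FAQ whose pages need a particular order would still want an explicit index file.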
Replies are listed 'Best First'.

Re: html parse - concatenation
by swiftone (Curate) on Jun 05, 2000 at 18:31 UTC

Re: html parse - concatenation
by Buckaroo Buddha (Scribe) on Jun 05, 2000 at 21:58 UTC
by swiftone (Curate) on Jun 05, 2000 at 22:14 UTC