(rather than just take what's given to me, i'll also write one off the top of my head. please critique it or write a better one)
anyways ... that's it. off the top of my head, i think some criticism would help my coding skills, and this utility (if it works) would help this guy i know (actually, it would also help me, cause i want to be able to download a copy of his FAQ on one page ;)

    # script requires an index file (pagelist.txt) listing each
    # html filename to be concatenated, one filename per line
    open(EACH_HTML_FILENAME, '<', 'pagelist.txt') || die "can't read pagelist.txt: $!";
    open(UNIFIED_FILE, '>>', 'output.html') || die "can't append to output.html: $!";

    # i could've used 'getopt' or @ARGV but this
    # is being written with brevity in mind ...
    # another thing i'd like to try to learn is
    # automatically getting a directory list, parsing out
    # each filename with the extensions in @ARGV and
    # concatenating them

    $first_pass = 1;
    while ($thisfile = <EACH_HTML_FILENAME>) {
        chomp($thisfile);    # chomp returns a count, so don't assign its result
        open(THIS_FILE, '<', $thisfile) || die "can't read $thisfile: $!";
        if ($first_pass) {
            # first file: keep everything up to (not including) </body>
            while (<THIS_FILE>) {
                last if m{</body>}i;    # lines end in "\n", so match instead of eq
                print {UNIFIED_FILE} $_;
            }
            $first_pass = 0;
        }
        else {
            # later files: keep only the lines between <body> and </body>
            # (assumes the body tags sit on lines of their own)
            $body_start = 0;
            while (<THIS_FILE>) {
                if (!$body_start) {
                    $body_start = 1 if /<body/i;
                }
                else {
                    last if m{</body>}i;
                    print {UNIFIED_FILE} $_;
                }
            } # END -> while (<THIS_FILE>)
        }
        close(THIS_FILE);
    } # END -> while ($thisfile = <EACH_HTML_FILENAME>)
    print {UNIFIED_FILE} "</body>\n";
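The comment in the script mentions wanting to build the file list from a directory listing and a set of extensions in @ARGV instead of an index file. A minimal sketch of that idea, assuming the pages live in the current directory and that name order is acceptable (the helper name files_with_extensions and the html/htm defaults are made up for illustration, not part of the original script):

```perl
use strict;
use warnings;

# Hypothetical helper: return the plain files in $dir whose names end in
# one of the given extensions, sorted by name.
sub files_with_extensions {
    my ($dir, @extensions) = @_;
    opendir(my $dh, $dir) or die "can't open directory $dir: $!";
    my @names = sort readdir($dh);
    closedir($dh);

    my @matches;
    for my $name (@names) {
        next unless -f "$dir/$name";        # skips ., .. and subdirectories
        for my $ext (@extensions) {
            (my $bare = $ext) =~ s/^\.//;   # accept "html" as well as ".html"
            if ($name =~ /\.\Q$bare\E$/i) {
                push @matches, "$dir/$name";
                last;
            }
        }
    }
    return @matches;
}

# Extensions come from the command line; the defaults are an assumption.
my @extensions = @ARGV ? @ARGV : ('html', 'htm');
print "$_\n" for files_with_extensions('.', @extensions);
```

Redirecting this output to pagelist.txt would feed the script above. Two caveats: sorting by name may not match the order the FAQ pages are meant to be read in, and output.html itself will be picked up if it sits in the same directory, so it probably wants to be written somewhere else or filtered out.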
In reply to html parse - concatenation by Buckaroo Buddha