in reply to strip JS-comments

Update: Doh! I misread the question as asking how to remove JavaScript, not JavaScript comments. I'm leaving the code anyway; it might be useful to someone.

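Aristotle helped me with a similar project a while back in another forum. Here's the result. It uses HTML::TokeParser::Simple to walk the tokens of every *.html file in the current directory and stops copying them to the output between a <script> start tag and its matching end tag. The stripped copies go into a stripped/ subdirectory.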

#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

use constant SKIP => 0;
use constant COPY => 1;

my $new_folder = 'stripped/';
mkdir $new_folder unless -d $new_folder;
die "Could not create dir $new_folder: $!" unless -d $new_folder;

foreach my $doc ( glob("*.html") ) {
    print "Converting $doc\n";
    my $new_file = "$new_folder$doc";

    # Reset per file so an unclosed <script> cannot leak into the next file
    my $state = COPY;

    my $p = HTML::TokeParser::Simple->new( $doc )
        or die "Cannot open $doc for reading: $!";
    open my $out, '>', $new_file
        or die "Cannot open $new_file for writing: $!";

    while ( my $token = $p->get_token ) {
        if ( $token->is_start_tag('script') ) {
            $state = SKIP;    # stop copying at <script>
        }
        elsif ( $token->is_end_tag('script') ) {
            $state = COPY;    # start copying again after </script>
        }
        elsif ( $state == COPY ) {
            print {$out} $token->as_is;
        }
        # anything seen while $state == SKIP is dropped
    }

    close $out or die "Cannot close $new_file: $!";
}
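If you want the same trick on a string instead of a directory of files, here is a minimal sketch. The sub name strip_scripts is my own invention for illustration, not something the module provides, and I'm assuming it is fine to hand the parser a reference to the string (the parent HTML::TokeParser constructor accepts a scalar ref as the document).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# strip_scripts() is a hypothetical helper name, not part of the module.
# It returns the HTML with every <script>...</script> element removed.
sub strip_scripts {
    my ($html) = @_;

    # A scalar reference tells the parser to read from the string itself
    my $p = HTML::TokeParser::Simple->new( \$html )
        or die "Could not create parser for HTML string";

    my $skipping = 0;
    my $out      = '';
    while ( my $token = $p->get_token ) {
        if ( $token->is_start_tag('script') ) { $skipping = 1; next; }
        if ( $token->is_end_tag('script') )   { $skipping = 0; next; }
        $out .= $token->as_is unless $skipping;
    }
    return $out;
}

print strip_scripts('<p>before</p><script>alert("hi");</script><p>after</p>'), "\n";
```

The logic is the same two-state switch as in the script above; only the input and output have changed from files to strings.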

--
Allolex