in reply to Parsing ASP files
Yup, I'm pretty sure you'll have to preprocess the ASP blocks before handing the file to an HTML parser.
You can glean some regexes from http://search.cpan.org/~johnd/ASP4-1.073/
http://cpansearch.perl.org/src/JOHND/ASP4-1.073/lib/ASP4/PageParser.pm
#!/usr/bin/perl --
use strict;
use warnings;

my $asp = <<'__HTML__';
<input value="abc" /> abc abc
<% abc ( '<input value="abc" /> abc abc <span> %>', $abc ); %>
__HTML__

use HTML::Tree;
use HTML::Entities;    # encode_entities / decode_entities are called explicitly below

# First pass: parse the raw ASP; markup inside the <% ... %> block
# ends up in the tree as real elements
my $t = 'HTML::TreeBuilder'->new;
$t->ignore_unknown(0);
$t->parse( $asp );
$t->dump;
print "\n\n";

# Wrap each <% ... %> block in a placeholder tag whose attribute
# holds the entity-encoded block content
$asp =~ s{
    <%
    ((?:
        '[^']+'
      | "[^"]+"
      | \$?[\w\s\(\);,]+
      | [^%]+
    )+)
    %>
}{
    my $content = $1;
    HTML::Entities::encode_entities( $content );
    qq'<yoASP content="$content">'
}gsex;

# Second pass: the same tree is reused, so this dump also contains
# the nodes from the first parse
$t->parse( $asp );
$t->dump;
print "\n\n";

#~ and reverse it (note: the <% %> delimiters are not put back here)
$asp =~ s/<yoASP content="([^"]+)">/
    my $content = $1;
    HTML::Entities::decode_entities( $content );
    $content;
/gsei;
print "$asp\n\n";
__END__
<html> @0 (IMPLICIT)
  <head> @0.0 (IMPLICIT)
  <body> @0.1 (IMPLICIT)
    <input value="abc" /> @0.1.0
    " abc á abc <% abc ( '"
    <input value="abc" /> @0.1.2
    " abc á abc "
    <span> @0.1.4
      " %>', $abc ); %>"


<html> @0 (IMPLICIT)
  <head> @0.0 (IMPLICIT)
  <body> @0.1 (IMPLICIT)
    <input value="abc" /> @0.1.0
    " abc á abc <% abc ( '"
    <input value="abc" /> @0.1.2
    " abc á abc "
    <span> @0.1.4
      " %>', $abc ); %> "
      <input value="abc" /> @0.1.4.1
      " abc á abc "
      <yoasp content=" abc ( '<input value="abc" /> +; abc &nbsp; abc <span> %>', $abc ); "> @0.1.4.3


<input value="abc" /> abc abc
 abc ( '<input value="abc" /> abc abc <span> %>', $abc );
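For what it's worth, the same masking idea could be wrapped into two small helpers so the round trip is lossless. This is only a sketch under assumptions: `mask_asp` and `unmask_asp` are hypothetical names (not from ASP4 or any module), the `yoASP` placeholder tag is the one used above, and the pattern assumes quotes inside a block are balanced and contain no backslash-escaped quotes. Unlike the reversal above, it puts the `<%` and `%>` delimiters back.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# Hypothetical helper: replace every <% ... %> block with a placeholder
# tag whose attribute holds the entity-encoded block body.
sub mask_asp {
    my ($src) = @_;
    $src =~ s{
        <%
        (                     # capture 1: the block body
          (?: '[^']*'         # single-quoted string, may contain a closing marker
            | "[^"]*"         # double-quoted string, may contain a closing marker
            | [^'"%]+         # text without quotes or percent signs
            | %(?!>)          # a percent sign that does not end the block
          )*
        )
        %>
    }{
        '<yoASP content="' . encode_entities($1) . '">'
    }gsex;
    return $src;
}

# Hypothetical helper: turn each placeholder back into the original block,
# including the <% and %> delimiters.
sub unmask_asp {
    my ($src) = @_;
    $src =~ s{<yoASP content="([^"]*)">}{ '<%' . decode_entities("$1") . '%>' }ge;
    return $src;
}

my $asp = <<'__HTML__';
<p>before</p>
<% abc ( '<span> %>', $abc ); %>
<p>after</p>
__HTML__

my $masked   = mask_asp($asp);
my $restored = unmask_asp($masked);

print $masked, "\n";
print $restored eq $asp ? "round trip ok\n" : "round trip FAILED\n";
```

The masked text is what you would feed to HTML::TreeBuilder. If the tree is serialized again, the placeholder may come back lower-cased or with different quoting, so the pattern in `unmask_asp` would need to be loosened (for example with `/i`) in that case.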
Re^2: Parsing ASP files
by thewebsi (Scribe) on Jan 26, 2012 at 04:57 UTC
by thewebsi (Scribe) on Apr 05, 2012 at 03:47 UTC