Gangabass has asked for the wisdom of the Perl Monks concerning the following question:

Hi, Monks.

I have a task: parse a line containing an XSS vulnerability string and find the vulnerable file in it. Here are some examples:

my @vulnerabilities = qw(
    http://www.parismatch.com/recherche/recherche?motcle="%2F><marquee>xss+death-angel<iframe+src%3D"http%3A%2F%2Fwww.xssed.com"%2F>&x=5&y=6
    https://www.simplydomains.co.nz/register.php/ref="><script>alert(1);</script>
    https://www.simplydomains.co.nz/logon/ref="><script>alert(1);</script>
    https://www.eso.shell.com/eso/e_invoice_jsp/req_form_uk2.jsp?countrycode=18&language=%22%3E%3Cscript%3Ealert(%22daimon%22)%3C/script%3E
);

foreach my $vuln (@vulnerabilities) {
    my ($file) = $vuln =~ m{^/.*?([^/]+)[\?/]};
    print $file, "\n";
}

So the correct list for me is:

recherche
register.php
logon
req_form_uk2.jsp

The main problem for me is finding where the mod_rewrite-style part starts (where ? is replaced with /). How can I find it? Maybe I need several regexes?

Thanks.
Roman

Replies are listed 'Best First'.
Re: Need help with regex/strategy
by moritz (Cardinal) on Nov 16, 2009 at 09:00 UTC
    I'd try it like this:
    1. Split on the question mark, consider only the first chunk
    2. Split that first chunk on the slash
    3. If the last part of that contains a dot, take it. If not, take the second-to-last part.
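A minimal sketch of the three steps above, assuming a hypothetical helper name `file_before_query` (not from the thread). It follows the steps literally, so it only covers URLs whose payload starts at a question mark; the two sample URLs are shortened versions of the ones in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: a literal reading of the three steps above.
sub file_before_query {
    my ($url) = @_;

    # Step 1: keep only the chunk before the first question mark.
    my ($head) = split /\?/, $url, 2;
    return unless defined $head;

    # Step 2: split that chunk on slashes.
    my @parts = split m{/}, $head;
    return unless @parts;

    # Step 3: take the last part if it contains a dot,
    # otherwise the second-to-last part.
    return $parts[-1] =~ /\./ ? $parts[-1] : $parts[-2];
}

# Shortened sample URLs (query strings abbreviated for readability).
print file_before_query($_), "\n" for (
    'http://www.parismatch.com/recherche/recherche?motcle=x',
    'https://www.eso.shell.com/eso/e_invoice_jsp/req_form_uk2.jsp?countrycode=18',
);
```

For these two inputs the sketch prints `recherche` and `req_form_uk2.jsp`.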
Re: Need help with regex/strategy
by planetscape (Chancellor) on Nov 17, 2009 at 15:59 UTC
Re: Need help with regex/strategy
by Gangabass (Vicar) on Nov 16, 2009 at 09:38 UTC

    Hmmm... This does not work for this one (it must find the string logon):

    https://www.simplydomains.co.nz/logon/ref="><script>alert(1);</script>
      (assuming you meant to reply to my reply...)

      You are right; but that's easily fixed by generalizing the first step a bit: split on either a question mark or an equals sign, (split /[?=]/, $url, 2)[0]
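A sketch of how the generalized first step might be combined with the slash split and the dot check, assuming a hypothetical helper name `vulnerable_file` (not from the thread). It has only been checked against the four URLs from the question, so treat it as a starting point rather than a general URL parser.

```perl
use strict;
use warnings;

# Hypothetical helper: cut at the first "?" or "=", then pick the file name.
sub vulnerable_file {
    my ($url) = @_;

    # Generalized first step: everything before the first ? or =.
    my $head = (split /[?=]/, $url, 2)[0];
    return unless defined $head;

    # Split the remaining chunk on slashes.
    my @parts = split m{/}, $head;
    return unless @parts;

    # Last part if it contains a dot, otherwise the second-to-last part.
    return $parts[-1] =~ /\./ ? $parts[-1] : $parts[-2];
}

# The four URLs from the original question.
my @vulnerabilities = (
    'http://www.parismatch.com/recherche/recherche?motcle="%2F><marquee>xss+death-angel<iframe+src%3D"http%3A%2F%2Fwww.xssed.com"%2F>&x=5&y=6',
    'https://www.simplydomains.co.nz/register.php/ref="><script>alert(1);</script>',
    'https://www.simplydomains.co.nz/logon/ref="><script>alert(1);</script>',
    'https://www.eso.shell.com/eso/e_invoice_jsp/req_form_uk2.jsp?countrycode=18&language=%22%3E%3Cscript%3Ealert(%22daimon%22)%3C/script%3E',
);

print vulnerable_file($_), "\n" for @vulnerabilities;
```

For these inputs the sketch prints `recherche`, `register.php`, `logon` and `req_form_uk2.jsp`, matching the list in the question. In the two simplydomains URLs the chunk before the `=` ends in `ref`, which has no dot, so the second-to-last part is taken.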