jockel has asked for the wisdom of the Perl Monks concerning the following question:

Hi all monks

I've been scratching my head for an hour now and I give up.
I've got a text that I need to match with two regexps, but the first one matches both parts. Here's the code:
my $var = qq{onChange="[document.approveform.command.value='approve'; document.approveform.submit();]" onClick="[if (this.disabled) {alert('some text!');}]"};
my ($onchange,$onclick);
if ($var =~ /onChange/) {
    $var =~ /onChange\=\"\[(,*)\]\"/;
    $onchange = $1;
}
if ($var =~ /onClick/) {
    $var =~ /onClick\=\"\[(.*)\]\"/;
    $onclick = $1;
}

Results:
$onchange = the whole of $var except the leading >> onChange="[ << and the trailing >> ]" <<
$onclick = just the onClick part, just as it should be.

I understand that the onChange regexp is greedy and continues all the way to the end of the onClick part, but how do I stop it earlier?

Thanks in advance.

Regards

Replies are listed 'Best First'.
Re: Regular expression problem
by davido (Cardinal) on Jun 09, 2004 at 08:16 UTC
    There's a bug in your first regexp: You're using ',*' when I think you meant to use '.*'

    Another potential bug is the fact that you're relying on $1 without first checking to see if a match succeeded. Bad dog.

    Regarding your question, you might successfully get your regexp to match less by using the non-greedy modifier on the quantifier.

    $var =~ /onChange\=\"\[(.*?)\]\"/;

    The '.*?' construct says to match as little as possible of any number of characters until the first ]" is reached.
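
    To tie the two points above together, here is a minimal sketch (not the original poster's final code) that uses the non-greedy '.*?' and only assigns a capture when the match actually succeeded. The sample string is the one from the question with the line-wrap '+' markers removed, which is an assumption about what the real data looks like; the variable name $found is purely illustrative.

```perl
use strict;
use warnings;

# Sample string from the question; the "+" wrap markers are assumed
# to be display artifacts and have been removed.
my $var = qq{onChange="[document.approveform.command.value='approve'; document.approveform.submit();]" onClick="[if (this.disabled) {alert('some text!');}]"};

my ($onchange, $onclick);

# A match in list context returns the captures, or an empty list on
# failure, so each variable is set only when its pattern really matched.
# '.*?' stops at the first ]" instead of running on to the last one.
if ( my ($found) = $var =~ /onChange="\[(.*?)\]"/ ) {
    $onchange = $found;
}
if ( my ($found) = $var =~ /onClick="\[(.*?)\]"/ ) {
    $onclick = $found;
}

print "onChange: $onchange\n" if defined $onchange;
print "onClick:  $onclick\n"  if defined $onclick;
```

    Testing the match and capturing in one step avoids reading a stale $1 left over from an earlier successful match.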


    Dave

      Aargh... the ",*" was a typo, it's not actually in the code.

      ".*?" worked! Thanks! I thought I'd already tried that, but apparently not!

      Thanks to Dave and EdwardG!
      /jocke
Re: Regular expression problem
by EdwardG (Vicar) on Jun 09, 2004 at 08:15 UTC
    % perldoc -q greedy

    What does it mean that regexes are greedy? How can I get around it?

    Most people mean that greedy regexes match as much as they can. Technically speaking, it's actually the quantifiers ("?", "*", "+", "{}") that are greedy rather than the whole pattern; Perl prefers local greed and immediate gratification to overall greed. To get non-greedy versions of the same quantifiers, use ("??", "*?", "+?", "{}?").

    An example:

        $s1 = $s2 = "I am very very cold";
        $s1 =~ s/ve.*y //;      # I am cold
        $s2 =~ s/ve.*?y //;     # I am very cold

    Notice how the second substitution stopped matching as soon as it encountered "y ". The "*?" quantifier effectively tells the regular expression engine to find a match as quickly as possible and pass control on to whatever is next in line, like you would if you were playing hot potato.

     

Re: Regular expression problem
by saskaqueer (Friar) on Jun 09, 2004 at 08:22 UTC

    This is a case of "don't use regexes to parse HTML" :)

    #!perl -w
    use strict;
    use HTML::Parser;

    my $html = <<'END_HTML';
    <html>
    <body>
    <form action="/foo.pl" method="post">
    Username: <input type="text" name="user" onClick="alert('You clicked me!')" /><br />
    Password: <input type="password" name="passwd" onChange="alert('You changed me!')" />
    </form>
    </body>
    </html>
    END_HTML

    my $parser = HTML::Parser->new(
        start_h => [ \&_parser_starttag, 'tagname, attr' ]
    );
    $parser->parse( $html );

    sub _parser_starttag {
        my ($tag, $attr) = @_;
        if ( exists $attr->{'onchange'} ) {
            warn( "<$tag> - onChange: ", $attr->{'onchange'}, "\n" );
        }
        elsif ( exists $attr->{'onclick'} ) {
            warn( "<$tag> - onClick: ", $attr->{'onclick'}, "\n" );
        }
    }

      Thanks for your answer.

      But this time it wasn't about HTML tags.
      It was just my way of sending extra information
      to a sub that prints HTML code.

      /jocke

        Still, there has got to be a better way than what you are doing. Perhaps you should use the named-parameter-list style of passing parameters to subroutines. Perhaps something like this?

        print_html(
            onChange => qq{
                document.approveform.command.value='approve';
                document.approveform.submit();
            },
            onClick => qq{
                if (this.disabled) {
                    alert('some text!');
                }
            }
        );

        sub print_html {
            my %param = @_;
            print "onChange: $param{'onChange'}\n" if ( exists $param{'onChange'} );
            print "onClick: $param{'onClick'}\n" if ( exists $param{'onClick'} );
        }