Re: really non greedy match
by ikegami (Patriarch) on Apr 24, 2010 at 21:18 UTC
/START(?:(?!START|END).)*END/
/START(?:(?!START).)*?END/
I wonder if the following unrolled version is faster:
/
START
(?>
[^SE]*
(?:S+(?!TART))?
(?:E+(?!ND))?
)*
END
/x
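One way to check the speed question is the core Benchmark module. The sketch below is only an illustration: the sample string is made up, and timings depend on the input and the Perl version, so it makes no claim about which pattern wins.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input, not taken from the thread.
my $text = ("filler START junk " x 50) . "START wanted END tail";

# The tempered-dot pattern from the top of this post.
my $tempered = qr/START(?:(?!START|END).)*END/;

# The unrolled pattern from this post, unchanged.
my $unrolled = qr/
    START
    (?>
        [^SE]*
        (?:S+(?!TART))?
        (?:E+(?!ND))?
    )*
    END
/x;

# Sanity check: both patterns should pick the same substring.
my ($t) = $text =~ /($tempered)/;
my ($u) = $text =~ /($unrolled)/;
print "tempered: $t\nunrolled: $u\n";

# Run each pattern for about one CPU second and compare the rates.
cmpthese(-1, {
    tempered => sub { $text =~ $tempered },
    unrolled => sub { $text =~ $unrolled },
});
```

Before trusting the numbers, it would be worth repeating the run with input that resembles the real data.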
Ah, thanks. I was on the right track, but I was missing the "any character" right after the innermost group and wasn't getting a match at all. Why is that dot necessary?
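A likely explanation: `(?!START|END)` is a zero-width assertion, so on its own it consumes no characters. Repeating a group that contains only the lookahead can match nothing but the empty string, which means END would have to follow START immediately. The dot is what actually advances through the text, one character at a time, after the lookahead has confirmed that neither START nor END begins at that position. A minimal sketch, reusing the sample string from elsewhere in this thread:

```perl
use strict;
use warnings;

my $text = "START text I don't want START only text I want END";

# With the dot: each repetition checks the lookahead, then consumes one character.
my ($with_dot) = $text =~ /(START(?:(?!START|END).)*END)/;

# Without the dot: the group is zero-width, so only "STARTEND" could match.
my $without_dot;
{
    no warnings 'regexp';   # Perl may warn about quantifying a zero-length group
    ($without_dot) = $text =~ /(START(?:(?!START|END))*END)/;
}

print "with dot:    $with_dot\n";
print "without dot: ", (defined $without_dot ? $without_dot : "no match"), "\n";
```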
Re: really non greedy match
by CountZero (Bishop) on Apr 25, 2010 at 07:17 UTC
Please, only put <code> ... </code> tags around code and put your plain text in between <p> ... </p> tags.
CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Around code or other computer text. It's also fine for data, program output, etc.
Re: really non greedy match
by jwkrahn (Abbot) on Apr 25, 2010 at 02:35 UTC
$ perl -le'
my $text = "START text I don\047t want START only text I want END";
print scalar reverse reverse( $text ) =~ /(DNE.*?TRATS)/;
'
START only text I want END
That only moves the problem: the failing case changes from START...START...END to START...END...END.
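A small sketch of that failing case, using a made-up input with one START followed by two ENDs. In the reversed string the non-greedy match begins at the first DNE it finds, which is the last END of the original text:

```perl
use strict;
use warnings;

# Hypothetical input: one START, two ENDs.
my $text = "START a END b END";

my $reversed = reverse $text;                 # scalar context: reverses the string
my ($hit)    = $reversed =~ /(DNE.*?TRATS)/;  # same pattern as in the parent post
my $found    = reverse $hit;                  # flip the match back

print "$found\n";   # prints: START a END b END
```

The result runs through to the second END rather than stopping at the first one.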
Re: really non greedy match
by Ratazong (Monsignor) on Apr 24, 2010 at 21:46 UTC
Thanks, but that wouldn't work for me. I want to use the expression as a delimiter in split.
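Bad use of split. The first argument of split is a separator: it describes what lies between the pieces you want, not the pieces themselves. You want a match with //g in list context: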
my @parts = $text =~ /START((?:(?!START|END).)*)END/sg;
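To illustrate, here is a short sketch with a made-up input that contains a stray START. In list context, a //g match returns the captures from every match, so no split is needed:

```perl
use strict;
use warnings;

# Hypothetical input with a stray START inside the second block.
my $text = "x START a END y START b START c END z";

# In list context, //g returns every capture from every match.
my @parts = $text =~ /START((?:(?!START|END).)*)END/sg;

print "[$_]\n" for @parts;
# prints:
# [ a ]
# [ c ]
```

The captures keep their surrounding spaces because the pattern captures everything between START and END.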
Re: really non greedy match
by Marshall (Canon) on Apr 27, 2010 at 22:46 UTC
A "greedy" match will indeed get greedy, but it will always allow the last part of the pattern to match if that is possible. Adding a .* at the beginning of the pattern allows that .* to "gobble up" the first START while allowing for the last START to match up with some characters followed by END.
#!/usr/bin/perl -w
use strict;
my $text = "some text START text I don't want START only text I want END";
my $wanted = ($text =~ /.*START (.*) END$/)[0];
my $wanted2 = ($text =~ /.*(START .* END)$/)[0];
print "wanted=\"$wanted\"\n";
print "wanted2=\"$wanted2\"\n";
__END__
prints:
wanted="only text I want"
wanted2="START only text I want END"
Update:
This line: my $wanted = ($text =~ /.*START (.*) END$/)[0];
may look a bit strange, but it is how to assign $1 to $wanted without using $1 as an intermediate variable. The match is evaluated in list context, where it returns the captured groups, and the [0] slice picks the contents of the first capturing paren. $2 can be fetched the same way:
my ($x,$y) = ($text =~ /.*(START (.*) END)$/)[0,1];
print "x=$x y=$y\n";
#prints: x=START only text I want END y=only text I want
I like this syntax as it "gets to the point" without $1,$2,$3, etc.
my $wanted = ($text =~ /.*START (.*) END$/)[0];
my $wanted2 = ($text =~ /.*(START .* END)$/)[0];
my ($x,$y) = ($text =~ /.*(START (.*) END)$/)[0,1];
can be written without the slices, because a list assignment already puts the match in list context:
my ($wanted) = $text =~ /.*START (.*) END$/;
my ($wanted2) = $text =~ /.*(START .* END)$/;
my ($x,$y) = $text =~ /.*(START (.*) END)$/;
thank you Marshall. I didn't see your post until just now.