in reply to Regex not greedy enough

Thanks for the comments;
  Snax - you're right, I was using the opposite switch to the one I meant, but I still have the same greed problems.
  Japhy - This is new to me, and it looks like the kind of solution I was after. However, the lookahead matches nothing when written as (?=^ {3}\w), and I can't match on \n instead, since then I skip my first record.

Any more ideas?
BTW: the 'record1' string is actually the first field of the record; it could be anything beginning with a \w

oh yeah, here's my test source:
#!/usr/bin/perl -w
use strict;

my ($infile, @records);
while (<DATA>) { $infile .= $_; }

@records = split(/(?=^ {3}\w)/, $infile);      #returns whole list
#@records = ($infile =~ m/(^ {3}\w.*?)/sg);    #returns only up to \w

print join("\n========\n", @records);

__DATA__
   record1
      field2 2345
   record2
   record3
      field1 GAGGA
      field2 7848
      field2a 5m

Replies are listed 'Best First'.
Re: Re: Regex not greedy enough
by japhy (Canon) on Nov 17, 2000 at 19:16 UTC
    It won't match (?=^ANYTHING) at any place but the very beginning of the string unless you have the /m modifier on in the regex, which allows ^ to match after newlines.
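    As a rough sketch of that point, assuming the layout described in the question (record names indented three spaces; the six-space field indent and the heredoc are my own stand-ins for the real file), the same lookahead with and without /m behaves like this:

```perl
use strict;
use warnings;

# Assumed sample data: record names indented three spaces,
# fields indented six (the field indent is a guess).
my $infile = <<'END';
   record1
      field2 2345
   record2
   record3
      field1 GAGGA
      field2 7848
      field2a 5m
END

# Without /m, ^ matches only at the very start of the string, so the
# lookahead never fires again and split returns a single element.
my @whole = split /(?=^ {3}\w)/, $infile;

# With /m, ^ also matches just after each newline, so split cuts
# in front of every record line.
my @records = split /(?=^ {3}\w)/m, $infile;

print scalar(@whole), " element without /m, ",
      scalar(@records), " with /m\n";
print join("========\n", @records);
```

    Because the lookahead consumes nothing, each element keeps its own indentation and trailing newline.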

    Ohhhhhh. I didn't think you meant ALL the text was indented, I thought you meant the 'field' parts were. Well then, to make it work with such data:
    my $code;
    { local $/; $code = <DATA> }   # fast "slurping"

    # split on a newline plus the three-space record indent
    my @records = split /\n   (?=\w)/, $code;

    for (@records) { print ">>$_<<\n"; }

    __DATA__
       japhy
          DALnet regular
          Regex Prince
       merlyn
          Perl Hacker
          O'Reilly Author
       Mark_Dominus
          IAQ Author
          ArrayHashMonster Creator


    japhy... Perl Hacker and Regex Prince
Re: Re: Regex not greedy enough
by snax (Hermit) on Nov 17, 2000 at 19:01 UTC
    Use japhy's suggestion and add a newline to your string:
    @records = split /(?=\nrecord)/, ("\n" . $data);
    ...that way you get the necessary first newline in the regex for the first record.

    Crude, but effective :)
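    A minimal sketch of the prepend-a-newline trick, assuming for illustration that record names start in column one and literally begin with "record" (the sample string is made up, not the poster's real data):

```perl
use strict;
use warnings;

# Hypothetical sample: record names at the start of a line, fields indented.
my $data = "record1\n  field2 2345\nrecord2\nrecord3\n  field1 GAGGA\n";

# Prepending "\n" gives the first record the same leading newline
# that every later record already has, so one lookahead finds them all.
my @records = split /(?=\nrecord)/, "\n" . $data;

# Each chunk still starts with the newline it was split in front of.
s/^\n// for @records;

print ">>$_<<\n" for @records;
```

    The pattern would need adjusting (for instance to allow leading spaces) if the record names are indented or do not share a common prefix.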

Re: Re: Regex not greedy enough
by Fastolfe (Vicar) on Nov 17, 2000 at 20:25 UTC
    If you want to capture to the end of the line: in /m mode, ^ and $ anchor at the beginning and end of each line. Without /m, ^ matches only at the beginning of the string and $ only at the end (or just before a final newline); /s does not change the anchors, it only lets . match a newline. So maybe you did want /m. Of course, you can also just do something like these:
    while (<DATA>) {
        my ($key, $value) = /(\S+) (.*)/ or next;
        # or: my ($key, $value) = split;
        # or: (undef, $key, $value) = split(/\s+/, $_, 3);
        $hash{$key} = $value;
        # or: push(@{$hash{$key}}, $value);
    }
    Untested, but you might get some ideas from that.
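    To see the modifiers side by side, here is a small sketch on a made-up three-line string (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $text = "one 1\ntwo 2\nthree 3\n";

# Default: ^ matches only at the very start of the string.
my @default = $text =~ /^(\w+)/g;

# /m: ^ also matches right after every newline.
my @multi = $text =~ /^(\w+)/mg;

# /s changes only what . matches: with it, .* runs across newlines.
my ($dot)   = $text =~ /^(.*)/;
my ($dot_s) = $text =~ /^(.*)/s;

print "default: @default\n";              # first word only
print "with /m: @multi\n";                # first word of every line
print "dot stops at: ", length($dot), " chars\n";
print "dot with /s:  ", length($dot_s), " chars\n";
```

    So /m is the modifier that affects where ^ and $ can match, and /s is the one to reach for when a single .* has to span several lines.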