dyno has asked for the wisdom of the Perl Monks concerning the following question:

This is my snippet:

    $_ = "X1aaX2bbX3heeX4...X5loveU";
    @match = /X\d.+?/g;
    print "String: " . $_;
    foreach $word (@match) {
        print "\n$word";
    }

I got the result:

    X1aaX2bbX3heeX4...X5loveU
    X1a
    X2b
    X3h
    X4.
    X5l

I want the result to be:

    X1aaX2bbX3heeX4...X5loveU
    X1aa
    X2bb
    X3hee
    X4...
    X5loveU

How can I fix it?

Re: use regex to split sentence
by jmcnamara (Monsignor) on Nov 01, 2002 at 09:37 UTC

    The following regex should do what you require. It uses a zero-width positive look-ahead, as explained in perlre:

        @match = /X\d.+?(?=X\d|$)/g;
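
Note for onlookers: a minimal sketch of why the look-ahead matters, using the string from the question. The variable names `@short` and `@full` are only illustrative. A lazy `.+?` with nothing after it is satisfied by a single character, whereas the look-ahead forces it to keep going until the next `X<digit>` marker (or the end of the string) without consuming that marker.

```perl
use strict;
use warnings;

my $str = "X1aaX2bbX3heeX4...X5loveU";

# Lazy quantifier with nothing after it: each match stops after one character.
my @short = $str =~ /X\d.+?/g;

# Same quantifier, but the look-ahead makes it extend up to the next
# "X<digit>" marker or the end of the string; the marker itself is not consumed.
my @full = $str =~ /X\d.+?(?=X\d|$)/g;

print join(" ", @short), "\n";   # X1a X2b X3h X4. X5l
print join(" ", @full),  "\n";   # X1aa X2bb X3hee X4... X5loveU
```

Because the look-ahead is zero-width, each following match can start exactly at the marker that ended the previous one.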

    Also, following on from jryan's idea, here is a version that uses split:

        $_ = "X1aaX2bbX3heeX4...X5loveUX6XXXX7X8";
        my @match = split /(X\d)/;
        print "String: " . $_;
        foreach my $word (@match) {
            print $word;
            print "\n" unless $word =~ /X\d/;
        }

    --
    John.

      Or, improving on both of ours, just:

      @match = split (/(?=X\d)/,$_);

      For some silly reason last night, I thought that split "kept" the part that it split on, forgetting that the trick to do that was to use a lookahead. :)
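
Note for onlookers: a small sketch comparing the three split variants that have come up in this thread, run on the string from the question. The array names are only illustrative. It shows what happens to the `X<digit>` delimiter in each case.

```perl
use strict;
use warnings;

my $str = "X1aaX2bbX3heeX4...X5loveU";

# Plain pattern: the delimiter is consumed and thrown away.
my @plain = split /X\d/, $str;

# Capturing group: the delimiter is returned as a separate field.
my @captured = split /(X\d)/, $str;

# Look-ahead: the split point is zero-width, so the delimiter
# stays attached to the field that follows it.
my @kept = split /(?=X\d)/, $str;

print join("|", @plain),    "\n";   # |aa|bb|hee|...|loveU
print join("|", @captured), "\n";   # |X1|aa|X2|bb|X3|hee|X4|...|X5|loveU
print join("|", @kept),     "\n";   # X1aa|X2bb|X3hee|X4...|X5loveU
```

The first two variants produce a leading empty field because the string starts with a delimiter. The look-ahead variant does not, since split never produces an empty field for a zero-length match at the start of the string.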

      In Perl6:

      ### <Perl6>
      @match = m:e/ X \d : .+? <before X \d | $ > /;
      ### </Perl6>

      I do think they should give shortened versions of the <before > and <after > assertions. Oh well, maybe they'll throw it in.

      (Note for onlookers: The colon within the regex is thrown in for a small bit of engine optimization, and actually also helps in detecting malformed strings.)

      kelan


      Perl6 Grammar Student

Re: use regex to split sentence
by jryan (Vicar) on Nov 01, 2002 at 07:03 UTC

    It's much easier to use split:

        $_ = "X1aaX2bbX3heeX4...X5loveU";
        @match = split (/X\d/,$_);

      That's not what I want. It gives:
      aa bb hee ... loveU
      In fact, I really want to know what pattern can match the result I expected, shown below:
      X1aa X2bb X3hee X4... X5loveU

        Try this:

        @match = /X\d[^X]+/g;

        Basically, it matches 'X\d' and then slurps up everything after it that's not an 'X'.
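
Note for onlookers: a short sketch of where the `[^X]+` pattern and the look-ahead pattern from elsewhere in this thread differ. The second input string, `$stray`, is a made-up example (not from the question) with an `X` that is not followed by a digit.

```perl
use strict;
use warnings;

my $clean = "X1aaX2bbX3heeX4...X5loveU";   # string from the question
my $stray = "X1aXaX2bb";                   # hypothetical: stray 'X' inside a field

# On the original string the negated character class works fine.
my @ok = $clean =~ /X\d[^X]+/g;

# With a stray 'X', [^X]+ stops early and the rest of the field is lost.
my @cut = $stray =~ /X\d[^X]+/g;

# The look-ahead version only stops at an 'X' followed by a digit.
my @whole = $stray =~ /X\d.+?(?=X\d|$)/g;

print join(" ", @ok),    "\n";   # X1aa X2bb X3hee X4... X5loveU
print join(" ", @cut),   "\n";   # X1a X2bb
print join(" ", @whole), "\n";   # X1aXa X2bb
```

So `[^X]+` is fine as long as the fields never contain an `X`; if they might, the look-ahead form is the safer choice.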