in reply to Capturing text between literal string delimiters (was: Regular expressions)

use Data::Dumper;

my $string = "sp Hello there sp \n Hey hey sp How are you? sp";

my @result = $string =~ m<
    [ ]?    # optional space
    sp      # literal 'sp'
    [ ]?    # optional space
    (.*?)   # non-greedy capture
    [ ]?    # optional space
    sp      # literal 'sp'
    [ ]?    # optional space
>xg;

print Dumper(\@result);

__output__

$VAR1 = [
          'Hello there',
          'How are you?'
        ];

That code does the trick, although you may want to make it more generic if you're working with more complex strings. Also check out split() if it suits your data.
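One way the "more generic" version might look is sketched below. It assumes the delimiter is always a standalone word 'sp' and that delimiters come in balanced open/close pairs. The helper name between_sp is hypothetical, and the second test string is the harder one raised in the replies.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the text between
# each opening/closing pair of standalone 'sp' tokens.
sub between_sp {
    my ($string) = @_;
    return $string =~ m<
        \b sp \b   # 'sp' as a whole word, not part of 'spelling' or 'asp'
        \s*        # optional whitespace after the opening delimiter
        (.*?)      # non-greedy capture (does not cross a newline)
        \s*        # optional whitespace before the closing delimiter
        \b sp \b   # closing 'sp', again as a whole word
    >xg;
}

my @simple = between_sp("sp Hello there sp \n Hey hey sp How are you? sp");
my @tricky = between_sp("sp Hello there! spelling spoiler spooky asp wizard! sp \n Hey hey sp How are you? sp");

print join('|', @simple), "\n";   # Hello there|How are you?
print join('|', @tricky), "\n";   # Hello there! spelling spoiler spooky asp wizard!|How are you?
```

Since . does not match a newline here, a captured section cannot span lines; adding the /s modifier would be the thing to try if it needs to.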
HTH

_________
broquaint

Re: Re: Regular expressions
by aersoy (Scribe) on Jun 24, 2002 at 13:05 UTC

    Hello,

    I think your regular expression is unnecessarily complex. Besides, it chokes on a string like this: "sp Hello there! spelling spoiler spooky asp wizard! sp \n Hey hey sp How are you? sp".

    A simpler and better solution would be

    @result = split /\bsp\b/, $string;

    (where \b matches a word boundary)

    --
    Alper Ersoy

      I think your Regular Expression is unnecessarily complex.
      Indeed it is, but it does give the specified output in the root node.
      Besides it chokes on a string like this: "sp Hello there! spelling spoiler spooky asp wizard! sp \n Hey hey sp How are you? sp".
      Unfortunately so, which is why I recommended making it more generic (i.e. not relying on spaces being around 'sp').
      A simpler and better solution would be
      That would be nice, but unfortunately it gives this incorrect output:
      $VAR1 = [ '', ' Hello there ', ' Hey hey ', ' How are you? ' ];
      As outlined below, it's splitting the string on 'sp' as opposed to grabbing the text between each pair (as though the first 'sp' were a <sp> and the second a </sp>, and so on):
      0 1              2             3
      sp Hello there sp \n Hey hey sp How are you? sp
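The field numbering above suggests a way to keep split and still get only the paired text. This is a rough sketch, assuming the 'sp' delimiters are balanced, so that odd-numbered fields are the ones sitting between an opening and a closing 'sp'. The variable names are only illustrative.

```perl
use strict;
use warnings;

my $string = "sp Hello there sp \n Hey hey sp How are you? sp";

# split drops the trailing empty field but keeps the leading one,
# so field 0 is the empty string before the first 'sp'.
my @fields = split /\bsp\b/, $string;

# Odd-numbered fields sit between an opening and a closing 'sp'
# (assumption: the delimiters are balanced).
my @inside = @fields[ grep { $_ % 2 } 0 .. $#fields ];

print scalar(@fields), " fields, ", scalar(@inside), " between delimiters\n";
```

The selected fields still carry their surrounding whitespace, so they would need trimming before they match the output asked for in the root node.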

      _________
      broquaint

        That would be nice but unfortunately it gives this incorrect output

        You are right; I noticed that too, after posting. But there are many ways to trim leading and trailing whitespace from a string, e.g. map { s/^\s*|\s*$//g } @result; would do the trick.
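A small sketch of that trimming step, using a statement-modifier for loop in place of map in void context (both edit the array in place, because $_ is aliased to each element). The input list is assumed to be what the split above returns for the root node's string, and the grep that drops empty fields is an extra step, not part of the suggestion.

```perl
use strict;
use warnings;

# Assumed input: the fields split /\bsp\b/ returns for the example string.
my @result = ('', ' Hello there ', " \n Hey hey ", ' How are you? ');

# Each element is aliased to $_, so the substitution edits @result in place.
s/^\s+|\s+$//g for @result;

# Drop fields that are empty after trimming (here, the leading one).
my @trimmed = grep { length } @result;

print join('|', @trimmed), "\n";   # Hello there|Hey hey|How are you?
```

Note that 'Hey hey' is still in the list, so trimming alone does not reproduce the paired output.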

        --
        Alper Ersoy

Re: Re: Regular expressions
by kidd (Curate) on Jun 24, 2002 at 12:53 UTC
    Thanks for your reply, that works great...