aka_bk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm trying to extract all the text that lies between the headings CORPORATE PARTICIPANTS and CONFERENCE CALL PARTICIPANTS.
$transcript =~ /^CORPORATE PARTICIPANTS(.*)^CONFERENCE CALL PARTICIPANTS/i;
print $1;
What did I do wrong? Thanks in advance.

Replies are listed 'Best First'.
Re: extract all the text between two words
by ikegami (Patriarch) on Nov 13, 2009 at 16:55 UTC

    Without the /m modifier, "^" means start of the string. You're trying to match characters that exist before the start of the string. That's impossible. If you meant to match the start of a line, change the "^" to "\n" or use the /m modifier to change the meaning of "^".

    Keep in mind that "." doesn't match newlines by default. You might have to use the /s modifier to make "." match any character.

    You might also want to change ".*" to ".*?" in order to stop at the first "CONFERENCE CALL PARTICIPANTS". Right now, you're matching until the last "CONFERENCE CALL PARTICIPANTS".

    One solution, if I correctly guess what you are trying to do:

    if ($transcript =~ /^CORPORATE PARTICIPANTS(.*?)^CONFERENCE CALL PARTICIPANTS/msi) {
        print($1);
    }
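
The three points in this reply (line anchors, "." and newlines, greedy versus non-greedy) can be checked side by side. The following is only a minimal sketch: the sample transcript and the variable names $original_matches, $lazy and $greedy are made up for illustration, since the real input is not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample transcript; the poster's real data is not shown.
my $transcript = join "\n",
    'Q3 Earnings Call',
    'CORPORATE PARTICIPANTS',
    'Jane Doe - CEO',
    'John Roe - CFO',
    'CONFERENCE CALL PARTICIPANTS',
    'Ann Lee - Analyst',
    'CONFERENCE CALL PARTICIPANTS',
    '';

# Original pattern: without /m, ^ matches only at the start of the string,
# and without /s, . cannot cross a newline, so this fails on multi-line input.
my $original_matches =
    $transcript =~ /^CORPORATE PARTICIPANTS(.*)^CONFERENCE CALL PARTICIPANTS/i ? 1 : 0;

# /m lets ^ match at the start of each line, /s lets . match newlines,
# and .*? stops at the first closing heading.
my ($lazy) =
    $transcript =~ /^CORPORATE PARTICIPANTS(.*?)^CONFERENCE CALL PARTICIPANTS/msi;

# With greedy .* the capture runs on to the last closing heading.
my ($greedy) =
    $transcript =~ /^CORPORATE PARTICIPANTS(.*)^CONFERENCE CALL PARTICIPANTS/msi;

print "original matched: $original_matches\n";
print "non-greedy capture:$lazy";
print "greedy capture:$greedy";
```

Note that the capture begins with the newline that follows the first heading, so trimming leading and trailing whitespace afterwards may be worthwhile.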
Re: extract all the text between two words
by kennethk (Abbot) on Nov 13, 2009 at 16:56 UTC
    I note you have multiple ^ characters in your regular expression, which tells me that you expect to have newlines in your string. Therefore you need to use the m and s modifiers so that ^ will match the start of each line (as opposed to only the start of the string) and so that . can match newlines, respectively. See perlre.
      You guys are awesome fast. Thanks!
Re: extract all string between two words
by EvanCarroll (Chaplain) on Nov 13, 2009 at 16:54 UTC
    Your anchors (^) are messed up: without the /m modifier, ^ matches only at the beginning of the string. Also, you need to do the actual splitting after you do the extraction.
    use feature 'say';    # needed for say (Perl 5.10+)

    # If you just want the stuff in between
    $transcript =~ /CORPORATE PARTICIPANTS(.*)CONFERENCE CALL PARTICIPANTS/i;
    print $1;

    # $transcript =~ /CORPORATE PARTICIPANTS foo bar baz CONFERENCE CALL PARTICIPANTS/i;
    # If you want foo, bar, baz
    my @words = split /\W/, $1;
    say for @words;
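
One caveat about the splitting step: split on a single \W returns an empty field wherever two non-word characters sit next to each other, and another one if the text starts with a non-word character. A possible variation, sketched here with a made-up $between string standing in for $1, is to split on \W+ and filter out empty fields.

```perl
use strict;
use warnings;

# Hypothetical captured text standing in for $1.
my $between = " foo bar - baz \n";

# Splitting on a single non-word character keeps empty fields
# (a leading one, and one between each pair of adjacent separators).
my @raw = split /\W/, $between;

# Splitting on runs of non-word characters and dropping empty
# fields leaves just the words.
my @words = grep { length } split /\W+/, $between;

print scalar(@raw), " raw fields, ", scalar(@words), " words\n";
print "$_\n" for @words;
```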


    Evan Carroll
    The most respected person in the whole perl community.
    www.evancarroll.com