Rich36 has asked for the wisdom of the Perl Monks concerning the following question:

I've got a situation where I'm getting a variable that contains a set of questions and answers. My strategy to extract the answer set involves matching the questions in the variable against a template of questions without the answers.
In practice, I match question #1 and question #2 and extract what's in the middle ($answer_to_question1 =~ /$question1(.*)$question2/).
The problem that I was having is that the match results were inconsistent. In the initial foreach loop (labeled TEST), everything would match OK. Then I would get inconsistent results with the for loop at the bottom.
Here is the code and the results for questions.pl
# Contents of questions.lst
1. What is your name?
2. Please list your address and home phone number.
3. What is your mother's maiden name (name before marriage)?
4. Name your four favorite bands or artists:

Contents of questions.pl - The for loop is where the problem occurs
use strict;
use Data::Dumper;

my $file = "questions.lst";
my $answer = <<EOT;
1. What is your name?
Rich
2. Please list your address and home phone number.
12345 Sesame Street
Footown, OH 45678
(123)555-12345
3. What is your mother's maiden name (name before marriage)?
Arnold
4. Name your four favorite bands or artists:
The Beatles
Black Crowes
Red Earth
Matthew Sweet
EOT

#######################################################
# MAIN
#######################################################
my @questions = &getQuestions($file);
print Dumper(@questions);

# Print Divider
print qq(----------\nBEGIN TEST\n----------\n);

# TEST
# Making sure $answers matches all the questions
foreach my $question(@questions) {
    if ($answer =~ m/$question/g) {
        print qq(TEST: \$answer matched $question\n);
    } else {
        print qq(TEST: \$answer did NOT match $question\n);
    }
}

# Print Divider
print qq(----------\nEND TEST\n----------\n);

# Loop trying to match current question and next question
my $i;
for($i = 0; $i < scalar(@questions); $i++) {
    my $j = $i + 1; # $j is the next element of the array
    print qq/Checking \$questions[$i] and \$questions[$j] \n/;
    print qq/Matching element $i\n/;
    if ($answer =~ m/\Q$questions[$i]\E/g) {
        print qq(Matched $questions[$i]\n);
    } else {
        print qq(Did NOT match $questions[$i]\n);
    }

    if ($questions[$j]) {
        if ($answer =~ m/\Q$questions[$j]\E/g) {
            print qq(Matched $questions[$j]\n);
        } else {
            print qq(Did NOT match $questions[$j]\n);
        }
    }

    print qq(-----------------------------\n); # Dividing line
}

sub getQuestions() {
    my $file = $_[0];
    my @questions;
    open(FILE, "<$file") || die "Could not open $file: $!\n";
    while(<FILE>) {
        s/^\s+//g;
        if (/^\d/) {
            chomp;
            push(@questions, $_);
        }
    }
    return @questions;
}

Results for questions.pl - Note the inconsistency of when it matches, then doesn't.
$VAR1 = '1. What is your name?';
$VAR2 = '2. Please list your address and home phone number.';
$VAR3 = '3. What is your mother\'s maiden name (name before marriage)?';
$VAR4 = '4. Name your four favorite bands or artists:';
----------
BEGIN TEST
----------
TEST: $answer matched 1. What is your name?
TEST: $answer matched 2. Please list your address and home phone number.
TEST: $answer matched 3. What is your mother's maiden name (name before marriage)?
TEST: $answer matched 4. Name your four favorite bands or artists:
----------
END TEST
----------
Checking $questions[0] and $questions[1]
Matching element 0
Did NOT match 1. What is your name?
Matched 2. Please list your address and home phone number.
-----------------------------
Checking $questions[1] and $questions[2]
Matching element 1
Did NOT match 2. Please list your address and home phone number.
Matched 3. What is your mother's maiden name (name before marriage)?
-----------------------------
Checking $questions[2] and $questions[3]
Matching element 2
Did NOT match 3. What is your mother's maiden name (name before marriage)?
Matched 4. Name your four favorite bands or artists:
-----------------------------
Checking $questions[3] and $questions[4]
Matching element 3
Did NOT match 4. Name your four favorite bands or artists:
-----------------------------


Why this doesn't match consistently is largely a mystery to me. However, I was able to get a working solution with the help of a co-worker. The major difference is that I copy the original value of $answer into a fresh variable on every iteration of the for loop (see my $answer2 = $answer;).
Here is the for loop in the version of questions.pl that works correctly. Everything else is the same.
# Loop trying to match current question and next question
my $i;
for($i = 0; $i < scalar(@questions); $i++) {
    my $answer2 = $answer; # !!!NEW!!!
    my $j = $i + 1; # $j is the next element of the array
    print qq/Checking \$questions[$i] and \$questions[$j] \n/;
    print qq/Matching element $i\n/;
    if ($answer2 =~ m/$questions[$i]/g) {
        print qq(Matched $questions[$i]\n);
    } else {
        print qq(Did NOT match $questions[$i]\n);
    }

    if ($questions[$j]) {
        if ($answer2 =~ m/\Q$questions[$j]\E/g) {
            print qq(Matched $questions[$j]\n);
        } else {
            print qq(Did NOT match $questions[$j]\n);
        }
    }

    print qq(-----------------------------\n); # Dividing line
}

This version yields the correct results.
$VAR1 = '1. What is your name?';
$VAR2 = '2. Please list your address and home phone number.';
$VAR3 = '3. What is your mother\'s maiden name (name before marriage)?';
$VAR4 = '4. Name your four favorite bands or artists:';
----------
BEGIN TEST
----------
TEST: $answer matched 1. What is your name?
TEST: $answer matched 2. Please list your address and home phone number.
TEST: $answer matched 3. What is your mother's maiden name (name before marriage)?
TEST: $answer matched 4. Name your four favorite bands or artists:
----------
END TEST
----------
Checking $questions[0] and $questions[1]
Matching element 0
Matched 1. What is your name?
Matched 2. Please list your address and home phone number.
-----------------------------
Checking $questions[1] and $questions[2]
Matching element 1
Matched 2. Please list your address and home phone number.
Matched 3. What is your mother's maiden name (name before marriage)?
-----------------------------
Checking $questions[2] and $questions[3]
Matching element 2
Matched 3. What is your mother's maiden name (name before marriage)?
Matched 4. Name your four favorite bands or artists:
-----------------------------
Checking $questions[3] and $questions[4]
Matching element 3
Matched 4. Name your four favorite bands or artists:
-----------------------------

The question remains... In the original code, why does the value of $answer seem to get corrupted or misread in the for loop when it works fine in the foreach loop? Why do I have to restate the original value of $answer for this to work?
Rich36
There's more than one way to screw it up...

(tye)Re: Matching inconsistency in for loop
by tye (Sage) on Dec 13, 2001 at 00:55 UTC

    You are using m//g in a scalar context which means that each match starts off where the previous match ended. That is, each match is only going to succeed if either the previous match failed or if the requested match happens later in the string than the previous match.

    This starting (and previous ending) position can be queried via pos($answer), and can even be set via pos($answer)= 0;

    You should probably just drop the "g"s.
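The position tracking described above can be reproduced in a few lines. This is only a minimal sketch: the sample string and the variable names ($text, @log, $pos_after_beta, and so on) are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample text; any string with two markers in order will do.
my $text = "alpha one beta two";
my @log;

# First /g match in scalar context: succeeds and leaves pos() just past "beta".
push @log, ($text =~ /beta/g ? "beta: matched" : "beta: failed");
my $pos_after_beta = pos($text);

# "alpha" sits before that position, so a search resuming at pos() fails.
# A failed /g match resets pos() to undef.
push @log, ($text =~ /alpha/g ? "alpha: matched" : "alpha: failed");
my $pos_after_fail = pos($text);

# The search now starts from the beginning again, so "alpha" is found.
push @log, ($text =~ /alpha/g ? "alpha: matched" : "alpha: failed");

# Without /g the match starts at the beginning of the string regardless of pos().
pos($text) = length $text;
my $plain_match = ($text =~ /alpha/) ? 1 : 0;

print "$_\n" for @log;
```

This is the same alternation seen in the for loop of the original post: the first test in each iteration fails because the previous match left the position further along, and that failure resets the position so the next test succeeds.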

    Also, you didn't always use \Q and \E which could also cause you problems (though it didn't appear to in this case).
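To see why the missing \Q and \E could matter, here is a small sketch using question 3 from the post, which contains ".", "(", ")" and "?". The text being searched and the variable names are invented for the example.

```perl
use strict;
use warnings;

my $question = "3. What is your mother's maiden name (name before marriage)?";

# Hypothetical text that does NOT contain the question verbatim:
# "3X" instead of "3." and no parenthetical at all.
my $text = "3X What is your mother's maiden name ";

# Unquoted: "." matches any character and "(...)?" is an optional group,
# so the pattern matches text that is not the literal question.
my $unquoted_matches = ($text =~ /$question/) ? 1 : 0;

# Quoted: \Q...\E escapes the metacharacters, so only the literal string matches.
my $quoted_matches = ($text =~ /\Q$question\E/) ? 1 : 0;

# With quoting, the parentheses in the question are not a capture group,
# so the first capture is the one written explicitly in the pattern.
my $full = "$question Arnold";
my ($literal_answer) = $full =~ /\Q$question\E\s*(\w+)/;

print "unquoted: $unquoted_matches, quoted: $quoted_matches, answer: $literal_answer\n";
```

In the thread's data the unquoted patterns happened to match the right text anyway, which is consistent with the remark that it did not cause a visible problem in this case.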

            - tye (but my friends call me "Tye")
      Thanks very much. That definitely helps me understand that. I've fixed the code so that I don't have to redefine the variable.
      The next thing I'm having difficulty with is figuring out how to match what's in between the questions.
      I'm trying to get
      $answer =~ /$questions[$i](.*)$questions[$j]/;
      print qq(The answer is $1);
      or some variation of that to work (/g, /gm, etc.), but I'm not having any luck. Would somebody be able to explain how I could get the answers out of the question set?
      Thanks, Rich
      Rich36
      There's more than one way to screw it up...

        if( $answer =~ /\Q$questions[$i]\E(.*?)\Q$questions[$j]\E/s ) {
            print qq(The answer is $1\n);
        }
        /m controls whether ^ and $ can match around newlines in the middle of the string. You want /s which controls whether . can match newlines.
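The difference between /m and /s can be checked with a short sketch. The markers "Q1" and "Q2" and the text between them are placeholders standing in for two questions and a multi-line answer.

```perl
use strict;
use warnings;

# Hypothetical two-line answer sitting between two markers.
my $text = "Q1\nline one\nline two\nQ2";

# No modifier: "." does not match "\n", so the capture cannot span lines.
my $plain = ($text =~ /Q1(.*?)Q2/) ? 1 : 0;

# /m only changes where ^ and $ may match; "." still stops at newlines.
my $with_m = ($text =~ /Q1(.*?)Q2/m) ? 1 : 0;

# /s lets "." match newlines, so the capture spans the lines in between.
my ($captured) = $text =~ /Q1(.*?)Q2/s;

# What /m is for: ^ matches at the start of every line, not just the string.
my @line_starts = $text =~ /^(\w+)/mg;

print "plain: $plain, with /m: $with_m\n";
print "captured with /s: [$captured]\n";
print "line starts with /m: @line_starts\n";
```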

                - tye (but my friends call me "Tye")
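Putting the pieces of the thread together, one possible way to pull out every answer is sketched below. It makes some assumptions that are not in the thread: the helper name extract_answers is hypothetical, the last question (which has no following question) is matched up to the end of the string, and surrounding whitespace is trimmed from each answer.

```perl
use strict;
use warnings;

my @questions = (
    "1. What is your name?",
    "2. Please list your address and home phone number.",
    "3. What is your mother's maiden name (name before marriage)?",
    "4. Name your four favorite bands or artists:",
);

# Same data as in the post, built with join instead of a here-document.
my $answer = join "\n",
    "1. What is your name?", "Rich",
    "2. Please list your address and home phone number.",
    "12345 Sesame Street", "Footown, OH 45678", "(123)555-12345",
    "3. What is your mother's maiden name (name before marriage)?", "Arnold",
    "4. Name your four favorite bands or artists:",
    "The Beatles", "Black Crowes", "Red Earth", "Matthew Sweet", "";

# extract_answers is a hypothetical helper name, not something from the thread.
sub extract_answers {
    my ($text, @qs) = @_;
    my @answers;
    for my $i (0 .. $#qs) {
        # The terminator is the next question, or end of string for the last one.
        my $end = $i < $#qs ? quotemeta($qs[$i + 1]) : '\z';
        if ($text =~ /\Q$qs[$i]\E(.*?)$end/s) {    # no /g, so pos() plays no part
            my $found = $1;
            $found =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
            push @answers, $found;
        } else {
            push @answers, undef;                  # question not present in the text
        }
    }
    return @answers;
}

my @answers = extract_answers($answer, @questions);
for my $i (0 .. $#questions) {
    my $shown = defined $answers[$i] ? $answers[$i] : '(no match)';
    print "$questions[$i]\n  => $shown\n";
}
```

Multi-line answers keep their internal newlines; only the leading and trailing whitespace is removed.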