alienhuman has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I've got a multi-line text file in the format:

"@QUERY:FOO"
...data...
...data...
...etc...
"@ENDQUERY"
"@QUERY:BAR"
...data...
...data...
...etc...
"@ENDQUERY"

I want to go line-by-line through the file, and grab the data in between @QUERY and @ENDQUERY and do stuff to it (namely put it in a HoH).

Here's the code block in question:

while (<FH>) {
    if ($_ =~ /^\"\@QUERY:(.*)\"/) {
        my $query = $1;
        my $i = 1;
        while (<FH> !~ /^\"\@ENDQUERY\"/) {
            print $_;
            #$data{$query}->{$i}->{$_} unless ($_ eq "");
            $i++;
        }
    }
}

My question: why is $_ not getting the next line of FH in the 2nd while loop? The print statement I have there prints the first pattern matched (e.g. @QUERY:FOO, @QUERY:BAR) for each line of data enclosed by @QUERY/@ENDQUERY until it reaches @ENDQUERY.

Thanks for any help,

AH

----------
Using perl 5.6.1 unless otherwise noted. Apache 1.3.27 unless otherwise noted. Redhat 7.1 unless otherwise noted.

Replies are listed 'Best First'.
Re: $_ and nested while loops w/angle operator
by diotalevi (Canon) on Apr 05, 2004 at 20:00 UTC
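    You didn't assign to $_ in the second loop. The form while (<FH>) is special-cased to mean while ( defined( $_ = readline *FH ) ). That only happens when the read is the whole condition. In <FH> !~ /.../ the line is read, matched, and thrown away, so $_ still holds the line from the outer loop.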
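To make the point above concrete, here is a minimal sketch that runs both forms of the inner loop over the same input. The sample data and the names $text, $in, @seen_buggy and @seen_fixed are made up for illustration, and an in-memory filehandle stands in for FH so that no real file is needed.

```perl
use strict;
use warnings;

# Hypothetical sample data, held in a string so the sketch needs no real file.
my $text = qq{"\@QUERY:FOO"\nalpha\nbeta\n"\@ENDQUERY"\n};

# Original form: the inner condition reads a line but never stores it in $_.
open my $in, '<', \$text or die "cannot open in-memory file: $!";
my @seen_buggy;
while (<$in>) {
    if (/^"\@QUERY:(.*)"/) {
        while (<$in> !~ /^"\@ENDQUERY"/) {
            push @seen_buggy, $_;    # $_ is still the "@QUERY:FOO" line
        }
    }
}
close $in;

# Explicit assignment: every line read by the inner loop lands in $_.
open $in, '<', \$text or die "cannot open in-memory file: $!";
my @seen_fixed;
while (<$in>) {
    if (/^"\@QUERY:(.*)"/) {
        while (defined($_ = <$in>)) {
            last if /^"\@ENDQUERY"/;
            push @seen_fixed, $_;
        }
    }
}
close $in;

print "buggy: $_" for @seen_buggy;
print "fixed: $_" for @seen_fixed;
```

The first loop collects the "@QUERY:FOO" line once per data line, which matches the symptom described in the question. The second collects the data lines themselves.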
Re: $_ and nested while loops w/angle operator
by jdporter (Paladin) on Apr 05, 2004 at 20:08 UTC

    To expand upon what diotalevi said... You should do the pattern match inside the inner while loop, not in the conditional itself.

    Here's how I'd rewrite your code:

    while (<FH>) {
        if ( /^"\@QUERY:(.*)"/ ) {
            my $query = $1;
            while (<FH>) {
                last if /^"\@ENDQUERY"/;
                next if /^$/;    # skip empty lines ($_ keeps its newline, so it never equals '')
                print $_;
                push @{$data{$query}}, $_;
            }
        }
    }

    Note that this makes a HoA, rather than a HoH. Since the indices of your inner hash were simply sequential integers, an array works just as well, if not better.

    PS - You don't need to backwhack quotes inside a /regex/.

    jdporter
    The 6th Rule of Perl Club is -- There is no Rule #6.

      (Continued.)

      My code, above, will create a Hash-of-Arrays in %data. When it's done, you can iterate over the data set using code like the following:

      for my $query ( sort keys %data ) {
          # get all the lines of the query at once, into an array:
          my @query_lines = @{$data{$query}};

          # or iterate over them:
          for my $qline ( @{$data{$query}} ) {
              # ... do something with the line.
          }

          # or, if you really care about the line numbers:
          for my $li ( 0 .. $#{$data{$query}} ) {
              my $qline = $data{$query}[$li];
              print "query $query, line $li: $qline";
          }
      }

      Of course, the possibilities are endless; so if you are having trouble working with the data once you've read it in, we'd be happy to help.


        jdporter,

        Update: I should've kept reading perlreftut! The answer is there for anyone else who is still learning how to use references. Thank you all for your help.


        Thanks for the further elaboration. For some reason, I am having a hard time understanding what is happening here:

Re: $_ and nested while loops w/angle operator
by pbeckingham (Parson) on Apr 05, 2004 at 20:50 UTC

    Or, that wonderful scalar range operator again...

    my %data = ();
    my $query;
    while (<FH>) {
        if (my $result = /^"\@QUERY:(.*?)"$/ .. /^"\@ENDQUERY"$/) {
            next if $result =~ /E/;
            if ($result == 1) {
                $query = $1;
                next;
            }
            push @{$data{$query}}, $_;
        }
    }

      My head nearly exploded looking over that. Could you explain a little of the logic there, or point me to the needed resource?

      Update: After RTFM I learned about the .. operator. I had read about it before, but seeing it in this context helped bring it to light. However, I'm still confused by next if $result =~ /E/; Why do you care whether it has a capital E in it? Or did I miss something else?


      ___________
      Eric Hodges
        The '..' and '...' range operators return a number with 'E0' appended to it when the second condition matches (indicating that you are at the End of your range). That way you can tell if it is the last line with /E/ or /E0/ (e.g. if you only want the stuff in between the start and end markers), and the return value is still valid as a number. See perlop.
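As a rough illustration of those return values, the sketch below records what the scalar-context range operator yields for each line of a small list. The START/END markers and the names @lines, @trace and $r are invented for the example and are not taken from the thread's data format.

```perl
use strict;
use warnings;

# Hypothetical input: one marked range with a line on either side of it.
my @lines = ('before', 'START', 'a', 'b', 'END', 'after');

my @trace;
for (@lines) {
    # Scalar context makes .. a flip-flop; both patterns match against $_.
    my $r = /^START$/ .. /^END$/;

    # Outside the range the operator returns the empty string (false).
    push @trace, "$_=" . ($r eq '' ? 'false' : $r);
}

print join(', ', @trace), "\n";
```

The sequence number counts up from 1 at the start marker, and the value on the end-marker line is the one carrying the E0 suffix. That is why next if $result =~ /E/ skips the closing line, while $result == 1 picks out the opening one.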
Re: $_ and nested while loops w/angle operator
by runrig (Abbot) on Apr 05, 2004 at 20:56 UTC
    Why not use the range operator and make the issue moot:
    my $query;
    while (<FH>) {
        if (my $i = /^"\@QUERY:(.*)"/.../^"\@ENDQUERY"/) {
            $query = $1, next if $i == 1;
            next if $i =~ /E/;
            print;
        }
    }
      Won't work. You're trying to match a multiline string on one line of input.

      must... wake.. up

      cLive ;-)

Re: $_ and nested while loops w/angle operator
by matija (Priest) on Apr 05, 2004 at 20:17 UTC
    diotalevi explained the reason, let me just provide the corrected code:
    while (($_=<FH>) !~ /^\"\@ENDQUERY\"/) {
    (Note that this code will get into trouble if there is no @ENDQUERY at the end of the file.)
      Easy to fix that.
      while ((($_=<FH>) !~ /^\"\@ENDQUERY\"/) or (eof FH)) {
      Update: I think you'd better write it like:
      while ((($_=<FH>) !~ /^\"\@ENDQUERY\"/) and (not eof FH)) {
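One caveat with the eof test above, as far as I can tell: eof is already true once the final line has been read, so the loop body is skipped for that last line. A hedged alternative is to let defined() end the loop and test for the terminator inside the body. The sketch below assumes made-up sample data with the @ENDQUERY line deliberately missing; the names $text, $fh, $line and %data are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical input with no closing "@ENDQUERY" line.
my $text = qq{"\@QUERY:FOO"\nalpha\nbeta\n};
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my %data;
while (defined(my $line = <$fh>)) {
    next unless $line =~ /^"\@QUERY:(.*)"/;
    my $query = $1;

    # defined() stops the loop at end of file, so a missing
    # terminator cannot cause an endless loop or a lost last line.
    while (defined($line = <$fh>)) {
        last if $line =~ /^"\@ENDQUERY"/;
        chomp $line;
        push @{ $data{$query} }, $line;
    }
}
close $fh;

print "$_: @{ $data{$_} }\n" for sort keys %data;
```

Using a lexical $line instead of $_ also avoids the inner loop clobbering the outer loop's variable, though that is a style choice rather than a requirement.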