bratwiz has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a program and am having a problem with a file I opened being prematurely eof'd... going out of scope somehow, perhaps? Not sure exactly what's going on. I'm prepared to admit that it's something I don't know or understand properly, and to be educated. Can anybody shed any light on why this little program prematurely reaches end-of-file? It gets through one iteration and returns the proper amount of data. On the second go-around it says it's at end-of-file when there is clearly more data available (the actual dataset I'm using is almost 8MB, so it definitely ain't lack of data!). I've stared at it for quite a while and reluctantly admit I just don't see it.
#! /bin/perl -w

## test to see when/how a file goes out of scope

use strict;

my $file = 'test.data';
my $frame_marker = 'data';

open(FILE_FD, $file) or die($!);

foreach my $loop (1..10) {
    my $data = get_replay_data();
    print "DATA READ:\n", map{ chomp; " VALUE='$_'\n" } @$data;
}

sub get_replay_data {
    # read from replay file and return a block of data

    my $found = 0;
    my @data = ();

    die("ERROR -- File handle has closed (why??): $!") if (eof(FILE_FD));

    foreach my $line (<FILE_FD>) {    # pull records from replay file

        ## wait for begin-marker to capture data
        if ($line =~ /<$frame_marker>/i) {    # wait for start-of-frame marker
            $found++;
            push @data, $line;
            next;
        }

        if (!$found) { next; }

        ## wait for end-marker & return data
        if ($line =~ /<\/$frame_marker>/i) {
            push @data, $line;
            return \@data;    # return xml data if end-of-frame marker found
        }

        push @data, $line;    # otherwise grab frame data
    }
    return \@data;
}

__END__
--- Some sample data to work with:
<data>
  <string>1 some stuff</string>
  <string>1 some more stuff</string>
  <string>1 yet more stuff</string>
  <string>1 enough stuff</string>
</data>
<data>
  <string>2 some stuff</string>
  <string>2 some more stuff</string>
  <string>2 yet more stuff</string>
  <string>2 enough stuff</string>
</data>
<data>
  <string>3 some stuff</string>
  <string>3 some more stuff</string>
  <string>3 yet more stuff</string>
  <string>3 enough stuff</string>
</data>

Re: Premature End-of-File - Scope problems?
by liverpole (Monsignor) on Feb 15, 2007 at 00:03 UTC
    Hi bratwiz,

    It's because this line:

    foreach my $line (<FILE_FD>) { # pull records from replay file

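    is reading every line from the file into memory. foreach evaluates <FILE_FD> in list context, so the first call to get_replay_data() consumes the whole file; the lines after the first frame are discarded when the sub returns, and the second call finds the handle at end-of-file.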
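
    To see the effect in isolation, here is a minimal sketch. It assumes a made-up three-line input held in an in-memory filehandle (the names $text, $fh, $seen and $at_eof are only for illustration) and leaves the loop after one line, much as the early return does in the original sub:

```perl
use strict;
use warnings;

# Hypothetical three-line input, held in memory so no data file is needed.
my $text = "line 1\nline 2\nline 3\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my $seen = 0;
foreach my $line (<$fh>) {    # list context: every line is read up front
    $seen++;
    last;                     # stop after one line, like the early return
}

# The handle is already exhausted even though only one line was processed.
my $at_eof = eof($fh) ? 1 : 0;
print "lines processed: $seen, at eof: $at_eof\n";
```

    Only one line is processed, yet eof() is already true, because the list was built before the loop body ever ran.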

    You should read only as far as you need to each time.

    For example:

    while (my $line = <FILE_FD>) { # just read a line at a time
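
    As a fuller sketch of that approach (not the original program): the sub below takes the handle and marker as arguments, and the sample data, the name get_frame and the lexical filehandle are assumptions made so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample input, held in memory instead of test.data.
my $sample = <<'END_XML';
<data>
  <string>1 some stuff</string>
  <string>1 some more stuff</string>
</data>
<data>
  <string>2 some stuff</string>
  <string>2 some more stuff</string>
</data>
<data>
  <string>3 some stuff</string>
  <string>3 some more stuff</string>
</data>
END_XML

open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";

# Read one <marker> ... </marker> block and return it as an array ref.
# Returns undef when no further frame data is found.
sub get_frame {
    my ($in, $marker) = @_;
    my @frame;
    my $found = 0;

    # Scalar context: each pass reads exactly one line, so everything after
    # the end marker stays in the file for the next call.
    while (my $line = <$in>) {
        $found = 1 if $line =~ /<\Q$marker\E>/i;
        next unless $found;
        push @frame, $line;
        last if $line =~ m{</\Q$marker\E>}i;
    }
    return @frame ? \@frame : undef;
}

my $frames = 0;
while (my $frame = get_frame($fh, 'data')) {
    $frames++;
    print "FRAME $frames:\n";
    print "   $_" for @$frame;
}
print "Read $frames frames\n";
```

    Returning undef at end of input lets the caller loop until the data runs out instead of guessing an iteration count.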

    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
      DUH! (looking sheepish now) You're absolutely right. And the dumbest bit of all is I _do_ use that construct nearly 99.999% of the time and for whatever reason just didn't this time. I love this site. It teaches me humility :) Thanks
Re: Premature End-of-File - Scope problems?
by GrandFather (Saint) on Feb 15, 2007 at 00:25 UTC

    Your data looks rather like it may be XML, in which case you may be better off looking at some of the XML modules such as XML::Twig and XML::TreeBuilder. As an example of how these things can help, consider:

    #! /bin/perl -w

    use strict;
    use warnings;
    use XML::TreeBuilder;

    my $root = XML::TreeBuilder->new ();
    $root->parse (do {local $/; <DATA>;});

    my $frame_marker = 'data';
    my @dataNodes = $root->look_down ('_tag', 'data');
    my $nodeCount;

    for my $nodeIndex (0 .. @dataNodes - 1) {
        my @strings = $dataNodes[$nodeIndex]->look_down ('_tag', 'string');

        print "Data node " . ($nodeIndex + 1) . "\n";
        print " ", $_->as_text (), "\n" for @strings;
    }

    __DATA__
    <root>
    <data>
      <string>1 some stuff</string>
      <string>1 some more stuff</string>
      <string>1 yet more stuff</string>
      <string>1 enough stuff</string>
    </data>
    <data>
      <string>2 some stuff</string>
      <string>2 some more stuff</string>
      <string>2 yet more stuff</string>
      <string>2 enough stuff</string>
    </data>
    <data>
      <string>3 some stuff</string>
      <string>3 some more stuff</string>
      <string>3 yet more stuff</string>
      <string>3 enough stuff</string>
    </data>
    </root>

    Prints:

    Data node 1
     1 some stuff
     1 some more stuff
     1 yet more stuff
     1 enough stuff
    Data node 2
     2 some stuff
     2 some more stuff
     2 yet more stuff
     2 enough stuff
    Data node 3
     3 some stuff
     3 some more stuff
     3 yet more stuff
     3 enough stuff

    DWIM is Perl's answer to Gödel
      Yes, it is XML. I do have that squared away. For my purposes I don't need a tree, just to gracefully flatten it out for use in a monitor. Wanted it to be XML so I'd be positioned to go to a more standards-based solution in the future (I'm programming at the point of a gun lately-- my boss wants it now now now-- actually that's not true, he wants it _yesterday_ :) So quick-n-dirty is the order today.

        "Quick and dirty" often turns to "slow and messy". For example, compare the two "test" scripts in this thread. ;)

        Using XML::TreeBuilder was "Quick and dirty". An XML::Twig solution is likely to be more appropriate if this needs to scale. Consider:

        #! /bin/perl -w

        use strict;
        use warnings;
        use XML::Twig;

        my $dataCount = 0;
        my $str = do {local $/; <DATA>};
        my $t= XML::Twig->new (twig_roots => {data => \&data});

        $t->parse ($str);

        sub data {
            my ($t, $data) = @_;

            ++$dataCount;
            print "Data node $dataCount\n";
            my @strings = $data->descendants ('string');
            print " ", $_->trimmed_text (), "\n" for @strings;
        }

        __DATA__

        Using the same data as the previous sample, this prints:

        Data node 1
         1 some stuff
         1 some more stuff
         1 yet more stuff
         1 enough stuff
        Data node 2
         2 some stuff
         2 some more stuff
         2 yet more stuff
         2 enough stuff
        Data node 3
         3 some stuff
         3 some more stuff
         3 yet more stuff
         3 enough stuff

        Note that both samples cheat by wrapping a root element around the data elements. The data as originally posted has several top-level elements, and a well-formed XML document must have exactly one root.
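
        If the real file holds only the sibling data elements, one way to apply the same trick is to add the wrapper in code before handing the text to a parser. This is only a sketch using core Perl: the fragment text, the variable names and the element name "root" are all made up for illustration, and an 8MB file would be read from disk rather than from a here-doc.

```perl
use strict;
use warnings;

# Hypothetical fragment stream: sibling <data> elements with no enclosing root.
my $fragments = <<'END_XML';
<data>
  <string>1 some stuff</string>
</data>
<data>
  <string>2 some stuff</string>
</data>
END_XML

# Wrap the fragments in a single root element; "root" is an arbitrary name.
my $document = "<root>\n" . $fragments . "</root>\n";

# Count the <data> elements that now sit under the one root.
my $data_count = () = $document =~ /<data>/g;
print "wrapped $data_count data elements in one root\n";
```

        The resulting $document string could then be passed to either parser's parse method in place of the <DATA> slurp used in the samples.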

