Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Recursive file opening: reading an undef at the end of file

by thor (Priest)
on Dec 23, 2003 at 19:03 UTC ( [id://316683]=perlquestion: print w/replies, xml ) Need Help??

thor has asked for the wisdom of the Perl Monks concerning the following question:

Greetings all, I'm confused with a little piece of code I've written. I was asked to come up with a program that would take two or more files, and tell what action would be taken by the first that would not be taken by any of the others. The file's format is for each line, it is either a colon separated list, starts with an 'I' which means to include another file, comments begin with a #, and blank lines are ignored. So, I recursively parse these files (when I hit an 'I', I open that file and process it). What I find wierd is that the while(<$fh>) structure seems to be reading an extra line at the end of each file that is more than one level deep in the recursion. I "fixed" the problem by turning warnings off, but for curiosity's sake I would like to know what's going on. Any ideas? My code is below
#!/usr/local/perl/bin/perl -w use strict; my %hash; my $cal = shift; read_cal($cal, 0); foreach my $file (@ARGV) { read_cal($file, 1); } foreach my $key (sort keys %hash) { print "$key\n" if $hash{$key} == 1; } sub read_cal { my $cal = shift; my $fh; print "---------------$cal-----------------\n"; my $mode; open($fh, $cal) or die; while(<$fh>) { chomp; s/\s//g; next if /^#/; next if /^\s*$/; next if $_ eq ""; if (/^I/) { read_cal(substr($_,1), $mode); } print "|$_|\n"; my $schedule = ((split(':'))[-1]); if (!$mode || exists($hash{$schedule}) ) { $hash{$schedule}++; } } }

thanks in advance,
thor

Replies are listed 'Best First'.
Re: Recursive file opening: reading an undef at the end of file
by davido (Cardinal) on Dec 23, 2003 at 22:30 UTC
    Let me walk through one peculiar segment of the code...

    while ( <$fh> ) { chomp; # Remove trailing newlines (a subset of \s) s/\s//g; # Remove ALL whitespace. next if /^#/; # Skip comments. next if /^\s*$/; # Skip lines with space, or lines with nothing. # This doesn't make sense because you # already removed all space. But it # works because you allow it to match # 'nothing' too. next if $_ eq "";# This line is never true, because # you already skipped lines with nothing # in the previous statement. #......

    In other words, much of what you're doing there is either redundant or otherwise peculiar.

    Next, your "read_cal" function is being called with two arguments, but within the function, you're only using the first argument.

    I didn't even look at what you're doing in the if (!$mode... code, because I don't see $mode being set.

    Also, you're only printing out hash elements with a value equal to one, though it might be simpler to just check for the existance of a hash element.

    Update: None of these are the central problem. Congrats Roger for finding the straw that breaks this camel's back. I found a lot of other issues, but you found the big one. I'd recommend considering reworking the code a bit based on the suggestions people have given in this thread though, as the result will be cleaner.


    Dave

      Yup...I had a lot of extraneous stuff in there because I was trying to get rid of the error. Originally, I had:
      while(<$fh>) { chomp; next if /^#/; next if /^\s*$/; ...
      I'd like to think that I'm not a complete dolt...:-)

      thor

Re: Recursive file opening: reading an undef at the end of file
by Roger (Parson) on Dec 23, 2003 at 23:06 UTC
    In your original code, the value of $_ gets clobbered after the recursive call to read_cal. The work around is to save the input line into a temporary variable, instead of relying on the Perl built-in $_ variable which can get modifed due to side effects somewhere else in the script.

    I have modified your script and I believe the following will work without problem -
    #!/usr/local/perl/bin/perl -w use strict; my %hash; my $cal = shift; read_cal($cal, 0); foreach my $file (@ARGV) { read_cal($file, 1); } foreach my $key (sort keys %hash) { print "$key\n" if $hash{$key} == 1; } sub read_cal { my ($cal, $mode) = @_; print "---------------$cal-----------------\n"; my $fh; open($fh, $cal) or die; while(my $line = <$fh>) { chomp($line); next if $line =~ /^\s*$/; next if $line =~ /^#/; $line =~ s/\s//g; # print "before $cal, |$line|\n"; read_cal(substr($line,1), $mode) if $line =~ /^I/; print "|$line|\n"; my $schedule = (split ':', $line)[-1]; if (!$mode || exists $hash{$schedule}) { $hash{$schedule}++; }; } }
Re: Recursive file opening: reading an undef at the end of file
by Roy Johnson (Monsignor) on Dec 23, 2003 at 19:36 UTC
    You never assign $mode. What is it supposed to be for? You also never use the 2nd argument to read_cal

    Your test for /^\s*$/ is redundant, because you've already removed all whitespace, and you test for an empty line.

    I think your main problem is that you should have a next after the recursive call, or put the rest of the stuff in an else. You don't want to read a file and parse the line, do you?


    The PerlMonk tr/// Advocate
      As for mode not being set...while unfortunate, that wasn't causing the problem (though it did cause incorrect results). The next after the recursive call did it though. Sometimes a guy just needs a second set of eyes. Thanks!

      thor

Re: Recursive file opening: reading an undef at the end of file
by pg (Canon) on Dec 24, 2003 at 01:43 UTC

    Not trying to answer your question, but where is your close? Once you are done with an opened file, just close it. Regardless what Perl might do for you towards those open files, it is a bad style not to close the files you opened. People might use those little things to judge whether you are professional.

      Because a lexical filehandle was used, the file opened by each call to the subroutine will be closed when the that call to the subroutine returns.

      There may be a problem if you are expecting hundreds of nested calls, but otherwise there is nothing wrong here. Each file is closed immediately after reaching its end.

      it is a bad style not to close the files you opened
      A few things:
      1. This was a quick and dirty script. Wrote it in about 5 minutes. Were this a production level script, I'd have done things differently.
      2. As I was using scalars ($fh) instead of barewords (FH), and I was lexically scoping them with 'my', the files are closed when the scalar goes out of scope. I tend to use features of the language when available.

      thor

        Point taken, it was quick and dirty. All the same, I consider it a good idea to have the habit of writing the close when you write the open. But then, too, I'm one of the those people that fastens the safety belt by habit, even if I'm going two blocks. So have a grain of salt with this post.
Re: Recursive file opening: reading an undef at the end of file
by Anonymous Monk on Dec 24, 2003 at 13:11 UTC
    I suspect you mean
    if (/^I/) { read_cal(substr($_,1), $mode); } else { print "|$_|\n"; }
    rather than
    if (/^I/) { read_cal(substr($_,1), $mode); } print "|$_|\n";
    This is because your $_ variable has been altered by the recursion, so using it is no longer appropriate! If you really want to keep it, you'll have to use a named, local variable.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://316683]
Approved by sth
Front-paged by Roger
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (3)
As of 2024-03-29 01:55 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found