thor has asked for the wisdom of the Perl Monks concerning the following question:
Greetings all,
I'm confused with a little piece of code I've written. I was asked to come up with a program that would take two or more files, and tell what action would be taken by the first that would not be taken by any of the others. The file's format is for each line, it is either a colon separated list, starts with an 'I' which means to include another file, comments begin with a #, and blank lines are ignored. So, I recursively parse these files (when I hit an 'I', I open that file and process it). What I find wierd is that the while(<$fh>) structure seems to be reading an extra line at the end of each file that is more than one level deep in the recursion. I "fixed" the problem by turning warnings off, but for curiosity's sake I would like to know what's going on. Any ideas? My code is below
#!/usr/local/perl/bin/perl -w
use strict;
my %hash;
my $cal = shift;
read_cal($cal, 0);
foreach my $file (@ARGV) {
read_cal($file, 1);
}
foreach my $key (sort keys %hash) {
print "$key\n" if $hash{$key} == 1;
}
sub read_cal {
my $cal = shift;
my $fh;
print "---------------$cal-----------------\n";
my $mode;
open($fh, $cal) or die;
while(<$fh>) {
chomp;
s/\s//g;
next if /^#/;
next if /^\s*$/;
next if $_ eq "";
if (/^I/) {
read_cal(substr($_,1), $mode);
}
print "|$_|\n";
my $schedule = ((split(':'))[-1]);
if (!$mode || exists($hash{$schedule}) ) {
$hash{$schedule}++;
}
}
}
thanks in advance, thor
Re: Recursive file opening: reading an undef at the end of file
by davido (Cardinal) on Dec 23, 2003 at 22:30 UTC
|
Let me walk through one peculiar segment of the code...
while ( <$fh> ) {
chomp; # Remove trailing newlines (a subset of \s)
s/\s//g; # Remove ALL whitespace.
next if /^#/; # Skip comments.
next if /^\s*$/; # Skip lines with space, or lines with nothing.
# This doesn't make sense because you
# already removed all space. But it
# works because you allow it to match
# 'nothing' too.
next if $_ eq "";# This line is never true, because
# you already skipped lines with nothing
# in the previous statement.
#......
In other words, much of what you're doing there is either redundant or otherwise peculiar.
Next, your "read_cal" function is being called with two arguments, but within the function, you're only using the first argument.
I didn't even look at what you're doing in the if (!$mode... code, because I don't see $mode being set.
Also, you're only printing out hash elements with a value equal to one, though it might be simpler to just check for the existance of a hash element.
Update: None of these are the central problem. Congrats Roger for finding the straw that breaks this camel's back. I found a lot of other issues, but you found the big one. I'd recommend considering reworking the code a bit based on the suggestions people have given in this thread though, as the result will be cleaner.
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Yup...I had a lot of extraneous stuff in there because I was trying to get rid of the error. Originally, I had:
while(<$fh>) {
chomp;
next if /^#/;
next if /^\s*$/;
...
I'd like to think that I'm not a complete dolt...:-)
thor
| [reply] [Watch: Dir/Any] [d/l] |
Re: Recursive file opening: reading an undef at the end of file
by Roger (Parson) on Dec 23, 2003 at 23:06 UTC
|
In your original code, the value of $_ gets clobbered after the recursive call to read_cal. The work around is to save the input line into a temporary variable, instead of relying on the Perl built-in $_ variable which can get modifed due to side effects somewhere else in the script.
I have modified your script and I believe the following will work without problem -
#!/usr/local/perl/bin/perl -w
use strict;
my %hash;
my $cal = shift;
read_cal($cal, 0);
foreach my $file (@ARGV) {
read_cal($file, 1);
}
foreach my $key (sort keys %hash) {
print "$key\n" if $hash{$key} == 1;
}
sub read_cal {
my ($cal, $mode) = @_;
print "---------------$cal-----------------\n";
my $fh;
open($fh, $cal) or die;
while(my $line = <$fh>) {
chomp($line);
next if $line =~ /^\s*$/;
next if $line =~ /^#/;
$line =~ s/\s//g;
# print "before $cal, |$line|\n";
read_cal(substr($line,1), $mode) if $line =~ /^I/;
print "|$line|\n";
my $schedule = (split ':', $line)[-1];
if (!$mode || exists $hash{$schedule}) {
$hash{$schedule}++;
};
}
}
| [reply] [Watch: Dir/Any] [d/l] |
Re: Recursive file opening: reading an undef at the end of file
by Roy Johnson (Monsignor) on Dec 23, 2003 at 19:36 UTC
|
You never assign $mode. What is it supposed to be for? You also never use the 2nd argument to read_cal
Your test for /^\s*$/ is redundant, because you've already removed all whitespace, and you test for an empty line.
I think your main problem is that you should have a next after the recursive call, or put the rest of the stuff in an else. You don't want to read a file and parse the line, do you?
The PerlMonk tr/// Advocate
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
As for mode not being set...while unfortunate, that wasn't causing the problem (though it did cause incorrect results). The next after the recursive call did it though. Sometimes a guy just needs a second set of eyes. Thanks!
thor
| [reply] [Watch: Dir/Any] |
Re: Recursive file opening: reading an undef at the end of file
by pg (Canon) on Dec 24, 2003 at 01:43 UTC
|
Not trying to answer your question, but where is your close? Once you are done with an opened file, just close it. Regardless what Perl might do for you towards those open files, it is a bad style not to close the files you opened. People might use those little things to judge whether you are professional.
| [reply] [Watch: Dir/Any] |
|
| [reply] [Watch: Dir/Any] |
|
it is a bad style not to close the files you opened
A few things:
- This was a quick and dirty script. Wrote it in about 5 minutes. Were this a production level script, I'd have done things differently.
- As I was using scalars ($fh) instead of barewords (FH), and I was lexically scoping them with 'my', the files are closed when the scalar goes out of scope. I tend to use features of the language when available.
thor
| [reply] [Watch: Dir/Any] |
|
Point taken, it was quick and dirty. All the same, I consider it a good idea to have the habit of writing the close when you write the open. But then, too, I'm one of the those people that fastens the safety belt by habit, even if I'm going two blocks. So have a grain of salt with this post.
| [reply] [Watch: Dir/Any] |
Re: Recursive file opening: reading an undef at the end of file
by Anonymous Monk on Dec 24, 2003 at 13:11 UTC
|
if (/^I/) {
read_cal(substr($_,1), $mode);
} else {
print "|$_|\n";
}
rather than if (/^I/) {
read_cal(substr($_,1), $mode);
}
print "|$_|\n";
This is because your $_ variable has been altered by the recursion, so using it is no longer appropriate! If you really want to keep it, you'll have to use a named, local variable. | [reply] [Watch: Dir/Any] [d/l] [select] |
|
|