jspindler has asked for the wisdom of the Perl Monks concerning the following question:

OK, I wrote a Perl script that parses our credit card files. In a nutshell it:
1. Reads a directory listing and pushes all the filenames into the array @dirlisting
2. Goes through foreach (@dirlisting), opens each file, pulls the data I want out of it with substr, and writes it to a new file, removing any dupes if the dupes are in a certain field:
    $lines{$ccn}++;
    if ($lines{$ccn} == 2) {
        push @dupes, $ccn;
    }

3. Then we take our @dupes array and write that to a dupes file, and if they're not dupes:
    if (!$dupez{$ccn}) {
then we write to a newmaster.txt file.
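
The three steps above can be sketched end-to-end like this (a minimal standalone sketch with made-up sample numbers, not the real script):

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Standalone sketch of the dedup flow described above, with made-up numbers.
my (%seen, @dupes, @unique);
my @ccns = ('4111111111111111', '5500000000000004', '4111111111111111');

# pass 1: count every card number; the second sighting marks a dupe
for my $ccn (@ccns) {
    $seen{$ccn}++;
    push @dupes, $ccn if $seen{$ccn} == 2;
}

# pass 2: keep only numbers that were never duplicated
my %dupez = map { $_ => 1 } @dupes;
for my $ccn (@ccns) {
    push @unique, $ccn unless $dupez{$ccn};
}

print "dupes:  @dupes\n";   # 4111111111111111
print "unique: @unique\n";  # 5500000000000004
```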

Now everything seems to work OK, but it was just reported to me that one of the credit card numbers from the original file came out in reversed order! i.e.

Original File:
firstnumber
secondnumber

After PerlScript:
secondnumber
firstnumber


Now the reason this is a problem is because we substr the entire line, so the other fields that correspond with those numbers were not switched! So I have mismatched numbers. Any ideas as to why this might happen?
Thanks for any help.

Edit kudra, 2002-02-23 Added description to title

Replies are listed 'Best First'.
Re: funny results
by YuckFoo (Abbot) on Feb 20, 2002 at 22:45 UTC
    The code you have shared is too disjointed. You push @dupes, then @dupes is never used again. You reference %dupes, but where did it come from? Most important would be seeing the code that generates the output.

    YuckFoo

      OK, here we go. I don't know where the +'s are coming from, but they are not in the code... the red pluses, that is.
      #!/usr/local/bin/perl
      #+---------------------------------------------+
      #| Configuration Options                       |
      #+---------------------------------------------+
      $dirname    = "C:\\Working\\";
      $mastertemp = "C:\\master.txt";
      $dupefile   = "C:\\dupes.txt";
      $newmaster  = "C:\\newmaster.txt";

      main();

      sub main {
          dirlist();
          extract_data();
          output_results();
      }

      sub dirlist {
          # Pull Directory Listing
          opendir(DIR, $dirname) or die "can't opendir $dirname: $!";  # open the directory
          while (defined($file = readdir(DIR))) {                      # read the directory
              next if $file =~ /^\.\.?$/;
              push(@dirlisting, $file);                                # put dirlist in @dirlisting
          }
      }

      sub extract_data {
          # Get list of archived files
          %lines;
          %dupez;
          foreach (@dirlisting) {
              $filename1 = $_;
              open(INPUTFILE, "$dirname" . "$filename1");
              open(MASTERTEMP, ">>$mastertemp");
              open(DUPES, ">>$dupefile");
              while (<INPUTFILE>) {
                  $custID = substr($_,  0,  9);
                  $ccn    = substr($_, 22, 16);
                  $expy   = substr($_, 44,  4);
                  $type   = substr($_, 50,  3);
                  $amount = substr($_, 58,  6);
                  # $amount =~ s/^\s+//;
                  if (($custID lt 0) || ($custID eq "Name ")) {
                  } else {
                      dupeczech();
                      print MASTERTEMP "$custID"." "."$ccn"." "."$expy"." "."$type"." "."$amount\n";
                  }
              }
              close(INPUTFILE);
              close(MASTERTEMP);
              close(DUPES);
          }
      }

      sub dupeczech {
          $lines{$ccn}++;           # add one to the value; if it hasn't been defined yet it sets it equal to 1
          if ($lines{$ccn} == 2) {  # been seen before so add to dupes list
              push @dupes, $ccn;
              # print "$ccn $custID $lines{$ccn}\n";
              # print DUPES "$custID"." "."$ccn"." "."$expy\n";
          }
      }

      sub output_results {
          open(MASTERTEMP, "$mastertemp");
          open(DUPES, ">>$dupefile");
          open(NEWMASTER, ">>$newmaster");
          while (<MASTERTEMP>) {
              $custID = substr($_,  0,  9);
              $ccn    = substr($_, 22, 16);
              $expy   = substr($_, 44,  4);
              $type   = substr($_, 50,  3);
              $amount = substr($_, 58,  6);
              foreach (@dupes) {
                  if ($ccn eq $_) {
                      print DUPES "$custID"." "."$ccn"." "."$expy"." "."$type"." "."$amount\n";
                      $dupez{$ccn}++;
                  }
              }
              if (!$dupez{$ccn}) {
                  print NEWMASTER "$custID"." "."$ccn"." "."$expy"." "."$type"." "."$amount\n";
              }
          }
          close(NEWMASTER);
          close(MASTERTEMP);
          close(DUPES);
      }
        jspindler,

        I see that $custID, $ccn, ... $amount are all set at the same time, are not changed, then output. Nothing here would account for fields of different records getting mixed up. I don't think this version of the program is responsible for the mix up.

        A wild guess at the problem: $mastertemp is being opened in append mode. Unless you delete the file before running the program, you are processing results from previous runs of the program. Perhaps an earlier version of the program made the swaps.

        As was suggested, you already have a hash of every $ccn you have seen in %lines. Why iterate over @dupes to find duplicates? Just check %lines before writing to DUPES.
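
        For example (a standalone sketch with assumed counts; in the real script %lines is filled during extract_data):

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Sketch: %lines already holds how many times each card number was
# seen, so the dupe test is a single hash lookup, no loop over @dupes.
my %lines = (
    '4111111111111111' => 2,   # seen twice: a dupe
    '5500000000000004' => 1,
);

my (@to_dupes, @to_newmaster);
for my $ccn (sort keys %lines) {
    if ($lines{$ccn} > 1) { push @to_dupes,     $ccn }
    else                  { push @to_newmaster, $ccn }
}

print "dupes:     @to_dupes\n";       # 4111111111111111
print "newmaster: @to_newmaster\n";   # 5500000000000004
```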

        Some suggestions about your file opening. You are opening MASTERTEMP and DUPES inside the foreach (@dirlisting) loop, reopening them each time you process a file. That's not necessary: open them once, outside the loop, and open in append mode only if you really need to; otherwise just open for writing. Also very important: always, always check the return value of open().
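
        A sketch of that pattern (the filenames here are made up for illustration, not your real paths):

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Sketch: open the output once, before the per-file loop, in plain
# write mode (">") instead of append (">>"), and check every open().
my $mastertemp = 'master.txt';          # assumed path for illustration

open(MASTERTEMP, ">$mastertemp")
    or die "can't open $mastertemp: $!";

my @dirlisting = ('a.dat', 'b.dat');    # stand-ins for the real listing
foreach my $file (@dirlisting) {
    # each input file would be opened (and checked!) here, e.g.:
    # open(INPUTFILE, $file) or die "can't open $file: $!";
    print MASTERTEMP "processed $file\n";
}
close(MASTERTEMP);
```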

        Another goal you should work towards is to quit using global variables. Variables needed by functions should be passed to the functions. You will reap huge benefits by doing this in larger programs in your future.

        Again, I don't see the fields getting mixed up in this code. Run the program again, starting with a new, empty MASTERTEMP file, and see if the problem persists.

        YuckFoo

        I agree that the error you're seeing shouldn't be caused by this code, and that you should check to make sure that you wipe out your old dupes file before generating new results. Check out the other comments YuckFoo made on how you deal with files.

        One other minor note, on style. Lines like this:

        print NEWMASTER "$custID"." "."$ccn"." "."$expy"." "."$type"." "."$amount\n";
        ...can be more cleanly written like this:
        print NEWMASTER "$custID $ccn $expy $type $amount\n";
        Or even better, you could have used a format.
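
        For instance, fixed-width columns can also be done with sprintf (the widths below are made up for illustration, not taken from the real record layout):

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Sketch: build the output line with sprintf field widths instead of
# concatenating quoted variables and spaces.
my ($custID, $ccn, $expy, $type, $amount) =
    ('CUST00001', '4111111111111111', '0305', 'VIS', '019.95');

my $line = sprintf "%-9s %-16s %-4s %-3s %-6s\n",
                   $custID, $ccn, $expy, $type, $amount;
print $line;    # CUST00001 4111111111111111 0305 VIS 019.95
```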

        buckaduck

Re: funny results
by trs80 (Priest) on Feb 21, 2002 at 00:41 UTC
    Can't you just do:
    # add the $cnn to an array to keep order
    push @good, $cnn if $lines{$cnn} ne '1';

    # add a key to a hash so you don't have to
    # loop through the array each time
    $lines{$cnn} = 1;

    Then you don't have any dupes, you just have the cc numbers.
    I am not 100% sure what you are asking, because it looks like you have much more going on than was presented. If I am way off, please help clue me in.
Re: funny results
by jspindler (Initiate) on Feb 21, 2002 at 01:18 UTC
    No I have to have the dupes sent to a file so I can check them manually for errors.
Re: funny results
by YuckFoo (Abbot) on Feb 20, 2002 at 22:46 UTC
    One problem I see is you are pushing $cnn onto array @dupes, then treating it as a hash when checking for duplicates. You should probably $dupes{$cnn}++ instead of push @dupes, $cnn.
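
    In other words (a standalone sketch with assumed names and a made-up number):

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Sketch: record dupes in a hash so the later check is one lookup
# instead of a foreach over @dupes for every output line.
my %dupes;
$dupes{'4111111111111111'}++;    # instead of: push @dupes, $ccn

my @status;
for my $ccn ('4111111111111111', '5500000000000004') {
    push @status, $dupes{$ccn} ? "$ccn dupe" : "$ccn clean";
}
print "$_\n" for @status;
```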

    YuckFoo

    Originally posted as a Categorized Answer.
