sheasbys has asked for the wisdom of the Perl Monks concerning the following question:
Thank you.

    use File::Copy;

    my($input_file1) = $ARGV[0];
    my($input_file2) = $ARGV[1];
    my($output_file) = $ARGV[2];

    if ( !defined($input_file1) || !defined($input_file2) || !defined($output_file) ) {
        print "Error: usage: nodups input_file1 input_file2 output_file\n";
    }
    else {
        # -----Backup the input files in case of error-----
        copy( $input_file1, $input_file1 . ".bak" )
            or die "Could not backup file 1 $input_file1 to $input_file1.bak: $!\n";
        copy( $input_file2, $input_file2 . ".bak" )
            or die "Could not backup file 2 $input_file2 to $input_file2.bak: $!\n";

        # -----Attempt to open all of the files-----
        open( INFILE1, $input_file1 ) || die( "Could not read input file 1 ($input_file1): $!" );
        open( INFILE2, $input_file2 ) || die( "Could not read input file 2 ($input_file2): $!" );
        open( OUTPUT, "> " . $output_file ) || die( "Could not open output file ($output_file): $!" );

        # -----Read input_file2 into an array so that (later) we can do a binary search-----
        @input2 = <INFILE2>;

        # -----Debug code. Add in if you are experiencing problems. Note that this is used below to print-----
        # -----out the current line number-----
        # $linecount = 0;
        # $outputcount = 0;

        while (<INFILE1>) {
            my $line = $_;
            chomp($line);

            # -----A line starting with a '2' is a header and is left unchanged
            if ( $line !~ m/^2/ ) {
                foreach $line2 (@input2) {
                    $date          = substr( $line, 6,  6 );
                    $number_dialed = substr( $line, 29, 10 );
                    $connect_time  = substr( $line, 54, 12 );

                    if (    index( $line2, $date ) != -1
                        and index( $line2, $number_dialed ) != -1
                        and index( $line2, $connect_time ) != -1 )
                    {
                        # -----Generate the output string-----
                        $output_line =
                              substr( $line, 0, 6 )
                            . $date
                            . substr( $line, 12, 17 )
                            . $number_dialed
                            . substr( $line, 39, 15 )
                            . $connect_time
                            . substr( $line, 66, 144 ) . "\n";
                        print OUTPUT $output_line;

                        # -----Debug code. Add in if you are experiencing problems-----
                        # print STDOUT "Output " . ++$outputcount . "\n";

                        # -----If we have found the line, we want to exit the loop-----
                        last;
                    }
                }

                # -----Debug code. Add in if you are experiencing problems-----
                # print STDOUT "Line " . ++$linecount . "\n";
            }
            else {
                print OUTPUT $line . "\n";
            }
        }

        # -----Close all of the files-----
        close( INFILE1 );
        close( INFILE2 );
        close( OUTPUT );
    }
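The script above rescans every line of the second file for each record of the first, so its running time grows with the product of the two file sizes. One common alternative is to index the second file in a hash and test each record with a single lookup. The sketch below is only an illustration under a strong assumption that the post does not confirm: that records in the second file use the same fixed-width layout as the first, so the three fields sit at the same offsets. (The original `index` calls match the fields anywhere in the line, which is looser.) The helper names `record_key`, `matching_records`, and `make_record`, and the sample records, are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a lookup key from the three fields the
# original script compares (offsets copied from its substr calls).
sub record_key {
    my ($line) = @_;
    return if length($line) < 66;    # too short to hold all three fields
    return join '|',
        substr( $line, 6,  6 ),      # date
        substr( $line, 29, 10 ),     # number dialed
        substr( $line, 54, 12 );     # connect time
}

# Index the second file once, then test each first-file record with a
# single hash lookup instead of an inner loop.
sub matching_records {
    my ( $records1, $records2 ) = @_;

    my %seen;
    for my $line (@$records2) {
        my $key = record_key($line);
        $seen{$key} = 1 if defined $key;
    }

    my @out;
    for my $line (@$records1) {
        if ( $line =~ /^2/ ) {       # header lines pass through unchanged
            push @out, $line;
            next;
        }
        my $key = record_key($line);
        push @out, $line if defined $key && exists $seen{$key};
    }
    return @out;
}

# Made-up fixed-width records for demonstration only (not real data).
sub make_record {
    my ( $date, $number, $time ) = @_;
    my $line = '1' . ( ' ' x 69 );   # 70-character record, mostly blank
    substr( $line, 6,  6 )  = $date;
    substr( $line, 29, 10 ) = $number;
    substr( $line, 54, 12 ) = $time;
    return $line;
}

my @file1 = (
    '2 HEADER',
    make_record( '061221', '5551234567', '123045000000' ),
    make_record( '061222', '5559876543', '080000000000' ),
);
my @file2 = ( make_record( '061221', '5551234567', '123045000000' ) );

my @kept = matching_records( \@file1, \@file2 );
print scalar(@kept), " records kept\n";
```

With the sample data, the header and the one record that also appears in the second list are kept. Unlike the original, this sketch emits the whole matching line rather than rebuilding it from `substr` pieces, and it works on in-memory arrays; reading the real files with three-argument `open` and lexical filehandles would be a separate change.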
Replies are listed 'Best First'.

Re: File Handling for Duplicate Records
by shmem (Chancellor) on Dec 21, 2006 at 22:39 UTC

Re: File Handling for Duplicate Records
by Thelonius (Priest) on Dec 22, 2006 at 11:27 UTC
by sgt (Deacon) on Dec 22, 2006 at 15:31 UTC