in reply to Filter out an input file with a given waiver file, and output to a specific file

This should get you moving in a useful direction. It demonstrates the use of a hash, in this case a Hash of Hashes (HoH), which you could research on your own.

CAVEAT: This is not production quality code. Its purpose is demonstration, not solution.

#!/usr/bin/perl
use strict;
use warnings;

# Set up a hash to receive the information
my %violationdata = ();

# Read the violations file into the hash
open my $violations, '<', 'violations.txt'
    or die "Cannot open violations.txt: $!";
while (my $vline = <$violations>) {
    my ($vkey, $ukey, $credit, $debit, $balance, $notes) = split /\s+/, $vline;
    $violationdata{$vkey}{UKEY}    = $ukey;
    $violationdata{$vkey}{CREDIT}  = $credit;
    $violationdata{$vkey}{DEBIT}   = $debit;
    $violationdata{$vkey}{BALANCE} = $balance;
    $violationdata{$vkey}{NOTES}   = $notes;
}
close $violations;

# Display the contents of the hash
foreach my $okey (sort keys %violationdata) {
    print " $okey:\n";
    foreach my $skey (sort keys %{$violationdata{$okey}}) {
        my $sval = $violationdata{$okey}{$skey};
        print " $skey: $sval\n";
    }
}

# Now process the waivers and update the hash
open my $waivers, '<', 'waivers.txt'
    or die "Cannot open waivers.txt: $!";
close $waivers;

# Now write out the hash to the updated violations file
open my $updated, '>', 'updated.txt'
    or die "Cannot open updated.txt: $!";
close $updated;

Results:

$ ./violwaiv.pl
 abcd123:
 BALANCE: -900.00
 CREDIT: 100.00
 DEBIT: 1000.00
 NOTES: (VIOLATED)
 UKEY: klmn123
 abcd124:
 BALANCE: -900.00
 CREDIT: 100.00
 DEBIT: 1000.00
 NOTES: (VIOLATED)
 UKEY: klmn124
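The script above leaves two steps empty: processing the waivers and writing the updated file. Here is one way they might be filled in, offered only as a sketch. It assumes the waivers file holds one violation key per line (as the first whitespace-separated field) and that a waived record should have its NOTES changed to "(WAIVED)". Neither assumption comes from the original question, and the helper names apply_waivers and write_records are made up for illustration. The demo uses in-memory filehandles so it runs without any files on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: mark every record whose key appears in the waivers.
# ASSUMPTION: the key is the first field on each waiver line, and
# "(WAIVED)" is the marker we want; neither is given in the original.
sub apply_waivers {
    my ($data, $waiver_fh) = @_;
    while (my $wline = <$waiver_fh>) {
        my ($wkey) = split ' ', $wline;
        next unless defined $wkey && exists $data->{$wkey};
        $data->{$wkey}{NOTES} = '(WAIVED)';
    }
    return $data;
}

# Hypothetical helper: write the records out, one per line, in the same
# field order the violations file was read in.
sub write_records {
    my ($data, $out_fh) = @_;
    for my $key (sort keys %$data) {
        my @fields = @{ $data->{$key} }{qw(UKEY CREDIT DEBIT BALANCE NOTES)};
        print {$out_fh} join(' ', $key, @fields), "\n";
    }
    return;
}

# Demonstration data, mirroring the hash built by the script above.
my %violationdata = (
    abcd123 => { UKEY => 'klmn123', CREDIT => '100.00', DEBIT => '1000.00',
                 BALANCE => '-900.00', NOTES => '(VIOLATED)' },
    abcd124 => { UKEY => 'klmn124', CREDIT => '100.00', DEBIT => '1000.00',
                 BALANCE => '-900.00', NOTES => '(VIOLATED)' },
);

# In-memory stand-in for waivers.txt
my $waiver_text = "abcd124\n";
open my $waivers, '<', \$waiver_text
    or die "Cannot open in-memory waivers: $!";
apply_waivers(\%violationdata, $waivers);
close $waivers;

# In-memory stand-in for updated.txt
my $output = '';
open my $updated, '>', \$output
    or die "Cannot open in-memory output: $!";
write_records(\%violationdata, $updated);
close $updated;

print $output;
```

To use real files, replace the two in-memory opens with opens on 'waivers.txt' and 'updated.txt'. If waived records should be dropped rather than relabelled, a delete $data->{$wkey} in apply_waivers would do that instead.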

Re^2: Filter out an input file with a given waiver file, and output to a specific file
by Laurent_R (Canon) on Jul 12, 2017 at 12:42 UTC
    Hi dbander

    I would tend to do it the other way around: store the waivers in a hash (a simple hash) where the key is the common identifier between the two files, then read the violations file line by line, changing the content of a line when its identifier appears in both files, and writing each line to the new file. This is often more efficient.

    But since we don't know the sizes of the violations and waivers files, it may actually be that your solution is better (especially if there are many more waivers than violations, but that seems rather unlikely).
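The waiver-first approach described above could look roughly like this. It is a sketch under stated assumptions, not a finished solution: the identifier is taken to be the first whitespace-separated field in both files, and "changing the content of the line" is taken to mean replacing "(VIOLATED)" with "(WAIVED)". The helper names load_waivers and filter_violations are hypothetical, and in-memory filehandles stand in for the real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a simple lookup hash of waived identifiers.
# ASSUMPTION: the identifier is the first field on each waiver line.
sub load_waivers {
    my ($waiver_fh) = @_;
    my %waived;
    while (my $line = <$waiver_fh>) {
        my ($key) = split ' ', $line;
        $waived{$key} = 1 if defined $key;
    }
    return \%waived;
}

# Hypothetical helper: stream the violations one line at a time, so only
# the waiver hash and the current line are held in memory.
sub filter_violations {
    my ($waived, $in_fh, $out_fh) = @_;
    while (my $line = <$in_fh>) {
        my ($key) = split ' ', $line;
        if (defined $key && $waived->{$key}) {
            # ASSUMPTION: this is how a waived line should be changed.
            $line =~ s/\(VIOLATED\)/(WAIVED)/;
        }
        print {$out_fh} $line;
    }
    return;
}

# In-memory stand-ins for waivers.txt, violations.txt and updated.txt
my $waiver_text    = "abcd124\n";
my $violation_text = "abcd123 klmn123 100.00 1000.00 -900.00 (VIOLATED)\n"
                   . "abcd124 klmn124 100.00 1000.00 -900.00 (VIOLATED)\n";
my $result = '';

open my $waivers, '<', \$waiver_text
    or die "Cannot open in-memory waivers: $!";
my $waived = load_waivers($waivers);
close $waivers;

open my $violations, '<', \$violation_text
    or die "Cannot open in-memory violations: $!";
open my $updated, '>', \$result
    or die "Cannot open in-memory output: $!";
filter_violations($waived, $violations, $updated);
close $violations;
close $updated;

print $result;
```

Memory use here grows with the number of waivers rather than the number of violations, which is the trade-off being weighed in this reply.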

      Hi Laurent_R

      Agreed, that would be an excellent production design consideration.