in reply to comparing lists

Compared with file I/O, manipulating hashes is cheap. Something like this, perhaps:

use warnings;
use strict;

# Load each lookup list into a hash keyed by the line contents.
# The values are undef, so membership must be tested with exists.
open inFile, '<', "accts_to_exclude.txt" or die "Cannot open accts_to_exclude.txt";
my %excludeList = map {chomp; $_ => undef} <inFile>;
close inFile;

open inFile, '<', "list1.txt" or die "Cannot open list1.txt";
my %acctList = map {chomp; $_ => undef} <inFile>;
close inFile;

open inFile, '<', "list2.txt" or die "Cannot open list2.txt";
my %diffList = map {chomp; $_ => undef} <inFile>;
close inFile;

open inFile, '<', "list3.txt" or die "Cannot open list3.txt";
my %nameList = map {chomp; $_ => undef} <inFile>;
close inFile;

open orgAcctList, '<', "orig_accts.txt" or die "Cannot open orig_accts.txt";
open outFile, '>', 'output.txt' or die "Cannot open output file output.txt";

while (my $line = <orgAcctList>) {
    chomp $line;
    my ($filename, $state, $amt, $ttl, $account, $name, $invnum) = split /\t/, $line;

    next if exists $excludeList{$account};
    print outFile "$filename $state $amt $ttl $account $name $invnum\n";

    if (exists $acctList{$account}) {
        # do something (write this info to file1)
        next;
    }

    if (exists $diffList{$account}) {
        # do something (write this info to file2...)
        next;
    }

    if (exists $nameList{$name}) {
        # do something (write this info to file2)
        last;
    }
}

DWIM is Perl's answer to Gödel

Re^2: comparing lists
by Anonymous Monk on Feb 04, 2006 at 04:02 UTC
    Thank you both for your responses. Graff, your tool is indeed very helpful for a lot of what we do. For this problem, though, Grandfather's solution seems a bit more relevant, and as I am under some time pressure to complete this script, it seemed more logical because I can add it on to my existing code.
    I've made some slight changes to the code above, but after spending numerous hours debugging the script I still can't seem to get it to work properly. I may be missing some logic here, but when I run the script it seems to output everything into nc_load.txt.

    Here's the code:
    #!/usr/bin/perl -w

    $excludeaccts = "accts_to_exclude.txt";
    open(EXCLUDELIST, $excludeaccts) || die ("Cannot open $excludeaccts");
    #Load accounts-to-exclude into a hash table
    %exclusionlist = map {chomp; $_ => undef} <EXCLUDELIST>;
    close(EXCLUDELIST);

    $ncaccts = "nc_acct_list.txt";
    open(NCLIST, $ncaccts) || die ("Cannot open $ncaccts");
    %ncacctlist = map {chomp; $_ => undef} <NCLIST>;
    close(NCLIST);

    $ccaccts = "cc_acct_list.txt";
    open(CCLIST, $ccaccts) || die ("Cannot open $ccaccts");
    %ccacctlist = map {chomp; $_ => undef} <CCLIST>;
    close(CCLIST);

    $ccnames = "cc_name_list.txt";
    open(CCNALIST, $ccnames) || die ("Cannot open $ccnames");
    %ccnamelist = map {chomp; $_ => undef} <CCNALIST>;
    close(CCNALIST);

    $origaccts = "orig_accts_list.out";
    open(ORIGACCTLIST, $origaccts) || die ("Cannot open $origaccts");

    open($output, '>', 'output.txt' || die "Cannot open output file output.txt");
    open($ncoutput, '>','nc_load.txt' || "Cannot open output file nc_load.txt");
    open($ccoutput, '>', 'cc_load.txt' || "Cannot open output file cc_load.txt");

    while ($line = <ORIGACCTLIST>){
        chomp $line;
        ($filename, $state, $amt, $ttl, $account, $name, $invnum) = split /\t/, $line;

        next if exists $exclusionlist{$account};
        print $output "$filename\t$state\t$amt\t$ttl\t$account\t$name\t$invnum\n";

        if (defined $ncacctlist{$account}) {
            print $ncoutput "$filename\n";
            next;
        }

        if (defined $ccacctlist{$account}) {
            print $ccoutput "$filename\n";
            next;
        }

        if (defined $ccnamelist{$name}) {
            print $ccoutput "$filename\n";
            last;
        }

        print $ncoutput "$filename\n";
        ### (I added the above line so that anything that's not in
        ### %ncacctlist, %ccacctlist or %ccnamelist will be printed to
        ### the nc_load.txt file. If I run it without this line, nothing
        ### is printed to nc_load.txt or cc_load.txt.)
    }

    Any suggestions as to what I may be doing wrong here? Thanks.

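      I see you are using if (defined $xxx{$yyy}) where you probably intended if (exists $xxx{$yyy}). The hashes are built with $_ => undef, so every key exists but no value is defined, and the defined tests can never be true.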
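
      A minimal sketch of the difference, using made-up account numbers and a hypothetical helper called probe; the hash is built the same way as the lookup hashes in the script:

      ```perl
      use strict;
      use warnings;

      # Hypothetical account numbers; every value is undef, as in the script.
      my %ncacctlist = map { $_ => undef } qw(1001 1002);

      # Returns a two-element list: (key is present?, value is defined?)
      sub probe {
          my ($account) = @_;
          return ( exists $ncacctlist{$account}  ? 1 : 0,
                   defined $ncacctlist{$account} ? 1 : 0 );
      }

      # 1001 is in the list, 9999 is not.
      printf "%s: exists=%d defined=%d\n", $_, probe($_) for qw(1001 9999);
      ```

      Only exists distinguishes the listed account from the unlisted one; defined reports 0 for both.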

      In general it is useful to report the reason an open failed by including $! in the message:

      open(NCLIST, $ncaccts) || die ("Cannot open $ncaccts: $!");

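      The line open($output, '>', 'output.txt' || die "Cannot open output file output.txt"); has misplaced parentheses: the || die sits inside the call to open, so it tests the file name string rather than the result of open. It should be open($output, '>', 'output.txt') || die ("Cannot open output file output.txt: $!");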
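
      A small sketch of the effect, assuming a hypothetical path in a directory that does not exist so that the open is sure to fail. Each form is wrapped in eval so the script can report whether die was reached:

      ```perl
      use strict;
      use warnings;

      # Hypothetical path inside a directory that should not exist.
      my $path = '/no/such/dir/output.txt';

      # || is applied to the (always true) string, so die can never run
      # and the failed open goes unnoticed.
      my $silent = eval { open( my $fh, '>', $path || die "Cannot open $path" ); 1 };

      # With the call closed before the test, the failure is reported.
      my $loud = eval { open( my $fh, '>', $path ) || die "Cannot open $path: $!"; 1 };

      print "misplaced parenthesis: ", ( $silent ? "no error raised" : "died" ), "\n";
      print "corrected call:        ", ( $loud   ? "no error raised" : "died" ), "\n";
      ```

      The same trap exists without parentheses: open FH, '<', $file || die ... also binds || to the file name, which is why open FH, '<', $file or die ... is the usual spelling in that style.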

      You should always use the three-parameter open. It makes the open mode explicit and, in contexts where the file name is provided by the user, avoids malicious effects from a user putting > at the start of a file name.
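
      A hedged illustration of that last point. The scratch directory, the file name victim.txt and the hostile input are all made up for the demonstration, and only the safe three-parameter form is actually executed:

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempdir);

      # Scratch directory and file names are invented for this sketch.
      my $dir    = tempdir( CLEANUP => 1 );
      my $victim = "$dir/victim.txt";

      open( my $out, '>', $victim ) or die "Cannot create $victim: $!";
      print {$out} "important data\n";
      close($out) or die "Cannot close $victim: $!";

      # A hostile "file name" that starts with a mode character.
      my $user_input = "> $victim";

      # Three-parameter open: the mode is fixed to '<', so the whole string is
      # treated as a file name. No such file exists, so the open just fails.
      # (A two-parameter open($in, $user_input) would instead honour the
      # leading > and truncate the victim file.)
      my $opened = open( my $in, '<', $user_input ) ? 1 : 0;

      printf "three-arg open succeeded: %d, victim size: %d bytes\n",
          $opened, -s $victim;
      ```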


      DWIM is Perl's answer to Gödel
        Thanks a lot! That cleared up many things. I had to make one other slight modification: in the while loop I cannot use last. I want to use next in the final test as well, because I want to keep checking until the end of the file rather than break out of the loop as soon as the last condition is met.

        while ($line = <ORIGACCTLIST>){
            chomp $line;
            ($filename, $state, $amt, $ttl, $account, $name, $invnum) = split /\t/, $line;
            $name =~ s/\s+$//g;

            next if exists $exclusionlist{$account};
            print $output "$filename\t$state\t$amt\t$ttl\t$account\t$name\t$invnum\n";

            if (exists $ncacctlist{$account}) {
                print $ncoutput "$filename\n";
                next;
            }

            if (exists $ccacctlist{$account}) {
                print $ccoutput "$filename\n";
                next;
            }

            if (exists $ccnamelist{$name}) {
                print $ccoutput "$filename\n";
                next;
            }

            print $ncoutput "$filename\n";
            next;
        }