Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need a little jumpstart here - kind of stuck at the moment. Maybe there are a couple modules you can suggest? I'm familiar with List::Compare and Array::Compare.

Problem:
File 1 format: 9-digit-number year string
File 2 format: 9-digit-number year

So, file 1 may look like:

123456789 1900 foobar
123456789 1901 fooba
123456788 2000 foob
(notice the possible repetition of the 9-digit number)

File 2 looks similar, minus the string.

What I need is to compare the 9 digit number+year combination between the two files by running through file 1 and comparing against file 2. If file 2 does not have a match for the line in file 1, spit out the line grabbed from file 1 (INCLUDING the string) into a separate file.

Maybe I'm making this too hard for myself, but so far I figure something like:

open all files
grab all lines from file 1 - split each line into 3 strings ($number, $year, $string)
concatenate $number.$year into one $searchvalue
make hash with keys of $searchvalue and values of $string
populate array1 with these $searchvalues
grab all lines from file 2 into array2
compare array1 to array2
where value from array1 doesn't find a match in file2, print hash value with the key of $searchvalue to file results

Is this close? Any tips or suggestions? Thanks!

Re: pseudocode needed - comparing multiple columns of two files
by citromatik (Curate) on Dec 12, 2007 at 12:48 UTC

    Well, maybe the solution is a bit simpler:

    open all files
    grab all lines from file 2 - split each line into 2 strings ($number, $year)
    concatenate $number.$year into one $searchvalue
    make hash with keys of $searchvalue and values=1
    for each line in file1:
        split each line into 3 strings ($number2, $year2, $string2)
        concatenate $number2.$year2 into one $searchvalue2
        look for that key in the hash
        if it is not there, spit the line

    Hope this helps

    citromatik
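
The hash-lookup outline above can be sketched in Perl roughly as follows. This is a minimal sketch, not a definitive implementation: the sub name unmatched_lines is hypothetical, the file 2 sample data is invented for illustration, and in-memory filehandles stand in for the real files. It keeps the lines of file 1 whose number+year pair does not appear in file 2, which is what the question asks for.

```perl
use strict;
use warnings;

# Return the lines of file 1 whose "number year" pair is absent from file 2.
# Both arguments are open filehandles; the sub name is hypothetical.
sub unmatched_lines {
    my ($fh1, $fh2) = @_;

    # Build a lookup hash keyed on "number year" from file 2.
    my %seen;
    while (my $line = <$fh2>) {
        my ($number, $year) = split ' ', $line;
        next unless defined $year;    # skip blank or short lines
        $seen{"$number $year"} = 1;
    }

    # Stream file 1 and keep every line whose key is missing from the hash.
    my @unmatched;
    while (my $line = <$fh1>) {
        chomp $line;
        my ($number, $year) = split ' ', $line;
        next unless defined $year;
        push @unmatched, $line unless $seen{"$number $year"};
    }
    return @unmatched;
}

# Sample data held in memory so the sketch runs without real files.
# The file 1 lines come from the question; the file 2 lines are made up.
my $file1 = "123456789 1900 foobar\n123456789 1901 fooba\n123456788 2000 foob\n";
my $file2 = "123456789 1900\n123456788 2000\n";

open my $fh1, '<', \$file1 or die "Cannot open file 1 data: $!";
open my $fh2, '<', \$file2 or die "Cannot open file 2 data: $!";

# Print the unmatched lines, string included.
print "$_\n" for unmatched_lines($fh1, $fh2);
```

For real files, the two in-memory opens would be replaced by opens on the actual paths, and the print would go to a third filehandle opened for writing.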
Re: pseudocode needed - comparing multiple columns of two files
by Erez (Priest) on Dec 12, 2007 at 13:25 UTC

    In a sense, you can follow each line and code it, with some changes:

    open all files - open one file, then the other.
    concatenate $number.$year into one $searchvalue - following your example, you should concatenate a space between the two values.
    populate array1 with these $searchvalues - that's already done, keys %hash is your array.
    grab all lines from file 2 into array2 - also redundant. If you iterate over the lines and compare each one as you go, you don't need to load them all into an array.
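
To illustrate the points about the separator and about keys %hash, here is a minimal sketch. The sub name make_key and the hash name %string_for are hypothetical. With fixed-width 9-digit numbers plain concatenation would not actually collide, but a separator keeps the key unambiguous if the format ever changes.

```perl
use strict;
use warnings;

# Build the lookup key for a number/year pair. The sub name is hypothetical.
sub make_key {
    my ($number, $year) = @_;
    return join ' ', $number, $year;
}

# Without a separator, different pairs can collapse into the same string.
print "collision without separator\n" if '12' . '345' eq '123' . '45';
print "no collision with separator\n"
    if make_key('12', '345') ne make_key('123', '45');

# Sample lines from the question, held in memory for the sketch.
my @file1_lines = (
    '123456789 1900 foobar',
    '123456789 1901 fooba',
    '123456788 2000 foob',
);

# Map each "number year" key to its string.
my %string_for;
for my $line (@file1_lines) {
    my ($number, $year, $string) = split ' ', $line, 3;
    $string_for{ make_key($number, $year) } = $string;
}

# keys %string_for already holds every search value, so no extra array is needed.
print "$_ => $string_for{$_}\n" for sort keys %string_for;
```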

    compare array1 to array2, where value from array1 doesn't find a match in file2, print hash value with the key of $searchvalue to file results

    This, I think, is the main loop. If you're comparing values from file1 against file2, I think you should first go over file2 and populate its content into an array, then go over file1 (with each line chomped) and print $_."\n" unless grep ($_ eq $number.' '.$year, @array);
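
The array-plus-grep loop described above might look like the sketch below. The sub name unmatched_by_grep is hypothetical, the lines are passed as array references rather than read from files, and the file 2 sample data is invented for illustration. Note that grep rescans the whole array for every line of file 1, so for large files a hash lookup is likely to scale better.

```perl
use strict;
use warnings;

# Return the file 1 lines whose "number year" pair is not found in file 2.
# Takes two array references of chomped lines; the sub name is hypothetical.
sub unmatched_by_grep {
    my ($file1_lines, $file2_lines) = @_;

    # Go over file 2 first and collect its "number year" pairs in an array.
    my @pairs;
    for my $line (@$file2_lines) {
        my ($number, $year) = split ' ', $line;
        next unless defined $year;    # skip blank or short lines
        push @pairs, "$number $year";
    }

    # Then go over file 1 and keep each line that grep cannot match.
    my @unmatched;
    for my $line (@$file1_lines) {
        my ($number, $year) = split ' ', $line;
        next unless defined $year;
        my $wanted = "$number $year";
        push @unmatched, $line unless grep { $_ eq $wanted } @pairs;
    }
    return @unmatched;
}

# File 1 lines come from the question; the file 2 lines are made up.
my @file1 = ('123456789 1900 foobar', '123456789 1901 fooba', '123456788 2000 foob');
my @file2 = ('123456789 1900', '123456788 2000');

print "$_\n" for unmatched_by_grep(\@file1, \@file2);
```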

    Software speaks in tongues of man; I debug, therefore I code.

Re: pseudocode needed - comparing multiple columns of two files
by apl (Monsignor) on Dec 12, 2007 at 12:45 UTC
    Why don't you write it and try?