Hello Monks!
Here is the scenario of the file that I have to work with. Every week, I get a copy of seven files, one file for each day of the week. What I need to do is combine the files into one file, keeping them in order (the oldest file first and the newest file last). I got this part working. The next piece I have to do is search for duplicate records, but with a twist.
For lack of a better term, the 'primary key' for the record is within characters 9-13 of the row. So if any information past the 13th character is updated, the record needs to be updated. But wait, it gets better. Say an update was made to a record on a Monday, then again on a Friday. When the files are combined, I need only the newest version kept, which would be the record inserted on Friday.
An example would be this:
This would be on line 10, so it would be from earlier in the week:
542642 19779 SAMMYs 17TH ST
and on line 1500 this would be listed:
542642 19779 SAMMYs Sesame ST
So what I would like is the SAMMYs at 17TH ST entry gone, keeping only the listing at Sesame ST. Also, the 19779 is what lets you know that it's the same store.
So here is where I’m at now. I searched through previous monk posts and found some really good stuff on finding and removing duplicate elements in an array.
http://www.perlmonks.org/?node_id=280484
Which got me to
http://perldoc.perl.org/perlfaq4.html#How-can-I-remove-duplicate-elements-from-a-list-or-array%3F
So here is what I did. I put the lines into an array, reversed the array (so that instead of oldest first, it was newest first), then did the duplicate search. Then I re-reversed the array, putting the oldest first again.
open(FILE, '<', 'file.txt') or die "can't open file: $!\n";
my @FileInfo = <FILE>;
close(FILE);

my @newFile = reverse(@FileInfo);   # newest records first
my %seen = ();
my @unique = grep { ! $seen{ $_ }++ } @newFile;
@newFile = reverse(@unique);        # back to oldest first

open(FILE, '>', 'file.txt') or die "can't write file: $!\n";
print FILE (@newFile);
close(FILE);
My problem is that I can't find a good example that does the hash/grep deduplication based on a 'primary key' instead of comparing the whole line.
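For what it's worth, here is a sketch of one way to do it: the only change from the perlfaq4 recipe is keying %seen on substr($_, 8, 5) (characters 9-13 in 1-based counting) instead of the whole line, so two records for the same store compare equal even when the address differs. The sample data and its column spacing are made up for illustration, so the substr offsets may need adjusting to the real file layout:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, padded so the store number lands in
# characters 9-13 (substr offset 8, length 5); real files may differ.
my @FileInfo = (
    "0542642 19779 SAMMYs 17TH ST",    # older record
    "0111111 20001 SOME OTHER STORE",
    "0542642 19779 SAMMYs Sesame ST",  # newer record, same key 19779
);

my %seen = ();

# Walk the list newest-first, keep the first occurrence of each key,
# then reverse again to restore the original oldest-first order.
my @unique = reverse grep { ! $seen{ substr($_, 8, 5) }++ } reverse @FileInfo;

print "$_\n" for @unique;
```

With that input, the 17TH ST record is dropped and the Sesame ST record survives, still in its original position relative to the other stores.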
I think I'm close, but really need the help and assistance from more well versed monks than I to make this truly work right.
thanks!
Dave