Deicide has asked for the wisdom of the Perl Monks concerning the following question:
OK, I’m a beginner and need help with code and direction on how to handle this problem. I have two CSV files. In file1, the data I need is in column D, but it’s not the only data in that column; the cells I need are in the format “first last(G123456)”. In file2, the data I need is in column A, in the format “first.last(G786342)”. Ultimately, I need just the cells from file2 that contain an ID such as (G783212) that does not appear in file1. I can’t depend on a whole-cell comparison because the text in front of the ID number might differ. Either column might have lots of duplicates, so reducing each before the comparison might save time; there are a few thousand records, 3-4 thousand maybe. I’ve tried reducing each column to unique entries, then combining them and reducing again to remove all duplicates, including the originals, leaving only entries that have no duplicates, but I was hoping there was an easier way.
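For illustration only, here is a minimal sketch of one way to do the comparison described above: pull the parenthesised ID out of each cell with a regex, record file1's IDs in a hash, then keep the file2 cells whose ID is not in that hash. Everything named here is an assumption rather than something from the post: the sub names, the file names `file1.csv` and `file2.csv`, the zero-based column indexes (3 for column D, 0 for column A), the ID pattern of "G" followed by digits, and the use of the CPAN module Text::CSV for parsing.

```perl
use strict;
use warnings;

# Return every ID of the form "(G<digits>)" found in a cell, without the parentheses.
# The ID pattern is an assumption based on the examples in the question.
sub ids_in_cell {
    my ($cell) = @_;
    return () unless defined $cell;
    return $cell =~ /\((G\d+)\)/g;
}

# Given the cells of file1's column and file2's column (array refs), return the
# file2 cells whose ID never occurs in file1, keeping one cell per distinct ID.
sub cells_missing_from_file1 {
    my ($file1_cells, $file2_cells) = @_;

    # A hash lookup makes duplicates in file1 harmless.
    my %seen_in_file1;
    for my $cell (@$file1_cells) {
        $seen_in_file1{$_} = 1 for ids_in_cell($cell);
    }

    my (%reported, @missing);
    for my $cell (@$file2_cells) {
        # Assumes at most one ID per file2 cell; cells without an ID are skipped.
        my ($id) = ids_in_cell($cell) or next;
        next if $seen_in_file1{$id} || $reported{$id}++;
        push @missing, $cell;
    }
    return @missing;
}

# Read one column (zero-based index) of a CSV file into an array ref.
# Text::CSV is a CPAN module; it is loaded only when a file is actually read.
sub read_column {
    my ($path, $index) = @_;
    require Text::CSV;
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });
    open my $fh, '<:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    my @cells;
    while (my $row = $csv->getline($fh)) {
        push @cells, $row->[$index];
    }
    close $fh;
    return \@cells;
}

# Hypothetical usage (file names are placeholders):
# print "$_\n" for cells_missing_from_file1(
#     read_column('file1.csv', 3),    # column D
#     read_column('file2.csv', 0),    # column A
# );
```

Because both lookups go through hashes, there is no need to de-duplicate either column beforehand, and a few thousand rows should be processed quickly.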
Re: 2 files
by haukex (Archbishop) on Dec 05, 2018 at 22:42 UTC

Re: 2 files
by 1nickt (Canon) on Dec 06, 2018 at 00:27 UTC

Re: 2 files
by Laurent_R (Canon) on Dec 05, 2018 at 23:50 UTC