in reply to Finding duplicate keys
If one file looks like this:

    key1
    key3
    key4
    key6
    ...

and the other file looks like this:

    key2
    key3
    key4
    key5
    ...

then you might try downloading this script that I posted here under "Code/Utilities". To find the keys that are common to both files, the command line is:

    cmpcol -i file1 file2

To find the union:

    cmpcol -u file1 file2

To find the keys that are unique to file1 (or file2):

    cmpcol -x1 file1 file2  # what's uniq to file1
    cmpcol -x2 file1 file2  # what's uniq to file2
    cmpcol -x file1 file2   # all uniq items, tagged by source -- i.e.:
    key1 <1
    key2 <2
    key5 <2
    key6 <1

It also handles multi-column data (use "file1:4" to use column 4 of file1 as the key field in that file), and supports perl regex field delimiters (e.g. -d '[\s:;,.]+' to split each line of each file at any combination of whitespace and/or punctuation).
It has more bells and whistles... I hope it helps.
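For anyone who wants to see what those operations amount to, here is a minimal Python sketch of the same idea. It is not the cmpcol source (that script is Perl), and the names `read_keys` and `compare`, the 1-based `column` argument, and the default whitespace delimiter are all made up for illustration.

```python
import re


def read_keys(lines, column=1, delimiter=r"\s+"):
    """Collect the key found in the given column (1-based) of each line.

    `delimiter` is a regex, loosely mirroring the idea of a regex field
    delimiter; this is an illustrative assumption, not cmpcol's code.
    """
    keys = set()
    for line in lines:
        fields = re.split(delimiter, line.strip())
        # Skip blank lines and lines that have too few columns.
        if len(fields) >= column and fields[column - 1]:
            keys.add(fields[column - 1])
    return keys


def compare(keys1, keys2):
    """Return the common keys, the union, and the keys unique to each side."""
    return {
        "common": sorted(keys1 & keys2),  # intersection
        "union": sorted(keys1 | keys2),   # union
        "only_1": sorted(keys1 - keys2),  # unique to the first file
        "only_2": sorted(keys2 - keys1),  # unique to the second file
    }


# Sample data matching the two example files above.
file1 = ["key1", "key3", "key4", "key6"]
file2 = ["key2", "key3", "key4", "key5"]

result = compare(read_keys(file1), read_keys(file2))
print(result["common"])  # ['key3', 'key4']
print(result["only_1"])  # ['key1', 'key6']
print(result["only_2"])  # ['key2', 'key5']
```

To run it on real files, pass open file handles instead of the lists, e.g. `read_keys(open("file1"), column=4)`. Sets drop duplicate keys within a file, which is fine for this kind of comparison but will not tell you how often a key repeats.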