in reply to Common between two lists

A non-Perl solution:
intersection:
grep -w -f gene_list1 gene_list2
unique to the second list, e.g.:
grep -v -w -f gene_list1 gene_list2
NOTE: If the files are large, this may not work!
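
To see what whole-word matching changes, here is a small sketch with made-up data. The gene names and the scratch directory are hypothetical, and it assumes a grep that supports -w (GNU and BSD grep both do) and a system with mktemp -d:

```shell
# Hypothetical sample data in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
printf '%s\n' BRCA1 TP53 > gene_list1
printf '%s\n' TP53 BRCA10 EGFR > gene_list2

# Intersection: -w keeps BRCA1 from matching inside BRCA10.
common=$(grep -w -f gene_list1 gene_list2)

# Unique to the second list, with and without whole-word matching.
unique_w=$(grep -v -w -f gene_list1 gene_list2)
unique_plain=$(grep -v -f gene_list1 gene_list2)

echo "common: $common"
echo "unique (-w): $unique_w"
echo "unique (no -w): $unique_plain"
```

Without -w, BRCA10 is dropped from the "unique" result because BRCA1 matches inside it. Note that -w only checks word boundaries, so names containing punctuation may still match partially.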

Re^2: Common between two lists
by aaron_baugher (Curate) on Dec 09, 2011 at 16:35 UTC

    That may work in this case, assuming the genes are all the same length. But it would also match if a line from file1 appeared as a substring of a line in file2, so a more general non-Perl solution to "I want the lines two files have in common" is to use comm. My guess is that using comm on two sorted files is also less resource-intensive than asking grep to turn an entire file into search strings.

    sort file1 >file1.sorted
    sort file2 >file2.sorted
    comm -12 file1.sorted file2.sorted >common.lines
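
    As a rough illustration of the comm approach, here is a sketch on made-up data. The gene names and the scratch directory are hypothetical, and LC_ALL=C is my own addition so that sort and comm agree on ordering. It also shows comm -13, which prints the lines found only in the second file:

    ```shell
    # Hypothetical sample data in a scratch directory.
    work=$(mktemp -d)
    printf '%s\n' TP53 BRCA1 EGFR > "$work/file1"
    printf '%s\n' EGFR BRCA10 TP53 > "$work/file2"

    # comm expects sorted input; use the same locale for sort and comm.
    LC_ALL=C sort "$work/file1" > "$work/file1.sorted"
    LC_ALL=C sort "$work/file2" > "$work/file2.sorted"

    # -12 suppresses columns 1 and 2, leaving lines common to both files.
    common=$(LC_ALL=C comm -12 "$work/file1.sorted" "$work/file2.sorted")

    # -13 suppresses columns 1 and 3, leaving lines only in the second file.
    only2=$(LC_ALL=C comm -13 "$work/file1.sorted" "$work/file2.sorted")

    echo "common: $common"
    echo "only in file2: $only2"
    ```

    Because comm compares whole lines, BRCA1 and BRCA10 are treated as different entries, with no substring matching involved.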

    Aaron B.
    My Woefully Neglected Blog, where I occasionally mention Perl.