If there is that much duplication, and that many columns:
- write a short script to split the one file (containing the two columns) into two files (A1, A2) of one column each
- use your native sort utility on A1 and A2 to produce two files (B1, B2) containing only unique data
- use your native diff utility to do the comparison between B1 and B2
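As a rough sketch of the first step, the split script could look something like the following. The function name `split_columns`, the command-line usage, and the assumption that the two columns are separated by whitespace are all mine, not part of the original question; adjust the delimiter to match your file.

```python
import sys


def split_columns(src_path, a_path, b_path, delimiter=None):
    """Write column 1 of src_path to a_path and column 2 to b_path.

    delimiter=None splits on any run of whitespace (an assumption about
    the input format); pass "," or "\t" for delimited files.
    """
    with open(src_path) as src, \
            open(a_path, "w") as out_a, \
            open(b_path, "w") as out_b:
        for line in src:
            # Split into at most two fields: first column, rest of line.
            fields = line.rstrip("\n").split(delimiter, 1)
            if len(fields) < 2:
                continue  # skip blank or single-column lines
            out_a.write(fields[0].strip() + "\n")
            out_b.write(fields[1].strip() + "\n")


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python split_columns.py input.txt A1 A2
    split_columns(sys.argv[1], sys.argv[2], sys.argv[3])
```

On a Unix-like system the remaining two steps would then probably be `sort -u A1 > B1` and `sort -u A2 > B2`, followed by `diff B1 B2`. Other platforms have their own equivalents.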
Never recreate what others have already done for you.