Hello Monks!
Perl newbie here, and I am faced with the following problem:
I have 2 files with lines like the following:
FILE1
ID1:6qq5_A|14~~6qq5_B|14~~6qq6_A|14~~6qq6_B|14~~6t6v_A|14
ID2:7d5p_A|14~~7d5q_A|14
FILE2
ID1:6qq5_A|15~~6qq5_B|15~~6qq6_A|14~~6qq6_B|15~~6t6v_A|14
ID2:7d5p_A|14~~7d5q_A|12
Basically, both files share common IDs, like ID1 and ID2, and next to each ID is a series of strings like 6t6v_A|14 separated by ~~. What I need to do is check, for each of the IDs (in this example ID1 and ID2), which of these smaller strings appear in both files. In my example, the desired output would be:
ID1:6t6v_A|14
ID1:6qq6_A|14
ID2:7d5p_A|14
because these are the only common strings for each of the two IDs. What I am planning to do is create a structure like an AoA and compare each of the elements, sequentially, to see which ones are the same. Is there maybe a faster way to do this task? Any pointers/help would be greatly appreciated!
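For reference, here is a minimal sketch of one alternative I am considering instead of the AoA: a hash of hashes keyed by ID and then by string, so each string from the second file becomes a single lookup rather than a sequential comparison. The sub names `parse_lines` and `common_items` are hypothetical, the sample lines are held in arrays instead of being read from real files, and I am assuming the ID never contains a colon.

```perl
use strict;
use warnings;

# Build a hash of hashes: $seen{ID}{string} = 1 for every string on each line.
# Assumes lines look like "ID:str~~str~~..." and the ID contains no colon.
sub parse_lines {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my ($id, $rest) = split /:/, $copy, 2;
        next unless defined $rest;
        $seen{$id}{$_} = 1 for split /~~/, $rest;
    }
    return \%seen;
}

# Return "ID:string" entries for strings listed under the same ID in both inputs.
# Results follow the order of the second input.
sub common_items {
    my ($first, $second) = @_;    # array refs holding the lines of each file
    my $lookup = parse_lines(@$first);
    my @common;
    for my $line (@$second) {
        chomp(my $copy = $line);
        my ($id, $rest) = split /:/, $copy, 2;
        next unless defined $rest && exists $lookup->{$id};
        for my $item (split /~~/, $rest) {
            push @common, "$id:$item" if $lookup->{$id}{$item};
        }
    }
    return @common;
}

# Sample data copied from FILE1 and FILE2 above.
my @file1 = (
    'ID1:6qq5_A|14~~6qq5_B|14~~6qq6_A|14~~6qq6_B|14~~6t6v_A|14',
    'ID2:7d5p_A|14~~7d5q_A|14',
);
my @file2 = (
    'ID1:6qq5_A|15~~6qq5_B|15~~6qq6_A|14~~6qq6_B|15~~6t6v_A|14',
    'ID2:7d5p_A|14~~7d5q_A|12',
);

print "$_\n" for common_items(\@file1, \@file2);
```

With the sample data this should print the three lines from my desired output, in FILE2 order. A string repeated on the same FILE2 line would be reported more than once, so a second "seen" hash would be needed if duplicates are possible.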