Hi, I'm new to Perl and would be interested in suggestions from experienced people on how to write optimized code for the following task.
I work with genomic data files larger than 1 GB, and I have 2 files with lines that look like this:
** A line from file1:
HWI-EA332_91026_1_1_7_586#TCTTAT/1 + Chr3 67121130 TATTNTAAGTCTATGTTGGGGGGGTGGTCATTGAATGTAAGNTGGGTCTC
** A matching line from file2:
HWI-EA332_91026_1_1_7_586#ACATAA/2 - Chr7 127074854 AAAATAAAGCTNATCTGGAAGCAACAGTANGAAGCAGAAGACTGNACACC
The id is the substring EA332_91026_1_1_7_586, which I extract with my @token = split('\-|\#', $line); (the id is then $token[1]).
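As a side note, the split pattern above also splits on the '-' in the strand column of file2, so a regex capture may be a more direct way to get the id. A minimal sketch, assuming the id always sits between the first '-' and the '#' (the helper name read_id is hypothetical, not from any library):

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the first '-' and the '#',
# or undef if the line does not match that shape.
sub read_id {
    my ($line) = @_;
    return $line =~ /^[^-]*-([^#]+)#/ ? $1 : undef;
}

# Sample line taken from file1 (columns assumed to be tab-separated).
my $line = "HWI-EA332_91026_1_1_7_586#TCTTAT/1\t+\tChr3\t67121130\tTATT";
print read_id($line), "\n";    # prints EA332_91026_1_1_7_586
```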
What I want to do is:
For each id in file1, check whether the same id exists in file2, and when there is a match:
* append the matching line from file1, followed by the matching line from file2, to a new file (matched_pair);
* split both lines on tabs and write a summary line containing the id, columns 2 and 3 of the file1 line, columns 2 and 3 of the file2 line, and the absolute value of the difference between column 4 in file1 and file2 (gap_dist).
Based on the example above, the expected line is:
EA332_91026_1_1_7_586 + Chr3 - Chr7 59953724
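The steps above could be sketched roughly as follows. This is only one possible approach, under some stated assumptions: ids are unique within file1, the columns are tab-separated, and the indexed file1 lines fit in memory (for files well over 1 GB that may not hold, in which case sorting both files by id and merging them would be an alternative). The names read_id and merge_pairs and the four file arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: id is the text between the first '-' and the '#'.
sub read_id {
    my ($line) = @_;
    return $line =~ /^[^-]*-([^#]+)#/ ? $1 : undef;
}

# Hypothetical helper: write matched pairs and one summary line per match.
sub merge_pairs {
    my ($file1, $file2, $pairs_out, $summary_out) = @_;

    # Pass 1: index file1 lines by id (assumes ids are unique in file1
    # and that the whole index fits in memory).
    my %first;
    open my $in1, '<', $file1 or die "Cannot open $file1: $!";
    while (my $line = <$in1>) {
        chomp $line;
        my $id = read_id($line);
        $first{$id} = $line if defined $id;
    }
    close $in1;

    # Pass 2: stream file2 and emit output for every id also seen in file1.
    open my $in2,     '<', $file2       or die "Cannot open $file2: $!";
    open my $pairs,   '>', $pairs_out   or die "Cannot open $pairs_out: $!";
    open my $summary, '>', $summary_out or die "Cannot open $summary_out: $!";
    while (my $line2 = <$in2>) {
        chomp $line2;
        my $id = read_id($line2);
        next unless defined $id && exists $first{$id};
        my $line1 = $first{$id};

        # matched_pair: the file1 line followed by the file2 line.
        print {$pairs} "$line1\n$line2\n";

        # Columns 2-4 are strand, chromosome and position.
        my (undef, $strand1, $chr1, $pos1) = split /\t/, $line1;
        my (undef, $strand2, $chr2, $pos2) = split /\t/, $line2;
        print {$summary} join("\t", $id, $strand1, $chr1, $strand2, $chr2,
                              abs($pos1 - $pos2)), "\n";
    }
    close $in2;
    close $pairs   or die "Cannot close $pairs_out: $!";
    close $summary or die "Cannot close $summary_out: $!";
    return;
}

# Usage: perl merge.pl file1 file2 matched_pair summary
merge_pairs(@ARGV) if @ARGV == 4;
```

A hash lookup per file2 line keeps the matching step at roughly one pass over each file, instead of rescanning file2 for every id in file1.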
Thanks in advance for your kind help.
Regards,
Ramzi
