Hi Monks,

I think this is a unique situation. I tried it with a hash, but because hash keys are unique I don't think I can implement it that way. I then switched to an array of arrays, which I think will work. My code obviously does not work, but I want to show one way it could be implemented.
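On the hash concern: keys are unique, but the value under a key can be a reference to an array, so one gene name can still hold several ranges. A minimal sketch of that idea, assuming the genename-startbase_endbase layout described further down; the two sample IDs are taken from File1 in this post, and %ranges_for is just an illustrative name:

```perl
use strict;
use warnings;

# Two File1 entries that share the same gene name.
my @ids = ('CLS_S3_Contig2725-323_362', 'CLS_S3_Contig2725-455_480');

my %ranges_for;    # gene name => [ [start, end], [start, end], ... ]
for my $id (@ids) {
    # Skip anything that does not look like genename-start_end.
    my ($gene, $start, $end) = $id =~ /^(.+)-(\d+)_(\d+)$/ or next;
    push @{ $ranges_for{$gene} }, [ $start, $end ];
}

# Both ranges are kept under the single key.
for my $range (@{ $ranges_for{'CLS_S3_Contig2725'} }) {
    print join("\t", 'CLS_S3_Contig2725', @$range), "\n";
}
```

Whether this is preferable to a flat array of arrays depends on file size; the hash mainly saves comparing records whose gene names can never match.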

Here is the problem, with examples.

1- Read column one of File1

2- Split the string in that column into its components (genename-startbase_endbase)

3- We will then have genename, startbase, endbase.

4- Put them into an array of arrays

5- Read column one of File2

6- Do the same thing as 2-4

7- Go through the array of arrays for File1 and find the entries that also occur in File2, such that:

a. First the genenames match; if they do, then check whether

b. the start position of the matched genename in file2 falls between the start and end positions of the same genename in file1
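The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: parse_id and is_match are hypothetical helper names, the comparison is inclusive at both ends, and the direction of the test (file2 start inside the file1 range) is chosen because that is what the sample output and the posted elsif branch suggest. Swap the arguments to is_match if the opposite direction is meant. The IDs are a few of the sample entries from File1 and File2.

```perl
use strict;
use warnings;

# Steps 2-3: split "genename-startbase_endbase" into its three components.
# Returns an array ref [genename, start, end], or undef if the string does not fit.
sub parse_id {
    my ($id) = @_;
    return $id =~ /^(.+)-(\d+)_(\d+)$/ ? [ $1, $2, $3 ] : undef;
}

# Step 7: same gene name, and the file2 start lies within the file1 range
# (inclusive bounds are an assumption).
sub is_match {
    my ($f1, $f2) = @_;
    return $f1->[0] eq $f2->[0]
        && $f2->[1] >= $f1->[1]
        && $f2->[1] <= $f1->[2];
}

# Steps 1, 4, 5, 6: build one array of arrays per file.
my @file1 = grep { defined } map { parse_id($_) }
    qw(CLS_S3_Contig2721-139_168 CLS_S3_Contig2722-375_390 CLS_S3_Contig2730-361_384);
my @file2 = grep { defined } map { parse_id($_) }
    qw(CLS_S3_Contig2721-142_169 CLS_S3_Contig6525-509_514 CLS_S3_Contig2730-365_389);

# Compare every file1 record with every file2 record.
my @hits;
for my $f1 (@file1) {
    for my $f2 (@file2) {
        push @hits, join(' ', @$f1, '**', @$f2) if is_match($f1, $f2);
    }
}
print "$_\n" for @hits;
```

With these inputs the two lines printed are the Contig2721 and Contig2730 pairs. The nested loop is the simplest thing that works; for large files, grouping one file by gene name first would avoid most of the comparisons.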

File1:
CLS_S3_Contig2721-139_168
CLS_S3_Contig2722-375_390
CLS_S3_Contig2725-323_362
CLS_S3_Contig2725-455_480
CLS_S3_Contig2728-117_144
CLS_S3_Contig2728-437_472
CLS_S3_Contig2729-119_130
CLS_S3_Contig2729-163_220
CLS_S3_Contig2730-181_202
CLS_S3_Contig2730-361_384
CLS_S3_Contig2731-824_843
CLS_S3_Contig2731-1150_1201
CLS_S3_Contig2735-571_636
CLS_S3_Contig2735-677_710
CLS_S3_Contig2735-775_810
. . .

File2:
CLS_S3_Contig2721-142_169
CLS_S3_Contig6525-509_514
CLS_S3_Contig6525-493_502
CLS_S3_Contig6525-503_508
CLS_S3_Contig2977-365_376
CLS_S3_Contig2977-77_82
CLS_S3_Contig2977-83_90
CLS_S3_Contig4978-271_274
CLS_S3_Contig4978-385_388
CLS_S3_Contig2730-365_389
. . .

Output:
Genename(file1) start end ** Genename(file2) start end
CLS_S3_Contig2721 139 168 ** CLS_S3_Contig2721 142 169
CLS_S3_Contig2730 361 384 ** CLS_S3_Contig2730 365 389
. .

my (@file_map, @file_tg);
my ($lines_1, $lines_9) = (0, 0);

# INPUT1 and INPUT2 are assumed to be open already.
while (<INPUT1>) {
    chomp;
    my @id = split /\t/;
    # Column one looks like genename-startbase_endbase.
    if ($id[0] =~ /(.+?)-(\d+)_(\d+)/) {
        push @file_map, [$1, $2, $3];
    }
}
close(INPUT1);

while (<INPUT2>) {
    chomp;
    my @tg_id = split /\t/;
    if ($tg_id[0] =~ /(.+?)-(\d+)_(\d+)/) {
        push @file_tg, [$1, $2, $3];
    }
}
close(INPUT2);

# Compare each File1 record with each File2 record of the same gene.
for my $map (@file_map) {
    my ($two_geno_id, $from_map, $to_map) = @$map;
    for my $tg (@file_tg) {
        my ($tg_name, $from_tg, $to_tg) = @$tg;
        next unless $tg_name eq $two_geno_id;
        # $from_map_tg_range and $to_map_tg_range are never set in this code, so they are not printed.
        my $row = join("\t", $two_geno_id, $from_map, $to_map, "<-Mapside**TGside->", $two_geno_id, $from_tg, $to_tg) . "\n";
        if (($from_tg == $from_map) && ($to_tg == $to_map)) {
            print $row;
            $lines_1++;
        }
        elsif (($from_tg < $to_map) && ($from_tg > $from_map)) {
            print $row;
            $lines_9++;
        }
    }
}
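
The loops above read from handles that are opened elsewhere. One possible way to package the reading step is sketched below, with a lexical filehandle and a small hypothetical helper named read_ranges. To keep the example runnable on its own it reads from an in-memory string; the second column ("extra") and the "not a range" line are invented test data, not part of the real files.

```perl
use strict;
use warnings;

# Read column one of a tab-separated file and return an array of arrays:
# [ [genename, start, end], ... ]. Lines that do not fit are skipped.
sub read_ranges {
    my ($fh) = @_;
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        my ($col1) = split /\t/, $line;
        next unless defined $col1 && $col1 =~ /^(.+)-(\d+)_(\d+)$/;
        push @rows, [ $1, $2, $3 ];
    }
    return \@rows;
}

# In-memory stand-in for File1 (an assumption made for this demo only).
my $data = "CLS_S3_Contig2721-139_168\textra\nnot a range\nCLS_S3_Contig2730-361_384\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $rows = read_ranges($fh);
close $fh;

print scalar(@$rows), " rows\n";
```

For a real file, the open line would take a file name instead of a reference to a string, and the same helper could be called once per input file.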

In reply to Qurey through array of array by sesemin
