I need a contig-generating script that builds a contiguous sequence (contig) from a list of subsequences.
The input is a list of subsequences (strings) broken down from an original long sequence. The input can also contain sequences that do not belong to the final sequence, and more than one contig may be generated from the input data.
Each subsequence is between 200 and 500 characters long, and there can be up to 5000 input subsequences. A generated contig can be more than 1000 characters long.
Let us say we have two sequences:

my $str1 = "AATAGCAATTGACAAT";
my $str2 = "CAATCGGAACCAGCAT";

The goal is to match the right edge of $str1 (say, its last 4 characters) with the left edge of $str2 (say, its first 4 characters). The number of matched characters can vary, but the match must be exact and as long as possible; the maximum is the length of the shorter of the two strings. The minimum number of characters to match can be set on the command line. The matched string in the above example is CAAT.
By merging the two sequences at this overlap we get the concatenated contig AATAGCAATTGACAATCGGAACCAGCAT.
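To make the edge matching concrete, here is a minimal sketch of how the longest exact suffix/prefix overlap could be found and used to join the two strings. The names overlap_length and merge_pair and the minimum of 3 are my own assumptions for illustration, not part of any existing module.

```perl
use strict;
use warnings;

# Length of the longest suffix of $left that exactly equals a prefix of
# $right. Returns 0 when no overlap of at least $min characters exists.
# (overlap_length is a hypothetical helper name.)
sub overlap_length {
    my ($left, $right, $min) = @_;
    my $max = length($left) < length($right) ? length($left) : length($right);
    # Try the longest candidate first so the first hit is the maximum.
    for (my $n = $max; $n >= $min && $n > 0; $n--) {
        return $n if substr($left, -$n) eq substr($right, 0, $n);
    }
    return 0;
}

# Join two sequences, keeping the shared $n characters only once.
# (merge_pair is a hypothetical helper name.)
sub merge_pair {
    my ($left, $right, $n) = @_;
    return $left . substr($right, $n);
}

my $str1 = "AATAGCAATTGACAAT";
my $str2 = "CAATCGGAACCAGCAT";
my $min  = 3;    # assumed minimum overlap for this example

my $n = overlap_length($str1, $str2, $min);
print "overlap: ", substr($str2, 0, $n), "\n" if $n;
print "contig:  ", merge_pair($str1, $str2, $n), "\n" if $n;
```

For the two strings above this finds the 4-character overlap CAAT and prints the merged contig AATAGCAATTGACAATCGGAACCAGCAT. Comparing substrings from the longest candidate downwards is simple but can be slow on 500-character reads; it is meant only to show the idea.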
Similarly, the final contig needs to be generated by taking in more sequences from the input list of subsequences.
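One way to extend this to a whole list is a greedy merge: repeatedly join the pair with the largest overlap until no pair overlaps by at least the minimum. The sketch below assumes a --min option (my own choice of name, default 4) and uses a small made-up input list; the helper names are hypothetical. It ignores reverse complements and sequencing errors, and it rescans all pairs on every round, so it would need a smarter index for 5000 inputs.

```perl
use strict;
use warnings;
use Getopt::Long;

my $min = 4;    # assumed default minimum overlap
GetOptions('min=i' => \$min) or die "usage: $0 [--min N]\n";

# Longest exact suffix(left)/prefix(right) overlap of at least $min, else 0.
# Repeated here so that this block runs on its own.
sub overlap_length {
    my ($left, $right, $min) = @_;
    my $max = length($left) < length($right) ? length($left) : length($right);
    for (my $n = $max; $n >= $min && $n > 0; $n--) {
        return $n if substr($left, -$n) eq substr($right, 0, $n);
    }
    return 0;
}

# Greedy assembly: merge the best-overlapping pair until none is left.
# Whatever remains is returned, so unrelated sequences come back untouched
# and several contigs are possible.
sub assemble {
    my ($min, @seqs) = @_;
    while (1) {
        my ($best, $bi, $bj) = (0, -1, -1);
        for my $i (0 .. $#seqs) {
            for my $j (0 .. $#seqs) {
                next if $i == $j;
                my $n = overlap_length($seqs[$i], $seqs[$j], $min);
                ($best, $bi, $bj) = ($n, $i, $j) if $n > $best;
            }
        }
        last unless $best;    # no pair overlaps by $min or more
        my $merged = $seqs[$bi] . substr($seqs[$bj], $best);
        my @rest   = map { $seqs[$_] } grep { $_ != $bi && $_ != $bj } 0 .. $#seqs;
        @seqs = ($merged, @rest);
    }
    return @seqs;
}

# Made-up example input: three overlapping pieces and one unrelated string.
my @input = ("AGCATTTGGC", "GGGGGGGG", "AATAGCAATTGACAAT", "CAATCGGAACCAGCAT");
print "$_\n" for assemble($min, @input);
```

With the default minimum of 4 this prints AATAGCAATTGACAATCGGAACCAGCATTTGGC followed by the unrelated GGGGGGGG, which is left as its own "contig". Greedy merging is not guaranteed to find the best overall assembly; it is just the simplest strategy that matches the description above.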
Thanks for reading this.
braj

In reply to How do I Gererate Contigs out of a list of sequences? by Anonymous Monk
