To Perl Gurus,

I posted a question a couple of weeks ago and it was somewhat misunderstood, so I will try to clarify. Thanks to Abigail-II and Kristofer for your helpful responses.
I need to find the fastest way of searching one file against another. The first file is a series of lines, each containing a scalar (a short string), e.g.
ATGGCTCGTGTCCA
ATGGCTCGATGGCTCGCCC
ETC...

The second file is a very large file of DNA sequences ("random text"). The script needs to take each line of the first file and search for it in the entire contents of the other file. Matches will occur throughout the file.
e.g. Take the first line, "ATGGCTCGTGTCCA". It could match within a string of text that looks like AAAAAAAA"ATGGCTCGTGTCCA"AAAAAAAAAAA etc. (as you can see, all matches will be embedded within the text file). When there are two or more matches, the matched scalar will be printed out to a file (this is the easy part, of course). I have found that regexes are slow, as is loading the files into arrays. In passing, someone suggested that I load the files into hashes first. I am not savvy enough to know whether this will work before embarking on my novice ways.
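
To make the task concrete, here is a minimal sketch of the search described above. It assumes the large file fits in memory as a single scalar and uses index() instead of a regex. The sub names (count_matches, patterns_with_matches) and the sample data are hypothetical, and overlapping matches are counted, which may or may not be what is wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count occurrences of $pattern in the text, stopping once $limit is reached.
# Uses index() rather than a regex; overlapping matches are counted.
sub count_matches {
    my ($text_ref, $pattern, $limit) = @_;
    return 0 if $pattern eq '';
    my ($count, $pos) = (0, 0);
    while (($pos = index($$text_ref, $pattern, $pos)) >= 0) {
        $count++;
        last if $count >= $limit;    # no need to scan the rest of the text
        $pos++;                      # step one character so overlaps are found
    }
    return $count;
}

# Return the patterns that occur at least $min times in the text.
sub patterns_with_matches {
    my ($text_ref, $patterns, $min) = @_;
    return grep { count_matches($text_ref, $_, $min) >= $min } @$patterns;
}

# Hypothetical sample data standing in for the two files.
my $dna = ('A' x 8) . 'ATGGCTCGTGTCCA' . ('A' x 11) . 'ATGGCTCGTGTCCA' . ('A' x 4);
my @patterns = ('ATGGCTCGTGTCCA', 'ATGGCTCGATGGCTCGCCC');

# Print every pattern that matches two or more times.
print "$_\n" for patterns_with_matches(\$dna, \@patterns, 2);
```

With real data, the DNA file would be read into $dna once and the pattern file read line by line (with chomp). Stopping at the second match means a pattern that occurs early and often costs very little; patterns that never match still force a full scan of the text.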

I hope this is clear, and thank you in advance for helping me.

Dr.J


In reply to Quickest method for matching by dr_jgbn
