Of course this makes a great deal of difference.
If you have more data to search for than data to search through, don't use the method I suggested.
Given the narrow scope of your problem, there are probably a lot of optimizations you could make.
- How often will you need to search for the same 200,000 sequences in different input?
- How often will you need to perform a search on the same 27,000 larger sequences?
- Do you know the probability characteristics of your search? Can you expect many hits or very few?
- What about overlaps?
- Are any of your shorter search sequences present in any of your larger ones?
- Is character data the best representation? (As you only use four characters, you might want to look into using bit strings or something instead.)
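To make the last question concrete, here is a minimal sketch of a two-bits-per-character representation. It assumes the four characters are A, C, G and T (the names `CODES`, `pack` and `find_packed` are made up for illustration), and it is only one way you might do it, not a tuned implementation. Each character becomes two bits, so comparing a whole window against a search sequence is a single integer comparison.

```python
# Assumption: the alphabet is A, C, G, T. Adjust CODES to your real characters.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a sequence into one integer, two bits per character."""
    value = 0
    for ch in seq:
        value = (value << 2) | CODES[ch]
    return value


def find_packed(needle, haystack):
    """Return every start offset (overlaps included) of needle in haystack."""
    k = len(needle)
    if k == 0 or k > len(haystack):
        return []
    target = pack(needle)
    mask = (1 << (2 * k)) - 1  # keeps only the last k characters of the window
    hits = []
    window = 0
    for i, ch in enumerate(haystack):
        # Slide the window: shift in the new character, drop the oldest one.
        window = ((window << 2) | CODES[ch]) & mask
        if i >= k - 1 and window == target:
            hits.append(i - k + 1)
    return hits
```

For example, `find_packed("ACA", "ACACAT")` reports both overlapping matches. Whether this beats plain string matching depends on your data, so measure before committing to it.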
If you don't expect to be doing this kind of search often, you might be better off just brute forcing it than trying to optimize it too much.
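A brute-force pass can be as simple as the sketch below. The function name and the shape of the result are hypothetical, and it assumes both sets of sequences fit in memory as plain strings. It steps the search forward one position at a time so that overlapping hits are counted.

```python
def brute_force(short_seqs, long_seqs):
    """Map each short sequence to a list of (long_index, offset) hits."""
    hits = {}
    for needle in short_seqs:
        for idx, hay in enumerate(long_seqs):
            pos = hay.find(needle)
            while pos != -1:
                hits.setdefault(needle, []).append((idx, pos))
                # Resume one character later so overlapping matches are found.
                pos = hay.find(needle, pos + 1)
    return hits
```

Short sequences with no hits simply do not appear in the result, which keeps the output small if you expect very few matches.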
Good Luck!
-sauoq
"My two cents aren't worth a dime.";