I posted a question a couple of weeks ago and it was somewhat misunderstood, so I will try to state it more clearly this time. Thanks, Abigail-II and Kristofer, for your helpful responses.
I need to find the fastest way of searching one file against another. The first file is a series of lines, each containing one scalar (a short DNA string), e.g.
ATGGCTCGTGTCCA
ATGGCTCGATGGCTCGCCC
ETC...
The second file is a very large file of DNA sequences ("random text"). The script needs to take each line of the first file and search for it in the entire contents of the second file. Matches can occur anywhere in that file.
For example, take the first line, "ATGGCTCGTGTCCA". It could match inside a stretch of text that looks like AAAAAAAA"ATGGCTCGTGTCCA"AAAAAAAAAAA etc... (as you can see, every match will be embedded within the text of the second file). When a scalar matches two or more times, it is printed out to a file (this is the easy part, of course). I have found that regexes are slow, and so is loading the files into arrays. Someone suggested in passing that I load the files into hashes first. I am not savvy enough to know whether this will work before I embark on it in my novice way.
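To make the task concrete, here is a minimal sketch of the behaviour I am after, written in Python purely for illustration. The function names (`count_matches`, `repeated_queries`) and the sample data are made up for this example, it assumes the large file fits in memory as one string, and I am not claiming it is the fastest approach. It uses plain substring search instead of a regex and stops counting as soon as the second match is found.

```python
def count_matches(query, text, limit=2):
    """Count occurrences of query in text (overlaps allowed), stopping at limit."""
    count = 0
    pos = text.find(query)
    while pos != -1 and count < limit:
        count += 1
        # Resume one character past the last hit so overlapping matches count.
        pos = text.find(query, pos + 1)
    return count


def repeated_queries(queries, text, min_hits=2):
    """Return the queries that occur at least min_hits times in text."""
    return [
        q for q in queries
        if q and count_matches(q, text, min_hits) >= min_hits  # skip blank lines
    ]


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two files.
    queries = ["ATGGCTCGTGTCCA", "ATGGCTCGATGGCTCGCCC"]
    text = (
        "AAAA" + "ATGGCTCGTGTCCA" + "CCCC" + "ATGGCTCGTGTCCA"
        + "GGGG" + "ATGGCTCGATGGCTCGCCC"
    )
    for q in repeated_queries(queries, text):
        print(q)
```

With real data, `queries` would be the stripped lines of the first file and `text` the contents of the second file. Whether line breaks inside the second file should be removed before searching depends on how the sequences are stored, so I have left that step out.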
I hope this is clear, and thank you in advance for helping me.
Dr.J
In reply to Quickest method for matching by dr_jgbn