Basically there is nothing special in reading regexes from a file in contrast to using predefined ones.

It boils down to the question, how to represent the matches and the number of matches in an efficient way.

I created some files with simplified test input to concentrate on the problem: regexes.txt

ID1>>^a ID2>>h$ ID3>>b ID4>>[a-z]{9,10} ID5>>[ah]
lines.txt
id_A: abcdefg id_B: bcdefgh id_C: cdefghijk
Probably you will have to make changes to the "split"-Statements to match the format of your input.

I am storing the matches in a Hash that uses the regex-expressions as keys and array references of matches as values.

#!/usr/bin/perl -w use strict; use autodie; open(my $regexefile, "<", "regexes.txt"); my @regexes = <$regexefile>; chomp @regexes; my %regexes = map { split(/>>/, $_) } @regexes; my %matches; open(my $inputfile, "<", "lines.txt"); while (<$inputfile>) { while (my ($id, $regex) = each(%regexes)) { my (undef, $line) = split(/ /, $_); if ( $line =~ /$regex/) { if (! defined($matches{$regex})) { $matches{$regex} = []; } chomp $line; push($matches{$regex}, $line); } } } while (my ($regex, $matches) = each(%matches)) { if (!scalar @$matches) { next; } print "$regex: No of matches " . scalar @$matches . "\n"; foreach my $match (@$matches) { print "matched $match\n"; } }

Update: added autodie; warnings are already active because of -w.

In reply to Re: How do i use regexes in one file to match FASTA sequences in another file by lune
in thread How do i use regexes in one file to match FASTA sequences in another file by mind_frame

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.