Dear all,

My quest is simple, but I am getting into dark and deep waters.

I have two files, each holding a list of gene sequences, one sequence per line. The first file contains whole sequences (a single sequence can have 16,000 characters, i.e. bases) and has about 600 lines. The second file contains short sequences of 25 bases (characters) each and has 6680 lines. I am trying to find matches of the short sequences anywhere inside the long sequences.

Here is the part of my code that is choking:

    while(<F1>) {   # this is the "Biiiiigggg Sequence"
        $targetseq = $_;
        print "$targetseq\n";
        chomp($targetseq);
        open(F2,"<$file2") or die "Error opening $file2: $!";
        while(<F2>) {   # probe sequences (25 base)
            substr($_,0, 25);
            $probe = $_;
            chomp($probe);
            if ($targetseq=~ /.*$probe.*/) {   # <-- this is Line 29
                $start = index($targetseq, $probe);
                push(@matchregion,$start);
            }
        }
        $indexes = @matchregion;

I get the following error:

Unmatched ( in regex; marked by <-- HERE in /.*AGCTCAAAACTCTCAAAGAGGAGG MMUS00S00000022 AJ242777 1427689_a_at "Mus musculus mRNA for ABINs, ( <-- HERE A20-binding inhibitor of NF-kappa B activation (small).".*/ at sequence_coverage.pl line 29, <F2> line 1073.

I am trying to match the second file string to any part of the first file string. I know I am doing something wrong. Any help will be humbly accepted. Thanks!
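For illustration, here is a minimal sketch of one way the matching could be done without a regex at all. It rests on two assumptions that the post does not confirm: that the probe bases are the first whitespace-delimited field on each line of the second file (the error message suggests annotation text follows them), and that a literal substring search is what is wanted. The helper names `probe_from_line` and `probe_starts` are hypothetical. Note that `substr($_,0, 25);` on its own discards its result, so `$probe` ends up holding the whole line, parentheses included, and that text is then interpolated into the pattern. `index` compares literal characters, so such text cannot break it; if a regex is preferred, `/\Q$probe\E/` quotes the metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the first whitespace-delimited field of a
# line. This is an assumption about the layout of the probe file.
sub probe_from_line {
    my ($line) = @_;
    chomp $line;
    my ($probe) = split ' ', $line;
    return $probe;
}

# Hypothetical helper: return the 0-based start of the first occurrence of
# each probe that appears in $target. index() does a literal search, so
# characters such as "(" in the data are harmless.
sub probe_starts {
    my ($target, @probes) = @_;
    my @starts;
    for my $probe (@probes) {
        my $pos = index($target, $probe);
        push @starts, $pos if $pos >= 0;   # index() returns -1 when absent
    }
    return @starts;
}

# Made-up sample data standing in for lines of the two files.
my @probes = map { probe_from_line($_) } (
    "GATTACA\tMMUS00S00000022 \"annotation with ( an unmatched paren\"\n",
    "TTTTTTT\n",
);
my @starts = probe_starts("CCGATTACAGG", @probes);
print "@starts\n";   # prints "2"
```

In a real script the probe list could be read from the second file once, before the loop over the first file, rather than reopening it for every long sequence.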


In reply to Opening files, comparing strings. Should be simple!? by PD
