Your exon and intron regex was limiting the results, if I understand your question correctly. I reworked the code to:
foreach (<INFO1>) {
    # got rid of the '*' which is too greedy
    # I made the date matches specific to the
    # example input you provided in your post
    # you may need to adjust for more options
    # in the matches depending on your data
    # consistency
    if (/^DATE\s+(\d{2})-(\w{3})-(\d{4})/) {
        print OUT "DBACC\t $no\n";
        print OUT "Date\t $1-$2-$3\n";
        $no++;
    # made one conditional that gets both
    # exon and intron. Used a [] (character
    # class match) instead of the \d*-
    # The + after it allows for 1 or more
    # of a 0-9 , ';' or '-'
    # (the '-' goes last in the class so it is
    # taken literally, not as a range)
    } elsif (/\s+\/(intr|ex)on="([\d;-]+)"\n/) {
        # added a split on ';' in case you want
        # or need to do something with each one
        # separated by a ';'
        my @values = split(/;/, $2);
        foreach (@values) {
            # needed to uppercase the matched prefix
            # based on your example output since
            # the match was on the lowercase prefix
            print OUT ucfirst($1) . "on\t \{Translation\%$_\}\n";
        }
        # if you don't need to do the split just do this
        # print OUT ucfirst($1) . "on\t \{Translation\%$2\}\n";
    } else {
        print OUT "line $counter\n";
    }
    $counter++;
}
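If you want to try the exon/intron match on its own before wiring it into the file loop, here is a minimal sketch. The sample lines and the helper name format_feature are made up for illustration (they are not from your data), so adjust the pattern if your real lines look different.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the real input format may differ.
my @lines = (
    qq{     /exon="1-20;30-45"\n},
    qq{     /intron="21-29"\n},
    qq{     /gene="abc"\n},
);

# Hypothetical helper: returns the formatted records for one input line,
# or an empty list when the line is not an exon/intron feature.
sub format_feature {
    my ($line) = @_;
    # The '-' sits last in the character class so it is a literal hyphen.
    return unless $line =~ /\s+\/(intr|ex)on="([\d;-]+)"/;
    # Copy the captures right away so later code cannot disturb them.
    my ($prefix, $ranges) = ($1, $2);
    return map { ucfirst($prefix) . "on\t {Translation%" . $_ . "}\n" }
           split /;/, $ranges;
}

my @out = map { format_feature($_) } @lines;
print @out;
```

With these sample lines it should print two Exon records (1-20 and 30-45) and one Intron record (21-29); the /gene line is skipped.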
There are several good nodes on regex in the tutorial section. See the gotcha one in particular.

In reply to Re: about regular expression by trs80
in thread about regular expression by agustina_s
