Wow, that's a beast of a results file. At the moment, you are not using regular expressions at all. Without going through this massive text file in detail I can't give you the exact regexes, but what you need to do is write down (and maybe post here if you need more help) exactly what each block starts and ends with. This will help you identify a pattern. Say you have a file like this:

    miRNA1a - results
    SOME DATA HERE
    ####### results end #########
    miRNA2 - results
    ...
    and so on

Then you could do:

    while (<>) {
        chomp;
        if (/^(miRNA\w+) - results/) {
            my $mirna_id = $1;
            while (<>) {
                chomp;
                last if /^####### results end #########/;
                # PARSE YOUR DATA HERE
            }
        }
    }

There are of course other ways of doing this but this should work for you. You just need to identify the patterns that signal the beginning and end of a block.
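
To try the nested-loop idea without the real results file, here is a minimal, self-contained sketch. It assumes the made-up block format from my example above; the sub name `parse_blocks`, the `%lines_for` hash and the sample data are all hypothetical and only there for illustration. It reads from an in-memory filehandle instead of `<>` so it can be run as is:

```perl
use strict;
use warnings;

# Collect the data lines of each block, keyed by miRNA ID.
# Takes an open filehandle and returns a hash reference.
sub parse_blocks {
    my ($fh) = @_;
    my %lines_for;
    while ( my $line = <$fh> ) {
        chomp $line;
        # Ignore everything until a block header shows up.
        next unless $line =~ /^(miRNA\w+) - results/;
        my $mirna_id = $1;
        $lines_for{$mirna_id} = [];
        # Inner loop: consume lines until the end marker (or end of file).
        while ( defined( my $data = <$fh> ) ) {
            chomp $data;
            last if $data =~ /^####### results end #########/;
            push @{ $lines_for{$mirna_id} }, $data;
        }
    }
    return \%lines_for;
}

# Made-up sample input in the assumed format.
my $sample = <<'END_SAMPLE';
miRNA1a - results
SOME DATA HERE
MORE DATA HERE
####### results end #########
miRNA2 - results
OTHER DATA
####### results end #########
END_SAMPLE

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $blocks = parse_blocks($fh);
close $fh;

for my $id ( sort keys %$blocks ) {
    printf "%s: %d data line(s)\n", $id, scalar @{ $blocks->{$id} };
}
```

Once the real start and end patterns are known, only the two regexes should need to change.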

UPDATE

Actually, I think this may be even better for you: since you are probably parsing the same kind of data for different microRNAs, all you need to do is keep track of which microRNA you are currently reading data for. So, still using my example text file from above, I would do:

    my $current_mirna;
    while (<>) {
        chomp;
        $current_mirna = $1 if /^(miRNA\w+) - results/;
        die "No miRNA header seen before line $.\n" unless defined $current_mirna;
        # now you always have the miRNA ID of the current block
        # continue here to parse the actual data and insert into the
        # database for that specific miRNA
    }
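
A runnable sketch of this "remember the current ID" approach could look like the following. Again it assumes my made-up example format; the sub name `parse_by_current_id` and the sample data are hypothetical, and a plain hash stands in for the database, since I don't know what your inserts look like:

```perl
use strict;
use warnings;

# Group data lines under whichever miRNA header was seen most recently.
# Takes an open filehandle and returns a hash reference.
sub parse_by_current_id {
    my ($fh) = @_;
    my ( $current_mirna, %lines_for );
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^(miRNA\w+) - results/ ) {
            $current_mirna = $1;    # a new block starts here
            next;
        }
        # The end marker carries no data, so just skip it.
        next if $line =~ /^####### results end #########/;
        die "Data found before the first miRNA header (line $.)\n"
            unless defined $current_mirna;
        # This is where the real code would insert into the database.
        push @{ $lines_for{$current_mirna} }, $line;
    }
    return \%lines_for;
}

# Made-up sample input in the assumed format.
my $sample = <<'END_SAMPLE';
miRNA1a - results
SOME DATA HERE
####### results end #########
miRNA2 - results
OTHER DATA
MORE OTHER DATA
####### results end #########
END_SAMPLE

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $lines_for = parse_by_current_id($fh);
close $fh;

for my $id ( sort keys %$lines_for ) {
    print "$id: $_\n" for @{ $lines_for->{$id} };
}
```

Compared with the nested loops, there is only one read loop here, so the end marker is not strictly needed: the next header line is enough to switch to the next microRNA.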

In reply to Re: regular expression questions (from someone without experience) by tospo
in thread regular expression questions (from someone without experience) by gogoglou
