This appears to work the way I interpret your description, though there is at least one ambiguity in it, so I may have dwiw'd the wrong way.

#! perl -slw
use strict;

my $re_resgen = qr[(.)(,(?:\w+),(?:\w+),ResGen)];

while (<DATA>) {
    s[$re_resgen][my $t=$1; $t.='"' unless $t eq '"'; $t.$2]e;
    print;
}

=pod output

c:\test>234040
001 GENE1="Rattus norvegicus serum and glucocorticoid-regulated kinase (sgk) mRNA, complete cds",NM_019232,333,ResGen,ATP binding|protein serine/threonine kinase|protein amino acid phosphorylation,,,,29517
002 GENE2="ESTs, Weakly similar to putative serine/threonine protein kinase MAK-V [M.musculus]",NM_144755,331,ResGen,,,,,246273
003 GENE3="Thiosulfate sulphurtransferase (rhodanese)",X56228,329,ResGen,mitochondrion|sulfate transport| thiosulfate sulfurtransferase,,,,25274
004 GENE4="Spleen tyrosine kinase",NM_012758,327,ResGen,ATP binding|protein tyrosine kinase|intracellular signaling cascade|protein amino acid phosphorylation,,,,25155
005 GENE5="Spleen kinase 24,NM_012758,,ResGen,ATP binding|protein tyrosine kinase|intracellular signaling cascade|protein amino acid phosphorylation,,,,25155

=cut

__DATA__
001 GENE1="Rattus norvegicus serum and glucocorticoid-regulated kinase (sgk) mRNA, complete cds,NM_019232,333,ResGen,ATP binding|protein serine/threonine kinase|protein amino acid phosphorylation,,,,29517
002 GENE2="ESTs, Weakly similar to putative serine/threonine protein kinase MAK-V [M.musculus]",NM_144755,331,ResGen,,,,,246273
003 GENE3="Thiosulfate sulphurtransferase (rhodanese)",X56228,329,ResGen,mitochondrion|sulfate transport| thiosulfate sulfurtransferase,,,,25274
004 GENE4="Spleen tyrosine kinase,NM_012758,327,ResGen,ATP binding|protein tyrosine kinase|intracellular signaling cascade|protein amino acid phosphorylation,,,,25155
005 GENE5="Spleen kinase 24,NM_012758,,ResGen,ATP binding|protein tyrosine kinase|intracellular signaling cascade|protein amino acid phosphorylation,,,,25155
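One candidate for the ambiguity (this is a guess; nothing above says which one is meant) is record 005. Its third field is empty, so the `\w+` in `$re_resgen` cannot match and the line comes out without a closing quote. If such records should be repaired too, a looser pattern along these lines might do. The names `$re_loose` and `close_quote` are made up for this sketch, and the sample lines are shortened copies of the data above.

```perl
#! perl
use strict;
use warnings;

# Hypothetical variant of $re_resgen: \w* instead of \w+, so empty
# fields before ",ResGen" (as in record 005) still match.
# Assumption: empty-field records should get the closing quote as well.
my $re_loose = qr[(.)(,\w*,\w*,ResGen)];

# Insert a closing quote before the ",<id>,<n>,ResGen" tail
# unless the preceding character already is a quote.
sub close_quote {
    my ($line) = @_;
    $line =~ s[$re_loose][ $1 . ( $1 eq '"' ? '' : '"' ) . $2 ]e;
    return $line;
}

print close_quote($_), "\n" for (
    '004 GENE4="Spleen tyrosine kinase,NM_012758,327,ResGen,ATP binding',
    '005 GENE5="Spleen kinase 24,NM_012758,,ResGen,ATP binding',
    '002 GENE2="ESTs, Weakly similar",NM_144755,331,ResGen,,,,,246273',
);
```

With this variant, 004 and 005 both gain the quote (after `kinase` and after `24` respectively), and 002 is left alone because it is already quoted. Whether that is what you want for 005 depends on how you resolve the ambiguity.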

Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.


In reply to Re: Need method to create Regular expression for known pattern in the middle of a line by BrowserUk
in thread Need method to create Regular expression for known pattern in the middle of a line by Ya
