Am I missing something when I read OP's spec, "I'd like to store everything starting from 'line=ULMNm' till before the next 'line=ULMNm' as one string", as meaning that the sample data should be divided into elements, each beginning with "line=" and ending at the first instance of two newlines?

Missing something or not, that's how I read it in writing this to satisfy my understanding of the spec:

    #!/usr/bin/perl
    use strict;
    use warnings;
    # 864768

    my @words = split /(line=)/, do { local $/="\n\n"; <DATA> };  # a variant of moritz' advice

    for my $words (@words) {
        chomp $words;
        if ($words eq "line=") {
            print $words;
        } else {
            print "$words \n -------\n";  # the dashes visually separate the output records
        }
    }
    exit;

    __DATA__
    line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3
    RMSD = 1.06 A
    MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632
    D 47 SER A 57 SER.?
    D 48 THR A 56 THR.?
    D 165 LYS A 33 LYS~?
    line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3
    RMSD = 1.15 A
    MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449
    A 169 TYR A 41 TYR~?
    A 172 HIS A 95 HIS^?
    A 267 ASP A 98 ASP~?
    line=ULMNm 3 4fdy_07 P-HYDROOXIDE user 1 3
    RMSD = 1.06 A
    MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632
    D 47 SER A 57 SER.?
    D 48 THR A 56 THR.?
    D 165 PQR A 33 PRQ~?
    line=ULMNm 3 5tmd_00 BAZ Blivitz user 1 3
    RMSD = 1.15 A
    MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449
    A 169 TYR A 41 TYR~?
    A 172 HIS A 95 HIS^?
    A 267 XYZ A 98 XYZ~?

and we see this, upon execution:

    F:\_wo\pl_test>perl 864768.pl

     -------
    line=ULMNm 3 1fdy_07 N-ACETYLNEURAMINATE LYASE user 1 3
    RMSD = 1.06 A
    MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632
    D 47 SER A 57 SER.?
    D 48 THR A 56 THR.?
    D 165 LYS A 33 LYS~?
     -------
    line=ULMNm 3 2tmd_00 TRIMETHYLAMINE DEHYDROGENASE user 1 3
    RMSD = 1.15 A
    MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449
    A 169 TYR A 41 TYR~?
    A 172 HIS A 95 HIS^?
    A 267 ASP A 98 ASP~?
     -------
    line=ULMNm 3 4fdy_07 P-HYDROOXIDE user 1 3
    RMSD = 1.06 A
    MATRIX: -0.3862 -0.2080 -0.8987 0.6457 0.6347 -0.4244 -0.6587 0.7442 0.1108 -16.917 -91.429 -35.632
    D 47 SER A 57 SER.?
    D 48 THR A 56 THR.?
    D 165 PQR A 33 PRQ~?
     -------
    line=ULMNm 3 5tmd_00 BAZ Blivitz user 1 3
    RMSD = 1.15 A
    MATRIX: 0.9011 -0.4313 0.0445 -0.1032 -0.3130 -0.9441 -0.4211 -0.8462 0.3266 52.913 23.262 25.449
    A 169 TYR A 41 TYR~?
    A 172 HIS A 95 HIS^?
    A 267 XYZ A 98 XYZ~?
     -------

    F:\_wo\pl_test>

Note the empty record that is the first output. Not good... hence, I'd welcome comments on my algorithm/code AND any comments rebutting my interpretation of the spec.
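One possible way around the empty first record, offered as a sketch rather than a definitive fix: per the split documentation, a positive-width match at the very start of the string produces an empty leading field, whereas a zero-width match there does not. Splitting on a lookahead, /(?=line=)/, should therefore keep "line=" attached to each record and leave no empty element. The sub name split_records and the shortened sample text below are made up for illustration; they are not OP's real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split on a zero-width lookahead so that each
# record keeps its leading "line=" and no empty first field appears.
sub split_records {
    my ($text) = @_;
    my @records = split /(?=line=)/, $text;
    s/\s+\z// for @records;    # trim trailing whitespace and newlines
    return @records;
}

# Shortened stand-in for the OP's data (an assumption, not the real file).
my $sample = "line=ULMNm 3 1fdy_07\nRMSD = 1.06 A\n\n"
           . "line=ULMNm 3 2tmd_00\nRMSD = 1.15 A\n";

print "$_\n-------\n" for split_records($sample);
```

Since the delimiter is no longer returned as a separate element, the if/else in the print loop becomes unnecessary.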

Belated addition, 2125 EDT (U.S., roughly 10 hours later): Re OP's question about storing the munged data in variables. Whilst working this out, I used Data::Dumper to try to ascertain why an earlier iteration didn't work... and after fixing my foolishness but before removing D::D from the code, observed that D::D's list of vars had "line=" (captured by the parentheses in the split pattern) in $VAR2, $VAR4, ... and the rest of each munged data section in $VAR3, $VAR5, ....
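To make that Data::Dumper observation concrete, here is a small sketch under stated assumptions: the input string is a hypothetical, shortened stand-in for OP's data, and the array name @records is my own. It dumps the list that split returns, then rejoins each "line=" with the text that follows it, so each record ends up as one string in an array.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical, shortened input; not the OP's real data.
my $text = "line=AAA 1\nfoo\nline=BBB 2\nbar\n";

# Because the pattern has a capture group, split returns the
# delimiter ("line=") as its own element between the fields.
my @parts = split /(line=)/, $text;
print Dumper(@parts);    # $VAR1 is the empty leading field

shift @parts;            # discard that empty leading field

# Rejoin each delimiter with the body that follows it.
my @records;
while (my ($delim, $body) = splice @parts, 0, 2) {
    chomp $body;
    push @records, $delim . $body;
}
print "$_\n-------\n" for @records;
```

The empty $VAR1 is the same element that showed up as the empty first record in the output above.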


In reply to Re: regex question: store multiple lines as a string by ww
in thread regex question: store multiple lines as a string by nurulnad
