I am trying to write a script that pulls a variable amount of data off an input file, depending on regex matches. I'm about 90% there, but I'm losing a small amount of the data because of how I navigate the file. Here is a sample of the input file:
m8 92234.30c 0.0071 92235.30c 0.9300 92238.30c 0.06289
     8016.30c 2.0 42000.30c 2.5
c
c BeO(2.86) Axial Reflector TD=3.01 / 95%=2.86
m9 4009.30c 0.5 8016.30c 0.5
mt9 beo.01t
c
c BeO(?AllgenCalc) Radial Reflectpr TD=3.01 / 95%=2.86
m10 4009.30c 0.5 8016.30c 0.5
mt10 beo.01t
c
c He/Xe(.0218) (72/28) ~.55 mol/L at 300K,1.38MPa, 39.6 g/mol
m11 2004.30c 0.7
     54124.30c 0.00027 54126.30c 0.00027 54128.30c 0.00576
     54129.30c 0.07932 54130.30c 0.01224 54131.30c 0.06354
     54132.30c 0.08067 54134.30c 0.03132 54136.30c 0.02661
c
c Sodium(0.929) RoomTemp = .97 g/cc, at melt = .929 g/cc
c Liquid = .929 - .000244*(t-371) (t in K) Handbook Ch&Ph
m12 11023.30c 1.0 $ Na (.929 g/cc) frozen/void
c
c Lithium(.515) RoomTemp = .534 g/cc, at melt = .515 g/cc
c Liquid = .515 - .000101*(t-454) (t in K) Handbook Ch&Ph
The script I wrote is meant to pull off every piece of data that looks like "####.30c" (such as 54132.30c or 11023.30c), plus data like "beo.01t" on the mt## cards, and nothing else. It needs to ditch comments, meaning lines beginning with "c" and anything after a "$". The script does all this currently, but the problem is that it skips data. For instance, all the mt cards get skipped. I believe the reason is how I constructed the until loop, with a "$line = <$FILE>" line both preceding it and inside it. I haven't been able to figure out how to get around that (I tried Tie::File to move around in the file, but quickly got lost). Any thoughts or suggestions? Here is the code I have now (which works as is, but skips some data). Thanks!
#!/usr/local/bin/perl
use strict;
use warnings;

print "Enter the filename to analyze (we can hardwire this later): ";
chomp ( my $filename = <STDIN> );
open my $FILE, '<', $filename or die "Can't read the source: $!";
open my $OUT, '>', "Space_Nukes_Rule_$filename" or die "Can't open output file: $!";

my $count=0;
my ($i, $j, $k, $popindex, $array, $arraytemp);
my (@array, @subarray, @arraytemp, @data);

while ( my $line = <$FILE> ) {
    if ( $line =~ /^m\d+/ ) {
        @arraytemp = ( split qr/\$/s, $line );
        #print "@arraytemp";
        @array = ( split qr/\s+/s, $arraytemp[0] );
        #print "@array\n";
        $array=@array;
        for ( $i=1; $i<$array; $i=$i+2) {
            push @data, "$array[$i]\n";
        }
        $line = <$FILE>;
        until ( $line =~ /^c/ or $line =~ /^mt?\d+/ ) {
            @arraytemp = ( split qr/\$/s, $line );
            @array = ( split qr/\s+/s, $arraytemp[0] );
            $array=@array;
            for ( $i=1; $i<$array; $i=$i+2) {
                push @data, "$array[$i]\n";
            }
            $line = <$FILE>;
        }
    }
}
print "@data\n";
Note that a card (m8 for example) sometimes needs multiple lines of data, and the continued lines have no continuation character; they are just typed on the following lines. Other times, as with m9, the data is all on one line. I believe the problem occurs when two m## or mt## lines appear with no comment card in between: then every other card is skipped.
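For what it's worth, two things in the posted loop seem to account for the missing cards. The inner until loop stops on a line it has already read (say mt9), and the outer while then reads the next line, so the line that ended the loop is never examined. Separately, /^m\d+/ cannot match "mt9", because a "t" follows the "m". One possible way around both is a single pass with a flag instead of a nested read. This is only a rough sketch under some assumptions: extract_ids is a made-up helper name, continuation lines are assumed to start with whitespace, comment cards are assumed to be a "c" followed by a blank or end of line, and identifiers are assumed to be name, dot, digits, one trailing letter.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the
# identifiers found on m##/mt## cards and their continuation lines.
sub extract_ids {
    my @lines = @_;
    my @ids;
    my $in_card = 0;    # true while inside an m## or mt## card

    for my $line (@lines) {
        # A comment card ends the current card and carries no data.
        if ( $line =~ /^c(?:\s|$)/i ) {
            $in_card = 0;
            next;
        }

        ( my $text = $line ) =~ s/\$.*//s;    # drop inline "$" comments

        if ( $text =~ /^mt?\d+/ ) {
            $in_card = 1;                     # a new m## or mt## card starts
        }
        elsif ( $text !~ /^\s/ ) {
            $in_card = 0;                     # some other card starts
        }
        next unless $in_card;

        # Assumption: identifiers look like "92235.30c" or "beo.01t",
        # so fractions such as "0.5" (no trailing letter) are rejected.
        push @ids, grep { /^\w+\.\d+[a-z]$/i } split ' ', $text;
    }
    return @ids;
}

my @sample = (
    "m9 4009.30c 0.5 8016.30c 0.5\n",
    "mt9 beo.01t\n",
    "c\n",
    "m12 11023.30c 1.0 \$ Na (.929 g/cc) frozen/void\n",
    "m11 2004.30c 0.7\n",
    "     54124.30c 0.00027 54126.30c 0.00027\n",
);
print "$_\n" for extract_ids(@sample);
```

With the real file, something like extract_ids(<$FILE>) should feed every line through the same logic. Because each line is read exactly once and classified on its own, no line is consumed just to decide that a loop should end.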

In reply to Skipping data on file read by igotlongestname
