I posted here once before with a question about regexes, and you were all fantastically helpful. I can now write (probably very ugly, but effective) scripts for getting the info that I need, so first off, thanks to you all. My problem now is that I want to grab multiple lines of data that occur repeatedly in the file. For example:
0AVERAGE COMPOSITION IN PINS. NUMBER DENSITIES IN 1.0E+24/CM3, WT% PER MASS INITIAL HEAVY ISOTOPES.
 ----------------------------
 FOR BA-ELEMENTS WITH EID>99100, WT% IS THE PERCENTAGE LEFT (FRACTION).
0 EID: Cm-243 ND : 8.7352E-08
0 0 3822 0 3278 0 0 3260 0 +++ 0 3242 0 +++ +++ 0 3157 0 0 0 +++ 0 3096 0 0 0 +++ +++ 0 3170 0 0 0 0 0 0 0 3772 3170 3096 3157 3242 3260 3278 3822* 0 0 0 0 0 0 0 0 0 0
1GE 12 Bundle VOID=0% >> PHOENUT /1.2.8 / << CORE MASTER 9 COMPOS CASE= 1 RP= 5 V= 2.9 CO= 0 B= 3307 2007-01-30 13.38.50 Page 668 Job0000
0AVERAGE COMPOSITION IN PINS. NUMBER DENSITIES IN 1.0E+24/CM3, WT% PER MASS INITIAL HEAVY ISOTOPES.
 ----------------------------
 FOR BA-ELEMENTS WITH EID>99100, WT% IS THE PERCENTAGE LEFT (FRACTION).
0 EID: Pu-238 ND : 7.0913E-06
1 1 3667 0 3283 0 0 3266 0 +++ 0 3250 0 +++ +++ 0 3192 0 0 0 +++ 0 3151 0 0 0 +++ +++ 0 3204 0 0 0 0 0 0 1 3630 3204 3151 3192 3250 3266 3283 3667* 1 1 0 0 0 0 0 0 1 1
1GE 12 Bundle VOID=0% >> PHOENUT /1.2.8 / << CORE MASTER 9 COMPOS CASE= 1 RP= 5 V= 2.9 CO= 0 B= 3307 2007-01-30 13.38.50 Page 669 Job0000
In this example I want to grab all the info about Pu-238; information for many other elements occurs before and after it. In addition, there are multiple statepoints throughout the file, and therefore multiple occurrences of Pu-238. I know that Pu-238 (or whatever isotope I want to search for) is a unique identifier; my problem is grabbing all the numerical data in the format it already has in the file. I started some code, shown below, but it is definitely not complete, since I wasn't sure of the best way to grab multiple lines and then write them to an output file. Any suggestions? Thanks!
#!/usr/local/bin/perl -w
use IO::File;

my $file = IO::File->new;
print "Enter the output file you would like to analyze: ";
chomp ($filename = <STDIN>);
print "Enter the isotope you want to extract (ex: Am-241): ";
chomp ($iso= <STDIN>);
$file->open("< $filename") or die("Can't read the source:$!");
open(OUT, ">Comp_$filename");
select (OUT);
@iso=();
until ($file->eof) {
    my $line = $file->getline();
    if($line =~ /"$iso"/) {
        $line = $file->getline();
        chomp($line);
        @col1 = split(qr/\s+/s, $line);
        push(@iso,"$col1[1] $col[2] $col[3]");
        $line = $file->getline();
        chomp($line);
        @col1 = split(qr/\s+/s, $line);
        push(@iso,"$col1[1]");
        #I INTENDED TO DO THIS SAME PROCESS OVER AND OVER UNTIL THE FINAL LINE WAS PROCESSED, THEN LET THE REGEX SEARCH FOR THE NEXT INSTANCE OF WHATEVER IS DESIRED
    }
} # end of until
#for($i=1; $i<=28; $i++){
#    print "UNSURE WHAT THE BEST WAY TO PRINT IN ORDER IS";
#}
close(OUT);
Looking at how I'm approaching it, I feel there must be a better way to grab multiple lines and save them in the form they're already in for later access, but I'm unsure how to do this, or whether some other approach would work well. Also, does using a regex this way work (that is, interpolating a variable into it, as in =~ /"$iso"/)? Any help would be appreciated ... thanks!
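
For what it's worth, here is a minimal sketch of one possible approach, not a tested solution for the real file. On the regex question: a variable does interpolate inside /.../, but the double quotes in /"$iso"/ are matched literally, so that pattern would only match a line containing "Pu-238" with the quote marks around it. Writing /\Q$iso\E/ interpolates the variable and also escapes any regex metacharacters in it. For the multi-line part, the sketch collects every line from the matching EID line onward into one string until an end marker appears, so the block keeps the layout it has in the file. The sub name extract_blocks, the end-marker pattern (a page-header line starting with "1" plus a letter, or the next "EID:" line), and the line layout of the sample data are all assumptions made for illustration; the third sample record is invented to imitate a second statepoint.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every block for one isotope as a list of strings, each kept
# exactly as it appears in the input (spacing and line breaks untouched).
# ASSUMPTION: a block starts on the line containing "EID: <isotope>" and
# ends just before the next page header (a line starting with "1" and a
# letter, e.g. "1GE 12 Bundle") or the next "EID:" line.
sub extract_blocks {
    my ($fh, $iso) = @_;
    my $start_re = qr/EID:\s*\Q$iso\E\b/;   # \Q..\E escapes metacharacters in $iso
    my $end_re   = qr/^1[A-Za-z]|EID:/;     # assumed end-of-block markers
    my (@blocks, $current);
    while (my $line = <$fh>) {
        if (defined $current) {
            if ($line =~ $end_re) {
                push @blocks, $current;     # block finished
                undef $current;             # fall through: this line may start a new block
            }
            else {
                $current .= $line;          # still inside the block
                next;
            }
        }
        $current = $line if $line =~ $start_re;
    }
    push @blocks, $current if defined $current;   # block that runs to end of file
    return @blocks;
}

# Hypothetical, shortened sample data (layout assumed, last record invented).
my $sample = <<'END';
0 EID: Cm-243 ND : 8.7352E-08
 0 0 3822
 0 3278 0
1GE 12 Bundle VOID=0%
0 EID: Pu-238 ND : 7.0913E-06
 1 1 3667
 0 3283 0
1GE 12 Bundle VOID=0%
0 EID: Pu-238 ND : 7.1000E-06
 1 1 3700
END

# Read from an in-memory filehandle so the sketch runs without a real file.
open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
my @blocks = extract_blocks($in, 'Pu-238');
close $in;

print "--- statepoint ", $_ + 1, " ---\n", $blocks[$_] for 0 .. $#blocks;
```

To use this on a real file, the in-memory handle could be replaced with one opened on $filename, and the print could go to a handle opened for writing on "Comp_$filename" instead of STDOUT. Keeping each block as a single string avoids splitting and rejoining columns, which matters if the output has to look exactly like the input.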

In reply to REGEX on multiple lines by igotlongestname
