Ok, I finally found some time to look back at this. After some testing, closer examination, and thinking about what's really happening, here are my thoughts and your fixed code.

   I tried your code, but the output file was empty.

Not sure what's happening there. I have now tested this on 2 systems and it seems to be working for me. Please note that I downloaded your sample data into a file named data.txt and hard-coded that into my code. If you didn't do that, then you might encounter some issues. Anyways, I think you can forget about that code. See below.

   ...but does not print up to the ADJ TO TOTALS part of the regex. Is there a reason why this part of the code would truncate the regex?

That forced me to actually try running your code (after, of course, cleaning up the alignment). Then, as I was trying to figure out why the "ADJ TO TOTALS" line was not printed, I realized that the "NAME" line should not have been printed either, but it was. Sooooo, I pulled out one of the best tools for debugging regexes: the print statement. I started tossing in print statements to see what the heck was in the variables and to figure out whether the problem was with the regex or with what was going into the regex. And the answer is... (insert drum roll) ...neither. I know. You're thinking "Huh? What? What did he say?". Follow along.

First, look at your code. Where is the only print statement printing to the output file? It's inside of the if ($zero == "0.00") statement. If you look at the "NAME" and "ADJ" lines, $zero is not getting "0.00", so neither line should be printing. So what is actually being sent to the output file? That would be @data. I first changed that to $data and presto! The "NAME" and "ADJ" lines were not printed. Then I realized that you had a line where you were trying to reinitialize @data. The problem was that you didn't do it in the right spot. You were only reinitializing it when the line had "1235114182", so the earlier "NAME" line was still sitting in @data and got printed along with it. Changing the print statement back to using @data and relocating @data=(); also worked.
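To see why the placement of the reset matters, here is a minimal sketch. It is not your actual code: the sample lines, the "KEY" marker, and the two sub names are made up for illustration. It only shows how a buffer that is cleared inside the if carries earlier lines into the output, while one that is cleared on every pass does not.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real input: only the "KEY" lines are wanted.
my @input = ("NAME header\n", "KEY 0.00\n", "other\n", "KEY 0.00\n");

# Reset the buffer only inside the match branch (the original placement).
sub reset_inside_if {
    my (@buffer, @out);
    for my $line (@_) {
        push @buffer, $line;
        if ($line =~ /KEY/) {
            push @out, @buffer;    # emits everything buffered so far
            @buffer = ();          # only cleared when the line matched
        }
    }
    return @out;
}

# Reset the buffer on every pass (the relocated placement).
sub reset_every_pass {
    my (@buffer, @out);
    for my $line (@_) {
        push @buffer, $line;
        if ($line =~ /KEY/) {
            push @out, @buffer;
        }
        @buffer = ();              # cleared whether or not the line matched
    }
    return @out;
}

print "reset inside if:\n",  reset_inside_if(@input);
print "reset every pass:\n", reset_every_pass(@input);
```

With the first placement the "NAME header" and "other" lines ride along with the next matching line. With the second, only the matching lines come out.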

In the end, there are two things that I did to find the issue. First, clean up the indenting so that I can quickly and easily understand what's inside of what brackets and braces. Second, debug with print statements. So why the long, convoluted response? To help illustrate the thought process that goes on in debugging. Sometimes that's more helpful than saying "here's the problem and here's the solution". In other words, I thought that walking you through the debug process would be more useful to you than just handing you the "steps" to "fix" your code.
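One small refinement to print-statement debugging, as a sketch only: wrap the value in visible delimiters and escape newlines and tabs, so stray whitespace can't hide. The show helper and the sample line are hypothetical and the substr offsets are chosen to fit that sample, not your data file.

```perl
use strict;
use warnings;

# Hypothetical helper: render a value with visible delimiters and escaped
# newlines/tabs so stray whitespace shows up in the debug output.
sub show {
    my ($label, $value) = @_;
    $value = '<undef>' unless defined $value;
    $value =~ s/\n/\\n/g;
    $value =~ s/\t/\\t/g;
    return "$label -- [$value]\n";
}

# Made-up sample line; the amount starts at offset 16 in this string.
my $line = "ADJ TO TOTALS:  0.00\n";
my $zero = substr $line, 16, 4;

# Send debug output to STDERR so it stays out of any redirected STDOUT.
print STDERR show('line', $line);
print STDERR show('zero', $zero);
```

Seeing the brackets around the value makes it obvious when a substr is off by a column or when a trailing newline is still attached.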

Two more quick points before I share the modified version of your code that I ran. First, I would agree with jaffy that your logic behind the looping and variable use is somewhat confusing, which made it difficult to understand what's going on and where the problem was. Second, if you really wanted the "NAME" and "ADJ" lines printed, then the real problem is that you've discovered that perl is extremely good at doing what you told it to do instead of what you wanted it to do, which I, speaking from personal experience, admit can be very frustrating. In other words, your corrected code told perl not to print those lines.

Ok, the cleaned up and modified code below is what I ran to debug your code. Try running it and take a look at all of the stuff that gets printed to the screen. You'll see how that was useful in telling me what was going on.

    use strict;
    use warnings;

    print "What file do you want parsed? ";
    my $file=<STDIN>;
    my @data;
    my $data;
    my $lines;

    open (TEST,"$file") or die$!;
    open OUTPUT, "> peptest.txt" or die$!;

    while (<TEST>) {
        if (/NAME /../ADJ TO TOTALS:/) {
            push @data, $_;
            foreach $data (@data) {
                print "data -- $data\n";  ## Added for debugging
                if ($data =~ /1235114182/) {
                    $lines.=$_;
                    my $zero = substr $lines, 118, 5;
                    print "zero -- $zero\n";
                    if ($zero == "0.00") {
                        #Version1 print OUTPUT "$data \n";
                        print OUTPUT "@data \n";
                        print " data sent to output file\n";  ## Added for debugging
                    }
                    else {print " data skipped\n"}  ## Added for debugging
                    $zero="";
                    $lines="";
                    $data="";
                    #Version1 @data=();
                }
                @data=();
                print "--------end of one iteration of foreach loop-----------\n\n";  ## Added for debugging
            }
        }
    }
    close TEST;
    close OUTPUT;

In reply to Re^3: Regex Not Grabbing Everything by dasgar
in thread Regex Not Grabbing Everything by JonDepp
