Re: Regex Not Grabbing Everything

by japhy (Canon)
on Sep 17, 2010 at 14:05 UTC ([id://860493])

in reply to Regex Not Grabbing Everything

Update: I need to take a closer look at your code.

Could you explain, a bit more abstractly, what it is you are attempting to do? You want the lines which have ' 0.00' at column 118, correct?
First, your bracing and indenting style leaves something to be desired. Here's how I'd write your code:
    while (<TEST>) {
      if (/NAME /../ADJ TO TOTALS:/) {
        push @data, $_;
        foreach my $data (@data) {
          if ($data =~ /1235114182/) {
            $lines .= $_;
            my $zero = substr $lines, 118, 5;  # <-- 5? or 4?
            # you had '==', you want 'eq'
            if ($zero eq "0.00") {
              print OUTPUT "@data \n";
            }
            @data = ();
            $zero = $lines = "";
          }
        }
      }
    }
You were using ==, which compares its operands as numbers, but you want eq, which compares them as strings, because you're looking specifically for the characters '0.00'. Either change your substr() to take 4 characters, or else look for ' 0.00' (with the leading space), I think.
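To make the difference concrete, here is a minimal sketch. The sample line is made up: it assumes 118 filler characters followed by a 5-wide field holding ' 0.00', so the offsets may not match your real file.

```perl
use strict;
use warnings;

# Hypothetical record: 118 filler characters, then a 5-wide amount field.
my $line = ('x' x 118) . ' 0.00' . " rest of line\n";

my $five = substr $line, 118, 5;   # 5 characters: includes the leading space
my $four = substr $line, 119, 4;   # 4 characters: starts one column later

# eq compares character by character, so the leading space matters.
my $five_eq_bare   = ($five eq '0.00')  ? 1 : 0;
my $five_eq_padded = ($five eq ' 0.00') ? 1 : 0;
my $four_eq_bare   = ($four eq '0.00')  ? 1 : 0;

# == converts both sides to numbers first, so the space is ignored.
my $five_num_zero  = ($five == 0) ? 1 : 0;

printf "five eq '0.00':  %d\n", $five_eq_bare;
printf "five eq ' 0.00': %d\n", $five_eq_padded;
printf "four eq '0.00':  %d\n", $four_eq_bare;
printf "five == 0:       %d\n", $five_num_zero;
```

With eq, the substring length and the literal have to agree exactly; with ==, any text that converts to zero would match.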

Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
Nos autem praedicamus Christum crucifixum (1 Cor. 1:23) - The Cross Reference (My Blog)

Re^2: Regex Not Grabbing Everything
by JonDepp (Novice) on Sep 17, 2010 at 14:41 UTC

    I made the changes you suggested. Changing the == to eq gave me back an empty output file and changing the double quotes to single quotes gives me the same output as before. I am extracting what I want from the file, but it seems to stop before the "ADJ TO TOTALS" part of the regex.

    There are 0.00 all over the file, so looking for just 0.00 will return almost everything from the original file. The 0.00 in that particular string at column 118 is where it is meaningful for me and if it is there I want the entire array from NAME to ADJ TO TOTALS. I can't see any reason why it would cut off the regex before the end.

    Thanks for your help!

      I also told you that the string '0.00' is only FOUR characters, and you were taking a substring of FIVE characters.

      Your code is written in a rather confusing manner. You've got code in loops that shouldn't be there. What you want to do is keep all the lines until you find one that matches your criteria, and then print the lines you've kept and all the lines following it. Here is a sample solution:
          # print all lines from 'START' to 'STOP'
          # if a line in between them has 'foobar' at position 10
          my $target = 'foobar';
          my $pos = 10;
          my (@buffer, $found);

          while (<FILE>) {
            if (/START/ .. /STOP/) {
              # if we have already found our target string, print this line
              if ($found) { print }
              # otherwise...
              else {
                # store this line in our buffer
                push @buffer, $_;
                # and if we find the target string at the right location
                # set $found to 1, and print the buffer
                if (substr($_, $pos, length($target)) eq $target) {
                  $found = 1;
                  print @buffer;
                }
              }
            }
          }
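      If the file holds several such blocks and each one should be printed in full only when it contains the target, one possible variation is to buffer the whole block and decide at its last line. This is only a sketch under that assumption: the @lines data is invented, and a real script would read from the filehandle instead. It relies on the range operator returning a sequence number in scalar context, with 'E0' appended on the final line of the range.

      ```perl
      use strict;
      use warnings;

      # Hypothetical input: two START..STOP blocks; only the second one
      # has 'foobar' at offset 10.
      my @lines = (
          "START one\n",
          "0123456789nothing\n",
          "STOP one\n",
          "noise\n",
          "START two\n",
          "0123456789foobar\n",
          "trailing line\n",
          "STOP two\n",
      );

      my $target = 'foobar';
      my $pos    = 10;
      my (@buffer, $found, @output);

      for (@lines) {
          # False outside the range; 1, 2, 3... inside it; 'NE0' on its last line.
          my $in_range = /START/ .. /STOP/;
          next unless $in_range;

          # Keep every line of the block, whether or not it matches.
          push @buffer, $_;

          # Guard the length so substr never starts past the end of a short line.
          $found = 1
              if length($_) >= $pos
              && substr($_, $pos, length $target) eq $target;

          # Only at the end of the block do we know whether to keep it.
          if ($in_range =~ /E0$/) {
              push @output, @buffer if $found;
              @buffer = ();
              $found  = 0;
          }
      }

      print @output;
      ```

      Because nothing is printed or cleared until the closing line has been seen, the block cannot be cut short at the line where the target was found.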


        There is a space for numbers that are five characters in the substring. I changed the 5 to 4 and got 0.0. I used your code and got an empty output file. The first if statement couldn't have $found in there, because the condition that makes it true is the substring test deeper in the code. My original code returns what I want in the order I need it, but stops before the end of the regex range. I commented out different parts of the code to see whether the array I was pushing $_ to was capturing everything, and it is. The code prints everything I need up until I check for the 0.00 condition. After that, it prints when the 0.00 condition is true, but truncates the array before the "ADJ TO TOTALS" line. Is there a reason why testing for that substring condition would truncate the array?