I find your code very hard to understand, and I would spend more time on the indenting style. Many Monks love the old-style indenting. I prefer the newer style for new code, although I will "go with the flow" for old code. You are writing new code, so I would go with the newer indenting style. Both are "correct", but whichever style you choose (old vs. new), apply it consistently.

Also, when you present code that doesn't do what you want, the more clearly you explain what it should be doing the better!

When using the two- or three-dot range operator (.. or ...), keep things simple: finish capturing the complete record, then process it. Don't try to "back up" in the middle of the if statement. Setting a flag that says "hey, this record is of interest" would be fine. My point is: do more complex things only if there is a performance reason. The first objective should be simplicity and clarity.
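As a minimal sketch of "capture the whole record first, then process it": the helper name collect_records() and the sample lines are made up for illustration, and the start/end patterns are passed in rather than fixed. It relies on the three-dot operator returning a sequence number in scalar context, with "E0" appended on the last line of the range.

```perl
use strict;
use warnings;

# collect_records() is a hypothetical helper, not part of the original code.
# It returns one array ref per record: every line from the one matching
# $start_re through the one matching $end_re, inclusive.
# Caveat: the range operator keeps its state between calls, so this sketch
# assumes the input never ends in the middle of a record.
sub collect_records {
    my ($start_re, $end_re, @lines) = @_;
    my (@records, @current);
    for my $line (@lines) {
        # Scalar context: false outside the range, a sequence number
        # inside it, and "E0" appended on the line that ends the range.
        my $seq = ($line =~ $start_re) ... ($line =~ $end_re);
        next unless $seq;
        push @current, $line;
        next unless $seq =~ /E0$/;     # record not finished yet
        push @records, [@current];     # complete record: hand it off
        @current = ();
    }
    return @records;
}

# Made-up sample input, only to show the call.
my @sample = (
    "page header\n",
    "NAME DOE, JOHN\n",
    "111 0.00\n",
    "ADJ TO TOTALS: NET 84.25\n",
    "page footer\n",
);
for my $record (collect_records(qr/^NAME /, qr/^ADJ TO TOTALS:/, @sample)) {
    print scalar(@$record), " lines, starting with: $record->[0]";
}
```

All the filtering can then happen on a finished record, with no need to look backwards while the range is still open.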

Usually some combination of regex and split is going to be more flexible, easier to write, and easier to understand and maintain than substr(). substr() will usually be the fastest, but that does not necessarily mean "best". I've got code with a solid 1-2 pages of substr(), but I needed it for the performance.
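To illustrate the trade-off, here is a rough sketch on one detail line from the sample report. The function names and hash keys are my own, taken loosely from the column headings, and the field positions are assumptions: they hold only if no column (MODS, for example) is ever blank.

```perl
use strict;
use warnings;

# split ' ' is the awk-style split: it skips leading whitespace, splits on
# runs of whitespace, and drops the trailing newline along with any
# trailing empty fields. Column widths do not matter.
sub fields_by_split {
    my ($line) = @_;
    my @f = split ' ', $line;
    return {
        rend_prov => $f[0],
        serv_date => $f[1],
        proc      => $f[4],
        prov_pd   => $f[-1],    # last field on the line
    };
}

# The substr() version depends on exact character offsets. Here it is
# assumed that the provider ID is 14 characters starting at column 0.
sub rend_prov_by_substr {
    my ($line) = @_;
    return substr($line, 0, 14);
}

my $line = "12351141821118 111809 23 001 71010 26 31.00 0.00 0.00 0.00 CO-18 31.00 0.00\n";
my $rec  = fields_by_split($line);
print "$rec->{rend_prov} $rec->{proc} $rec->{prov_pd}\n";
print rend_prov_by_substr($line), "\n";
```

If the report layout shifts by a column, the split version keeps working while every substr() offset has to be recounted.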

Below, I used a regex that looks for lines with some number at the beginning and 0.00 at the end. The number at the beginning could be required to be some huge number like what you have, although I didn't see the need. Adjust to your requirements. Note that \s matches any whitespace character (space, \t, \r, \n, \f), so the trailing \s* absorbs the line ending and there is no need to "chomp" the line.
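A small sketch of how that pattern behaves on unchomped lines. The function names are hypothetical. It also shows one thing to watch for when adjusting to your requirements: as written, the pattern accepts a final amount such as 10.00, because nothing forces whitespace in front of the last 0.00. The stricter variant is my own suggestion, not something the original code needs for this data.

```perl
use strict;
use warnings;

# The pattern from this post: digits at the start, 0.00 at the end,
# with any trailing whitespace (including "\n" or "\r\n") allowed.
my $zero_paid = qr/^\d+.*\s*0\.00\s*$/;

# Assumed stricter variant: require whitespace right before the final 0.00.
my $zero_paid_strict = qr/^\d+.*\s0\.00\s*$/;

sub is_zero_paid        { return $_[0] =~ $zero_paid        ? 1 : 0 }
sub is_zero_paid_strict { return $_[0] =~ $zero_paid_strict ? 1 : 0 }

for my $line (
    "12351141821118 111809 23 001 71010 26 31.00 0.00 0.00 0.00 CO-18 31.00 0.00\n",
    "12351141821118 111809 23 001 70450 26 142.00 44.70 0.00 8.94 OA-45 97.30 35.76\n",
    "NAME DOE, JOHN HIC 1111111111\n",
) {
    print is_zero_paid($line) ? "match:    " : "no match: ", $line;
}
```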

As another piece of unsolicited advice: try to write code that is "flat", meaning that fewer levels of indentation == better. Think about how to reformulate things when you reach the 4th level of indentation.
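One common way to flatten a loop is to reject uninteresting lines early with "next unless" instead of nesting if blocks. The two hypothetical functions below do the same counting; they are only an illustration of the idea, not a rewrite of your program.

```perl
use strict;
use warnings;

# Nested version: the real work sits three levels deep.
sub count_zero_paid_nested {
    my (@lines) = @_;
    my $count = 0;
    for my $line (@lines) {
        if ($line =~ /^\d/) {
            if ($line =~ /\s0\.00\s*$/) {
                $count++;
            }
        }
    }
    return $count;
}

# Flat version: each guard skips to the next line as soon as a test fails,
# so the work stays at one level of indentation inside the loop.
sub count_zero_paid_flat {
    my (@lines) = @_;
    my $count = 0;
    for my $line (@lines) {
        next unless $line =~ /^\d/;
        next unless $line =~ /\s0\.00\s*$/;
        $count++;
    }
    return $count;
}
```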

#!/usr/bin/perl -w
use strict;

my @data=();

while (<DATA>)
{
    if (my $flag_EOR = /NAME /.../ADJ TO TOTALS:/)
    {
        push (@data, $_);    #accumulates this record's data
        # add print "$flag_EOR\n"; to see what is happening...
        next unless $flag_EOR =~ /E0$/;
    }
    #print header/trailer and only the zero lines
    if (my @lines = grep{/^\d+.*\s*0\.00\s*$/}@data)
    {
        print $data[0];      # header of record
        print @lines;        # lines that start with numbers and
                             # end with 0.00
        print $data[-1];     # trailer of record
    }
    @data=();
}

=prints
I manually chopped lines down to prevent word wrap

NAME DOE, JOHN HIC 1111111111 ...blah...
12351141821118 111809 23 001 71010 ... 0.00 CO-18 31.00 0.00
12351141821118 111809 23 001 74150 ... 0.00 CO-18 199.00 0.00
12351141821118 111809 23 001 72192 ... 0.00 CO-18 182.00 0.00
ADJ TO TOTALS: PREV PD INTEREST 0.00 LATE FILING CHARGE 0.00 NET 84.25
=cut

__DATA__
REND PROV SERV DATE POS NOS PROC MODS BILLED ALLOWED DEDUCT COINS GRP/RC AMT PROV PD
________________________________________________________________________________________________________________________________
NAME DOE, JOHN HIC 1111111111 ACNT 1111111 ICN 1111111111111 ASG Y MOA MA01 MA18
12351141821118 111809 23 001 71010 26 31.00 0.00 0.00 0.00 CO-18 31.00 0.00
                                                                   N347
12351141821118 111809 23 001 70450 26 142.00 44.70 0.00 8.94 OA-45 97.30 35.76
                                                                   N265
                                                                   PR-2 8.94
12351141821118 111809 23 001 74150 26 199.00 0.00 0.00 0.00 CO-18 199.00 0.00
                                                                   N347
12351141821118 111809 23 001 72192 26 182.00 0.00 0.00 0.00 CO-18 182.00 0.00
                                                                   N347
12351141821118 111809 23 001 72131 26 195.00 60.61 0.00 12.12 OA-45 134.39 48.49
                                                                   N265
                                                                   PR-2 12.12
PT RESP 21.06 CLAIM TOTALS 749.00 105.31 0.00 21.06 643.69 84.25
ADJ TO TOTALS: PREV PD INTEREST 0.00 LATE FILING CHARGE 0.00 NET 84.25
CLAIM INFORMATION FORWARDED TO : XXXXXX XXXXXXXX INSURANCE CO

In reply to Re: Regex Not Grabbing Everything by Marshall
in thread Regex Not Grabbing Everything by JonDepp
