in reply to Problem with a regex?

The ^L is how the FORM FEED control character (ASCII 0x0C, Ctrl-L) is displayed. It's used to separate pages ("records") of the report.

You can probably split on the FORM FEED character rather than on the text of the report header. Better yet, don't slurp the entire "large report" into memory, but instead process each report page one at a time by setting $/ ($INPUT_RECORD_SEPARATOR) to the FORM FEED character "\f".

#!/usr/bin/perl

use strict;
use warnings;
use autodie qw( open close );
use English qw( -no_match_vars );

# Report pages are separated by FORM FEED control characters
local $INPUT_RECORD_SEPARATOR = "\f";

open my $report, '<', 'QISC001';

while (my $page = <$report>) {
    # Parse and transform each report page here...
}

close $report;

exit 0;
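One detail worth knowing: each record read this way still ends with the FORM FEED itself, and chomp removes it because chomp strips whatever $/ is currently set to. Here is a minimal sketch of that, assuming a made-up three-page report held in a string ($report_text and its contents are invented for illustration) and read through an in-memory filehandle.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical three-page report (invented sample data, not the real QISC001)
my $report_text = "page one\n\fpage two\n\fpage three\n";

# Read the string through an in-memory filehandle
open my $report, '<', \$report_text
    or die "Cannot open in-memory report: $!";

my @pages;
{
    # Limit the separator change to this block
    local $/ = "\f";

    while (my $page = <$report>) {
        chomp $page;    # removes the trailing "\f", if there is one
        push @pages, $page;
    }
}

close $report;

printf "Read %d pages\n", scalar @pages;
```

For a small report, split /\f/, $report_text gives the same three pages, but it needs the whole report in memory first.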

Jim

UPDATE: You mentioned you're splitting the report into separate "stores." I presume this means you're carving the report into individual files, one per page. This script is untested, but it illustrates some general ideas you might find useful.

#!/usr/bin/perl

use strict;
use warnings;
use autodie qw( open close );
use English qw( -no_match_vars );

@ARGV == 1
    or die "Usage: perl $PROGRAM_NAME <report file>\n";

# Report pages are separated by FORM FEED control characters
local $INPUT_RECORD_SEPARATOR = "\f";

my $report_file = shift @ARGV;

open my $report_fh, '<', $report_file;

while (my $page = <$report_fh>) {
    my ($page_number, $store_number, $post_date) = $page =~ m{
        PAGE:\s+(\d+)
        .+?
        STORE:\s+(\d+)
        .+?
        POST\s+DATE:\s+(\d\d/\d\d/\d\d\d\d)
    }msx;

    # Skip any page that lacks the expected header fields
    if (!defined $post_date) {
        warn "Skipping a page without PAGE/STORE/POST DATE fields\n";
        next;
    }

    # For example, 07/14/2011 => 20110714
    $post_date =~ s{(\d\d)/(\d\d)/(\d\d\d\d)}{$3$1$2};

    # For example, 20110714-001-001.rpt
    my $page_file = sprintf "%s-%03d-%03d.rpt",
        $post_date, $store_number, $page_number;

    open my $page_fh, '>', $page_file;
    print {$page_fh} $page;
    close $page_fh;
}

close $report_fh;

exit 0;
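To check the header match and the file name without touching any files, you can run just those steps against a single page. This is only a sketch: the sample page below is invented, and all it assumes is that PAGE:, STORE: and POST DATE: appear in that order, which is what the pattern expects. Your real report layout may differ.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Invented sample page; the real report header may be laid out differently
my $page = <<'END_PAGE';
SAMPLE REPORT                             PAGE:    1
STORE:   1        POST DATE: 07/14/2011
END_PAGE

# Same pattern as in the script above
my ($page_number, $store_number, $post_date) = $page =~ m{
    PAGE:\s+(\d+)
    .+?
    STORE:\s+(\d+)
    .+?
    POST\s+DATE:\s+(\d\d/\d\d/\d\d\d\d)
}msx;

defined $post_date
    or die "Sample page did not match the header pattern\n";

# 07/14/2011 => 20110714
$post_date =~ s{(\d\d)/(\d\d)/(\d\d\d\d)}{$3$1$2};

my $page_file = sprintf "%s-%03d-%03d.rpt",
    $post_date, $store_number, $page_number;

print "$page_file\n";    # prints 20110714-001-001.rpt
```

Putting the date first as YYYYMMDD means the page files sort chronologically in a plain directory listing.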

Re^2: Problem with a regex?
by TStanley (Canon) on Jul 15, 2011 at 18:22 UTC
    This did the trick. Thanks for your help.

    TStanley
    --------
    People sleep peaceably in their beds at night only because rough men stand ready to do violence on their behalf. -- George Orwell