in reply to Parsing a Simple Log File

If you work with records as units rather than with lines, things get a little easier:

use warnings;
use strict;

my @records;

local $/ = "\n\n";

while (<DATA>) {
    my ($title, $row) = /TEST\s+\w+\W+(\w+)\D+(\d+)/;
    next unless defined $row;
    push @records, [$title, $row];
}

print "$_->[0] -> $_->[1]\n" for @records;

__DATA__
data per sample

Prints:

CUSTOMERPHONE -> 1300
CUSTOMERORDER -> 0
CUSTOMERCARE -> 530

Note that @records is an array so that the records keep the order they have in the source. Switching back to a hash should be simple.
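
For anyone who wants the hash form, here is a minimal sketch. The sample records are invented to fit the regex (the real log layout is not reproduced in this post), parse_to_hash is a hypothetical helper name, and an in-memory filehandle stands in for DATA:

```perl
use strict;
use warnings;

# Invented sample input: the layout is an assumption chosen to fit the regex.
my $sample = <<'END';
TEST NAME: CUSTOMERPHONE
ROWS: 1300

TEST NAME: CUSTOMERORDER
ROWS: 0

TEST NAME: CUSTOMERCARE
ROWS: 530
END

# Hypothetical helper: returns a hash ref mapping title => row count.
sub parse_to_hash {
    my ($text) = @_;
    my %rows;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = "\n\n";    # read one blank-line separated record per iteration
    while (my $record = <$fh>) {
        my ($title, $row) = $record =~ /TEST\s+\w+\W+(\w+)\D+(\d+)/;
        next unless defined $row;
        $rows{$title} = $row;    # a repeated title overwrites the earlier value
    }
    close $fh;
    return \%rows;
}

my $rows = parse_to_hash($sample);
print "$_ -> $rows->{$_}\n" for sort keys %$rows;
```

A plain hash does not remember insertion order, so the keys are sorted for printing.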


Perl reduces RSI - it saves typing

Re^2: Parsing a Simple Log File
by bichonfrise74 (Vicar) on Nov 05, 2008 at 00:39 UTC
    Your script looks very elegant compared to mine. I didn't think of using $/ as the delimiter between records.
Re^2: Parsing a Simple Log File
by gone2015 (Deacon) on Nov 05, 2008 at 13:16 UTC

    Building on that base, I suggest...

    Unless you are absolutely sure that the input will always be in exactly the right form (now and in the future), it's a good idea to do something when the regex doesn't match -- at least a diagnostic message indicating that something isn't as expected. Along the lines of:

    if (my ($title, $row) = /TEST\s+\w+\W+(\w+)\D+(\d+)/) {
        push @records, [$title, $row];
    }
    else {
        warn "no match at $.";
    }
    Of course you could make the warning message more helpful, or do something else to flag the problem.
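
    One possible way to make the message more helpful is to include the record number and the start of the offending text. This is only a sketch: parse_record is a hypothetical helper, and both input records are invented for illustration. In the original loop the record number would simply be $. instead of a counter.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not from the thread): returns [title, row] on a
    # match; otherwise warns with the record number and an excerpt, and
    # returns nothing.
    sub parse_record {
        my ($record, $record_number) = @_;
        if (my ($title, $row) = $record =~ /TEST\s+\w+\W+(\w+)\D+(\d+)/) {
            return [$title, $row];
        }
        # Show at most 40 characters, with whitespace runs squeezed to one space.
        (my $excerpt = substr($record, 0, 40)) =~ s/\s+/ /g;
        warn "record $record_number did not match: '$excerpt'\n";
        return;
    }

    my @input = (
        "TEST NAME: CUSTOMERPHONE\nROWS: 1300",    # invented record that matches
        "this record is malformed",                # invented record that does not
    );

    my @records;
    my $n = 0;
    for my $record (@input) {
        my $parsed = parse_record($record, ++$n) or next;
        push @records, $parsed;
    }

    print "$_->[0] -> $_->[1]\n" for @records;
    ```

    The warning goes to STDERR, so the normal output stays clean while the bad record is still reported.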

    In passing, when I checked this fragment I noted that the regex is "widely drafted" (as the lawyers would say). For real (as opposed to example) code it's a good idea to tighten up the regex, so that it doesn't happily provide duff results from duff data.
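
    To illustrate what tightening might look like, here is a sketch that compares the regex from this thread with a stricter one. The record layout is an assumption made up for the example (the real layout is not shown here), and so are both sample records; a real tightened regex would have to be written against the actual log format.

    ```perl
    use strict;
    use warnings;

    # Assumed record layout (invented for illustration):
    #   TEST NAME: <TITLE>
    #   ROWS: <number>
    my $strict_re = qr/\A TEST [ ] NAME: [ ] ([A-Z]+) \n ROWS: [ ] (\d+) \s* \z/x;
    my $loose_re  = qr/TEST\s+\w+\W+(\w+)\D+(\d+)/;    # regex from the thread

    my $good = "TEST NAME: CUSTOMERPHONE\nROWS: 1300\n\n";    # invented, well formed
    my $duff = "TEST aborted -- see ticket 42\n\n";           # invented, not a record

    for my $record ($good, $duff) {
        my @loose  = $record =~ $loose_re;
        my @strict = $record =~ $strict_re;
        printf "loose: %-22s strict: %s\n",
            (@loose  ? "@loose"  : 'no match'),
            (@strict ? "@strict" : 'no match');
    }
    ```

    With these inputs the loose regex still captures a "title" and a "row" from the second record, while the anchored version rejects it, which is the kind of duff result the tightening is meant to prevent.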