in reply to Why is this program producing blank lines when parsing a file
No doubt someone knows of a module for recognizing dates in log files, but I think the problem is simple enough that coding this part from scratch is just as easy.

    my @logentries = ();
    my $toss = 1;   # true until the first entry is seen, or while skipping an entry
    my $monthRegex = join("|", qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/);
    while (<LOGDATA>) {
        # an entry starts with a month, a day and hh:mm:ss
        # (a trailing colon after the seconds, if your format has one, still matches)
        if (/^($monthRegex)\s+\d{1,2}\s+(\d{2}:){2}\d{2}/) {
            push(@logentries, $_);
            $toss = 0;
            # unless you want to ignore some of 'em,
            # in which case: $toss++;
        }
        else {
            # continuation line: append it to the most recent entry
            $logentries[$#logentries] .= $_ unless $toss;
        }
    }
    foreach (@logentries) {
        # decide what to do with each entry,
        # handling XML content with a suitable module when necessary
    }
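To see which lines the pattern treats as the start of an entry, here is a minimal check against a few made-up lines. The sample lines and the helper name `is_entry_start` are assumptions for illustration only; substitute lines from your own log.

```perl
use strict;
use warnings;

my $monthRegex = join("|", qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/);

# Hypothetical helper: returns 1 when a line begins with "Mon DD hh:mm:ss", else 0.
sub is_entry_start {
    my ($line) = @_;
    return $line =~ /^($monthRegex)\s+\d{1,2}\s+(\d{2}:){2}\d{2}/ ? 1 : 0;
}

# Made-up sample lines, not taken from any real log.
my @samples = (
    "Mar  7 12:34:56 host app: request received\n",
    "  <payload>continuation of the previous entry</payload>\n",
    "March 7 was a quiet day\n",
);

# Print the verdict (1 or 0) in front of each line.
printf "%d %s", is_entry_start($_), $_ for @samples;
```

Only the first sample should be flagged: the second starts with whitespace, and the third has no whitespace right after "Mar".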
If the log is really big and you don't want an array eating up that much memory, this alternative while loop would work:
    my $entry = "";
    my $result;
    while (<LOGDATA>) {
        if (/^($monthRegex)\s+\d{1,2}\s+(\d{2}:){2}\d{2}/) {
            # a new entry begins, so the previous one is complete
            $result = handleEntry($entry) if $entry;
            $entry = $_;
            # unless you want to ignore some of 'em,
            # in which case: $entry = "";
        }
        else {
            $entry .= $_ if $entry;
        }
    }
    # don't forget the last entry in the file
    $result = handleEntry($entry) if $entry;
    # and replace the "foreach" loop in the previous version
    # with "sub handleEntry { ... }"
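For anyone who wants to try the one-entry-at-a-time approach without a real log, here is a self-contained sketch. The log text is made up, it is read through an in-memory filehandle rather than LOGDATA, and the body of `handleEntry` (counting the lines in each entry) is just a stand-in for whatever processing you actually need.

```perl
use strict;
use warnings;

my $monthRegex = join("|", qw/Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec/);

# Made-up log text; the second entry spans three lines.
my $log = <<'END';
stray line before the first entry
Mar  7 12:34:56 host app: started
Mar  7 12:35:01 host app: received
  <msg>
  </msg>
Mar  7 12:35:09 host app: done
END

my @seen;   # collected only so the result is easy to inspect

# Stand-in handler (an assumption): records how many lines each entry has.
sub handleEntry {
    my ($entry) = @_;
    my $lines = () = $entry =~ /\n/g;   # count the newlines in the entry
    push @seen, $lines;
    return $lines;
}

# Read the string as if it were a file (in-memory filehandle).
open(my $fh, '<', \$log) or die "cannot open in-memory log: $!";
my $entry = "";
while (<$fh>) {
    if (/^($monthRegex)\s+\d{1,2}\s+(\d{2}:){2}\d{2}/) {
        handleEntry($entry) if $entry;   # previous entry is complete
        $entry = $_;
    }
    else {
        $entry .= $_ if $entry;          # ignored until an entry has started
    }
}
handleEntry($entry) if $entry;           # flush the final entry
close($fh);

print "lines per entry: @seen\n";
```

With this sample it prints "lines per entry: 1 3 1": the stray first line is dropped because no entry has started yet, and the second entry keeps its two continuation lines.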
update: fixed some commentary about setting $entry = "" inside the while loop
update: added the $toss and fixed commentary in the first example, and fixed the second example (again!) so that extra lines from a "tossed" entry get ignored properly.