This, in my experience, is a common error for someone parsing a log for the first time, so don't feel bad. :-) I prefer to parse logs based on their predictable components. The wilder the potential format, the more complicated the code gets, but for a relatively simple format like the one you describe, I think it's fairly straightforward (assuming you have a basic understanding of regular expressions).
You have to craft your regular expression to match the data you are expecting. A technique I have become fond of is wrapping the match in an if statement, which has the added benefit of filtering out lines that don't match my preconceived format. I often write those lines out to another file for occasional review, to see whether the parsing routine needs to handle previously unknown formats or conditions. I won't do that in this example, to save space.
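To make the if-statement idea concrete, here is a minimal sketch of matching a single line. The helper name `parse_line` is made up for illustration, and the pattern is written with the `/x` modifier only so that each predictable component can carry a comment; it assumes the status is one word and both dates are in YYYY-MM-DD format.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): returns the four
# fields of a line that fits the format, or an empty list if it does not.
sub parse_line {
    my ($line) = @_;
    if ($line =~ /^ (\w+)                \s+   # status: one word, no whitespace
                    (.+)                 \s+   # company name: whatever sits before the dates
                    (\d{4}-\d{2}-\d{2})  \s+   # start date
                    (\d{4}-\d{2}-\d{2})  \s*   # end date, optional trailing whitespace
                  $/x) {
        return ($1, $2, $3, $4);
    }
    return;    # no match: the caller can skip or log the line
}

my @fields = parse_line('GOOD Acme Toy Company 2010-01-01 2011-12-31');
print join('|', @fields), "\n";    # prints GOOD|Acme Toy Company|2010-01-01|2011-12-31
```

Because the match sits inside an if, a line that does not fit simply falls through, which is what makes the filtering come for free.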
    C:\Steve\Dev\PerlMonks\P-2013-10-27@0838-Log-Parse>type test1.log
    GOOD Acme Toy Company 2010-01-01 2011-12-31
    BAD XYZZY 1972-01-01 1972-06-18
    UGLY Enron 2001-10-01 2011-09-11

    C:\Steve\Dev\PerlMonks\P-2013-10-27@0838-Log-Parse>parselog.pl test1.log

| Status | Company Name | Start Date | End Date |
|---|---|---|---|
| GOOD | Acme Toy Company | 2010-01-01 | 2011-12-31 |
| BAD | XYZZY | 1972-01-01 | 1972-06-18 |
| UGLY | Enron | 2001-10-01 | 2011-09-11 |

    #!/usr/bin/perl
    use strict;
    use warnings;

    # ---------------------------------------------------------------
    # Parse log with following format:
    #    Status Company Name Start Date End Date
    #
    # Assumptions:  Status contains no whitespace
    #               Dates are in YYYY-MM-DD format
    #               Company names have nothing that looks like a date
    # ---------------------------------------------------------------

    foreach my $inpfnm (@ARGV) {
        if (!open INPFIL, '<', $inpfnm) {
            print "ERROR: Cannot open input file '$inpfnm'\n";
        }
        else {
            print "<HTML>\n";
            print "<BODY>\n";
            print "<TABLE BORDER>\n";
            print "  <TR>\n";
            print "    <TH>Status</TH>\n";
            print "    <TH>Company Name</TH>\n";
            print "    <TH>Start Date</TH>\n";
            print "    <TH>End Date</TH>\n";
            print "  </TR>\n";
            while (my $inpbuf = <INPFIL>) {
                chomp $inpbuf;
                if ($inpbuf =~ /^(\w+)\s+(.+)\s+(\d{4}\-\d{2}\-\d{2})\s+(\d{4}\-\d{2}\-\d{2})\s*$/) {
                    my $inpsts = $1;
                    my $inpnam = $2;
                    my $stadat = $3;
                    my $enddat = $4;
                    print "  <TR>\n";
                    print "    <TD>$inpsts</TD>\n";
                    print "    <TD>$inpnam</TD>\n";
                    print "    <TD>$stadat</TD>\n";
                    print "    <TD>$enddat</TD>\n";
                    print "  </TR>\n";
                }
            }
            close INPFIL;
            print "</TABLE>\n";
            print "</BODY>\n";
            print "</HTML>\n";
        }
    }
    exit;
    __END__

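The script above silently skips lines that don't match. For anyone who wants the reject-file idea mentioned earlier, here is a rough sketch of how it might look. The function name `parse_with_rejects` and the `.rej` suffix are my own inventions for illustration, and I've used lexical filehandles here; the pattern is the same one as in the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: 'parse_with_rejects' and the '.rej' suffix are made-up names.
# Returns a reference to the matched rows and the number of rejected lines.
sub parse_with_rejects {
    my ($inpfnm) = @_;
    my $rejfnm = "$inpfnm.rej";    # assumed naming convention for the reject file

    open my $inpfil, '<', $inpfnm or die "Cannot open input file '$inpfnm': $!\n";
    open my $rejfil, '>', $rejfnm or die "Cannot open reject file '$rejfnm': $!\n";

    my @rows;
    my $rejcnt = 0;
    while (my $inpbuf = <$inpfil>) {
        chomp $inpbuf;
        if ($inpbuf =~ /^(\w+)\s+(.+)\s+(\d{4}-\d{2}-\d{2})\s+(\d{4}-\d{2}-\d{2})\s*$/) {
            push @rows, [$1, $2, $3, $4];    # status, company name, start date, end date
        }
        else {
            print {$rejfil} "$inpbuf\n";     # keep the odd line for later review
            $rejcnt++;
        }
    }
    close $inpfil;
    close $rejfil or die "Cannot close reject file '$rejfnm': $!\n";
    return (\@rows, $rejcnt);
}

foreach my $inpfnm (@ARGV) {
    my ($rows, $rejcnt) = parse_with_rejects($inpfnm);
    printf "%s: %d parsed, %d rejected\n", $inpfnm, scalar @$rows, $rejcnt;
}
```

Note that blank lines also end up in the reject file with this approach; whether that is noise or useful depends on the log.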
In reply to Re: Parsing Text from a File to HTML Table by marinersk, in thread Parsing Text from a File to HTML Table by anupchandu