in reply to Extracting data from each line that matches a email address from a Log file (Tab delimited)
There are a number of ways you could go about this. The first, and least recommended, is to use a regex, something like this:
    use strict;
    use warnings;

    #throw away first two lines
    <DATA>; <DATA>;

    while (<DATA>) {
        chomp;
        next if ! defined $_ or ! length $_;
        my ($date, $time, $recip, $subject, $send) =
            /\s*([^\t]+)          #date
             \s+([^\t]+)          #time
             (?:\t[^\t]+){5}\s+   #Skip 5 fields
             (\S+)                #Recipient
             (?:\t[^\t]+){10}\s+  #Skip 10 fields
             ([^\t]*)\t           #subject
             ([^\t]*)             #Sender
            /x;
        print "$date, $time, $recip, $subject, $send\n";
    }
which generates
    2005-9-10, 0:0:16 GMT, Someoneg@aol.com, Fw: Hey Ugly line expansion and re-offer, EX:/O=org/OU=Site/CN=RECIPIENTS/CN=Auser
    2005-9-10, 0:0:16 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, -
    2005-9-10, 0:0:16 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, -
    2005-9-10, 0:0:16 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, -
    2005-9-10, 0:0:17 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, Auser@Domain.name
    2005-9-10, 0:0:17 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, Auser@Domain.name
However, using a regex to parse delimited data such as this can be very difficult, because quoted fields (which may themselves contain the delimiter) have to be handled correctly. It is generally better to use something like Text::CSV::Simple.
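As a rough illustration of the module approach, here is a minimal sketch. Note the assumptions: it uses the plain Text::CSV module from CPAN rather than Text::CSV::Simple, with the separator set to a tab; the column positions (0, 1, 7, 18, 19) are inferred from the regex above rather than from the real log layout; and `extract_fields` and the sample line are made up for the example.

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, assumed to be installed

# Column positions inferred from the regex earlier in this reply:
# date, time, five skipped fields, recipient, ten skipped fields,
# subject, sender. Adjust to the real log layout.
my @WANTED = (0, 1, 7, 18, 19);

my $csv = Text::CSV->new({ sep_char => "\t", binary => 1 })
    or die "Cannot create parser: " . Text::CSV->error_diag;

# Hypothetical helper: returns (date, time, recipient, subject, sender)
# for one log line, or an empty list if the line is blank, fails to
# parse, or has too few fields.
sub extract_fields {
    my ($line) = @_;
    return unless defined $line and length $line;
    return unless $csv->parse($line);
    my @fields = $csv->fields;
    return if @fields <= $WANTED[-1];
    return @fields[@WANTED];
}

# Made-up sample line with 20 tab-separated fields.
my $sample = join "\t", '2005-9-10', '0:0:16 GMT', ('x') x 5,
    'someone@example.com', ('y') x 10, 'Fw: hello', 'sender@example.com';

print join(', ', extract_fields($sample)), "\n";
```

In a real script you would chomp each line read from the log, skip the two header lines as before, and call `extract_fields` inside the loop. The module takes care of the quoting rules, so a quoted field containing a tab does not throw off the column count.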
Yet another option, if you are comfortable with databases, is to use DBI (for example with the DBD::CSV driver) to wrap your file and query it as if it were a database table.