in reply to Extracting data from each line that matches a email address from a Log file (Tab delimited)

Text::CSV::Simple is a little too simple to handle large files without complicating things. Below is a version using Text::CSV_XS.

use strict;
use warnings 'all';
use Text::CSV_XS;

my $parser = Text::CSV_XS->new ({sep_char => "\t"});

while (<DATA>) {
    next if ! length $_;
    if (! $parser->parse ($_)) {
        warn "Error parsing: $_";
        next;
    }

    my @columns = $parser->fields();
    next if ! defined $columns[19];
    print "$columns[0], $columns[1], $columns[7], $columns[18], $columns[19]\n";
}
__DATA__
# Exchange System Attendant Version 6.5.7226.0
# Date	Time	client-ip	Client-hostname	Partner-Name	Server-hostname	server-IP	Recipient-Address	Event-ID	MSGID	Priority	Recipient-Report-Status	total-bytes	Number-Recipients	Origination-Time	Encryption	service-Version	Linked-MSGID	Message-Subject	Sender-Address
2005-9-10	0:0:16 GMT	-	-	-	storming	-	Someoneg@aol.com	1027	2433A69xxxxxxxxxxxxxxxx795006DB02DADF78@storming.Domain.name1	0	0	11927	1	2005-9-10 0:0:16 GMT	0	-	c=US;a= ;p=AMSCAN;l=storming-050910000016Z-212788	Fw: Hey Ugly line expansion and re-offer	EX:/O=org/OU=Site/CN=RECIPIENTS/CN=Auser	-
2005-9-10	0:0:16 GMT	-	-	-	storming	-	c1r3ai4g@aol.com	1019	2433A690xxxxxxxxxxxxxxxx5006DB02DADF78@storming.Domain.name1	0	0	11927	1	2005-9-10 0:0:16 GMT	0	-	-	Fw: Hey Ugly line expansion and re-offer	-	-
2005-9-10	0:0:16 GMT	-	-	-	storming	-	c1r3ai4g@aol.com	1025	2433A6xxxxxxxxxxxxxxxx95006DB02DADF78@storming.Domain.name1	0	0	11927	1	2005-9-10 0:0:16 GMT	0	-	-	Fw: Hey Ugly line expansion and re-offer	-	-
2005-9-10	0:0:16 GMT	-	-	-	storming	-	c1r3ai4g@aol.com	1024	2433A690Fxxxxxxxxxxxxxxxx6795006DB02DADF78@storming.Domain.name1	0	0	11927	1	2005-9-10 0:0:16 GMT	0	-	-	Fw: Hey Ugly line expansion and re-offer	-	-
2005-9-10	0:0:17 GMT	-	-	-	storming	-	c1r3ai4g@aol.com	1033	2433Axxxxxxxxxxxxxxxx428E5EE4C6795006DB02DADF78@storming.Domain.name1	0	0	11927	1	2005-9-10 0:0:16 GMT	0	-	-	Fw: Hey Ugly line expansion and re-offer	Auser@Domain.name	-
2005-9-10	0:0:17 GMT	-	-	-	storming	-	c1r3ai4g@aol.com	1020	2433A69xxxxxxxxxxxxxxxx95006DB02DADF78@storming.Domain.name1	0	0	11927	1	2005-9-10 0:0:16 GMT	0	-	-	Fw: Hey Ugly line expansion and re-offer	Auser@Domain.name	-

Prints:

# Date, Time, Recipient-Address, Message-Subject, Sender-Address
2005-9-10, 0:0:16 GMT, Someoneg@aol.com, Fw: Hey Ugly line expansion and re-offer, EX:/O=org/OU=Site/CN=RECIPIENTS/CN=Auser
2005-9-10, 0:0:16 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, -
2005-9-10, 0:0:16 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, -
2005-9-10, 0:0:16 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, -
2005-9-10, 0:0:17 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, Auser@Domain.name
2005-9-10, 0:0:17 GMT, c1r3ai4g@aol.com, Fw: Hey Ugly line expansion and re-offer, Auser@Domain.name

Update: s/CVS/CSV/g

Perl is Huffman encoded by design.

Re^2: Extracting data from each line that matches a email address from a Log file (Tab delimited)
by Motomo94 (Novice) on Nov 01, 2005 at 16:15 UTC
    Here is how I modified the script to do what I need.
    use strict;
    use warnings 'all';
    use Text::CSV_XS;

    open STDIN,"c:\\scripts\\20051030.log" or die $!;
    my $columns;
    open STDOUT, ">Answers.out" or die "can't redirect stdout";
    my $parser = Text::CSV_XS->new ({sep_char => "\t"});

    while (<STDIN>) {
        next if ! length $_;
        if (! $parser->parse ($_)) {
            warn "Error parsing: $_";
            next;
        }

        my @columns = $parser->fields();
        next if ! defined $columns[20];
        print "$columns[0], $columns[1], $columns[7], $columns[18], $columns [19]\n";
    }
    I get this in return:
    Use of uninitialized value in concatenation (.) or string at C:\scripts\Xsv.pl line 26, <STDIN> line 392.
    I still do not know where to put in what I am searching for, and I need to search two fields, "Recipient-Address" and "Sender-Address". If my search word "Auser" (case should not matter) is in either field, then give me the 5 columns.

      Add use Data::Dumper; and change your line
          next if ! defined $columns[20];
      to
          (print Dumper (\@columns)), next if ! defined $columns[20];
      to see what is actually in @columns.



      Are you setting the correct separator character in Text::CSV_XS->new ({sep_char => "\t"});?



      What does the debugger tell you is undefined?

      Alex / talexb / Toronto

      "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
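
      One possible answer to what is undefined, offered as a sketch rather than a confirmed diagnosis: the modified script declares a scalar with my $columns; and its print statement contains "$columns [19]" with a space before the bracket. Inside a double-quoted string, Perl only treats [19] as a subscript when it follows the variable name directly, so that form interpolates the (undefined) scalar $columns followed by the literal text " [19]". The stand-in row below is made up purely for illustration.

```perl
use strict;
use warnings;

my $columns;                        # the stray scalar from the modified script
my @columns = ('2005-9-10') x 21;   # made-up stand-in for one parsed row

# Collect warnings so they can be counted instead of printed to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $broken = "$columns [19]";   # undefined scalar $columns, then literal " [19]"
my $fixed  = "$columns[19]";    # element 19 of the array @columns

print "broken: '$broken'\n";
print "fixed:  '$fixed'\n";
print "warnings raised: ", scalar(@warnings), "\n";
```

      If this is what is happening, removing the space (and the my $columns; declaration, which is then no longer needed) should make the warning go away.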

      I think you want something like this:
      #!/usr/bin/perl -w
      use strict;
      use Text::CSV_XS;
      use IO::File;

      my $filename = 'hdi.csv';
      my $column_to_search = 1;
      my $wanted_value = 'Sweden';

      my $csv = Text::CSV_XS->new({binary=>1});
      my $fh = IO::File->new($filename) or die $!;
      while (my $cols = $csv->getline($fh)) {
          last unless @$cols;
          next unless defined $cols->[$column_to_search]
                  and $cols->[$column_to_search] eq $wanted_value;
          for (0,1,3) {
              $cols->[$_] = '' unless defined $cols->[$_];
          }
          print join(' ',$cols->[0],$cols->[1],$cols->[3]),"\n";
      }
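
      Adapted to the Exchange log, the same idea might look like the sketch below. It assumes the column positions from the log header quoted earlier in the thread (Recipient-Address is field 7, Sender-Address is field 19) and matches the search word case-insensitively in either field. The helper name wanted_columns and the two sample rows are hypothetical. To keep the example self-contained, a plain split on tabs stands in for Text::CSV_XS, which is only safe if no field is quoted or contains a tab; the helper itself takes an array reference, so it should work equally well with [$parser->fields()] or with the result of getline.

```perl
use strict;
use warnings;

# Column positions taken from the log header quoted earlier in the thread.
use constant { DATE => 0, TIME => 1, RECIPIENT => 7, SUBJECT => 18, SENDER => 19 };

# Hypothetical helper: returns the five wanted columns as one string if
# $word occurs (ignoring case) in either address field, otherwise undef.
sub wanted_columns {
    my ($cols, $word) = @_;
    return unless defined $cols->[SENDER];    # skip short lines such as the version banner
    return unless grep { /\Q$word\E/i } @{$cols}[RECIPIENT, SENDER];
    return join ', ', @{$cols}[DATE, TIME, RECIPIENT, SUBJECT, SENDER];
}

# Two made-up rows in the same 21-field layout as the sample data.
my @lines = (
    join("\t", '2005-9-10', '0:0:16 GMT', ('-') x 5, 'someone@example.com',
         ('-') x 10, 'Subject A', 'EX:/O=org/CN=AUSER', '-'),
    join("\t", '2005-9-10', '0:0:17 GMT', ('-') x 5, 'other@example.com',
         ('-') x 10, 'Subject B', '-', '-'),
);

my @hits;
for my $line (@lines) {
    my @cols = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    my $out  = wanted_columns(\@cols, 'Auser');
    push @hits, $out if defined $out;
}
print "$_\n" for @hits;
```

      Only the first row is printed, because "Auser" matches its Sender-Address regardless of case. The \Q...\E around the search word keeps any regex metacharacters in it from being interpreted.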