Motomo94 has asked for the wisdom of the Perl Monks concerning the following question:

I have transaction logs that I am searching by subject. When I find a match, I want to return all the required data pertaining to that message from the log. Exchange will have 3 entries for one email (the message was received on the server, moved to its queue, and lastly delivered to the recipient). I ONLY need one record in my output per message. I figured I would use the Message ID, column #10 (this should be unique). The other thing I would like to do: I have, let's say, 30 log files to search. How can I search each file once without having to list each file (the file names do change)? Thank you for your time and help!

Here is the code
#!/usr/bin/perl -w
use strict;
use Text::CSV_XS;
use IO::File;

my $filename = "P:\\Program Files\\Exchsrvr\\DRACO.log\\20081116.log";
my $column_to_search = 18;
my $wanted_value = 'Fw: Hey Ugly line expansion and re-offer';

my $csv = Text::CSV_XS->new ({sep_char => "\t"});
my $fh = IO::File->new($filename) or die $!;
while (my $cols = $csv->getline($fh)) {
    last unless @$cols;
    next unless defined $cols->[$column_to_search]
        and $cols->[$column_to_search] eq $wanted_value;
    for (0,1,7) {
        $cols->[$_] = '' unless defined $cols->[$_];
    }
    print join(' ',$cols->[0],$cols->[1],$cols->[7],$cols->[9],$cols->[13],$cols->[18],$cols->[19]),"\n";
}
And here is a sample of the data
DATA
# Message Tracking Log File
# Exchange System Attendant Version 6.5.7226.0
# Date Time client-ip Client-hostname Partner-Name Server-hostname server-IP Recipient-Address Event-ID MSGID Priority Recipient-Report-Status total-bytes Number-Recipients Origination-Time Encryption service-Version Linked-MSGID Message-Subject Sender-Address
2005-9-10 0:0:16 GMT - - - storming - Someoneg@aol.com 1027 2433A69xxxxxxxxxxxxxxxx795006DB02DADF78@storming.Domain.name1 0 0 11927 1 2005-9-10 0:0:16 GMT 0 - c=US;a= ;p=AMSCAN;l=storming-050910000016Z-212788 Fw: Hey Ugly line expansion and re-offer EX:/O=org/OU=Site/CN=RECIPIENTS/CN=Auser -
2005-9-10 0:0:16 GMT - - - storming - c1r3ai4g@aol.com 1019 2433A690xxxxxxxxxxxxxxxx5006DB02DADF78@storming.Domain.name1 0 0 11927 1 2005-9-10 0:0:16 GMT 0 - - Fw: Hey Ugly line expansion and re-offer - -
2005-9-10 0:0:16 GMT - - - storming - c1r3ai4g@aol.com 1025 2433A6xxxxxxxxxxxxxxxx95006DB02DADF78@storming.Domain.name1 0 0 11927 1 2005-9-10 0:0:16 GMT 0 - - Fw: Hey Ugly line expansion and re-offer - -
2005-9-10 0:0:16 GMT - - - storming - c1r3ai4g@aol.com 1024 2433A690Fxxxxxxxxxxxxxxxx6795006DB02DADF78@storming.Domain.name1 0 0 11927 1 2005-9-10 0:0:16 GMT 0 - - Fw: Hey Ugly line expansion and re-offer - -
2005-9-10 0:0:17 GMT - - - storming - c1r3ai4g@aol.com 1033 2433Axxxxxxxxxxxxxxxx428E5EE4C6795006DB02DADF78@storming.Domain.name1 0 0 11927 1 2005-9-10 0:0:16 GMT 0 - - Fw: Hey Ugly line expansion and re-offer Auser@Domain.name -
2005-9-10 0:0:17 GMT - - - storming - c1r3ai4g@aol.com 1020 2433A69xxxxxxxxxxxxxxxx95006DB02DADF78@storming.Domain.name1 0 0 11927 1 2005-9-10 0:0:16 GMT 0 - - Fw: Hey Ugly line expansion and re-offer Auser@Domain.name -

Replies are listed 'Best First'.
Re: Search tab / delimited but do not display duplicate / triplicate entries
by kennethk (Abbot) on Dec 15, 2008 at 23:01 UTC

    If you provide a data sample (good), make sure your test code actually returns some values when run against it: you can't match 'RTVs' if all the subject headers are 'Fw: Hey Ugly line expansion and re-offer'.

    As was mentioned on CB earlier, try creating a hash to avoid duplicates. Specifically, with an array of arrays named @array,

    my %test_hash = ();
    for (@array) {
        # MSGID is the 10th column, which is index 9 when counting from 0
        if (exists ($test_hash{$_->[9]})) {next}
        print @{$_};
        $test_hash{$_->[9]} = 1;
    }

    Update: As per an OP request, here is the above code melded with his posted code.

    my %test_hash = ();
    while (my $cols = $csv->getline($fh)) {
        last unless @$cols;
        next unless defined $cols->[$column_to_search]
            and $cols->[$column_to_search] eq $wanted_value;
        # MSGID is the 10th column, which is index 9 when counting from 0
        if (exists ($test_hash{$cols->[9]})) {next}
        $test_hash{$cols->[9]} = 1;
        for (0,1,7) {
            $cols->[$_] = '' unless defined $cols->[$_];
        }
        print join(' ',$cols->[0],$cols->[1],$cols->[7],$cols->[9],$cols->[13],$cols->[18],$cols->[19]),"\n";
    }
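
On the second part of the question (searching 30 or so files whose names change), one option is to read the log directory at run time and keep a single hash of seen message IDs across all files. What follows is only a minimal sketch under stated assumptions: `search_logs` is a hypothetical helper name, the files are assumed to end in `.log`, and the fields are split on plain tabs with core Perl instead of Text::CSV_XS, which assumes no field contains an embedded tab or quoting. The `getline` loop from the posted code could be dropped into the same place.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan every *.log file in $dir and return one row
# (an array ref of columns) per unique MSGID whose subject equals $wanted.
sub search_logs {
    my ($dir, $wanted) = @_;
    my $subject_col = 18;   # Message-Subject, counting from 0
    my $msgid_col   = 9;    # MSGID, counting from 0 (the 10th column)
    my %seen;               # shared across files, so repeats are dropped globally
    my @rows;

    # Read the directory instead of hard-coding file names.
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @files = sort grep { /\.log\z/i } readdir $dh;
    closedir $dh;

    for my $name (@files) {
        my $path = "$dir/$name";
        open my $fh, '<', $path or die "Cannot open $path: $!";
        while (my $line = <$fh>) {
            $line =~ s/\r?\n\z//;               # strip LF or CRLF line endings
            next if $line =~ /^#/;              # skip the header comment lines
            my @cols = split /\t/, $line, -1;   # -1 keeps trailing empty fields
            next unless defined $cols[$subject_col]
                    and $cols[$subject_col] eq $wanted;
            next if $seen{ $cols[$msgid_col] }++;   # this message was already kept
            push @rows, \@cols;
        }
        close $fh;
    }
    return @rows;
}

# Example call, using the directory from the original post:
# my @hits = search_logs('P:/Program Files/Exchsrvr/DRACO.log',
#                        'Fw: Hey Ugly line expansion and re-offer');
# print join(' ', @{$_}[0, 1, 7, 9, 13, 18, 19]), "\n" for @hits;
```

Reading the directory with `opendir`/`readdir` sidesteps the fact that the built-in `glob` splits its pattern on whitespace, which matters for a path such as `Program Files`.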