davidinottawa has asked for the wisdom of the Perl Monks concerning the following question:

I can't see my error here Monks!!!

I just need to grep where data from the pattern file exists in the input file.

input.file looks like this :
David@domain.com|David@domain.com|J|ABBASS, DAVID JOHN|
Cory@domain.com|Cory@domain.com|E|ABBOTT, CORY J|
Tania@domain.com|Tania@domain.com|F|ABBOTT, TANIA LEE|
Geoffrey@domain.com|Geoffrey@domain.com|N|ABBOTT, GEOFFREY BRYAN|

pattern.file looks like this :
Randall@domain.com
David@domain.com
Rob@domain.com
Tania@domain.com

Script should tell me that Tania@ and David@ exist :

sub main {
 my $filename = "input.file";
 open (my $valuesFile, '<', 'pattern.file') or die "Failed: $!\n";
 while (<$valuesFile>) {
  push (@lines, $_);
 }
 open(INPUT, $filename) or die "Cannot open $filename";
 while (<INPUT>) {
  ($userName,$emailAddress,$division,$fullName) = split(/\|/, $_);
  while (@lines) {
   my $pattern = pop @lines;
   $pattern=~s/\n//g;
   #print "pattern: ".$pattern."address: ".$emailAddress."\n";
   if ($emailAddress =~ /$pattern/) {
    print $pattern . " exists in " . $emailAddress."\n";
   } else {
    print $pattern . " no match " . $emailAddress."\n";
    last;
   }
  }
 }
 close(INPUT);
 close $valuesFile;
}

Replies are listed 'Best First'.
Re: Look for pattern in file. ????
by Corion (Patriarch) on Jan 19, 2016 at 17:00 UTC

    On each trip around the while( <INPUT> ) loop, you do:

    my $pattern = pop @lines;

    Which leaves @lines empty. But you never refill it for the next trip around the while( <INPUT> ) loop.
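
A minimal sketch of the effect described above, using made-up in-memory lists instead of the poster's files. The names `@emails`, `checks_with_pop` and `checks_with_foreach` are hypothetical; the sketch only counts how many patterns are available to each input line.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the lines of input.file.
my @emails = ('David@domain.com', 'Cory@domain.com', 'Tania@domain.com');

# Count how many patterns each email gets compared against
# when the pattern list is consumed with pop.
sub checks_with_pop {
    my @patterns = ('Rob@domain.com', 'Tania@domain.com');
    my @counts;
    for my $email (@emails) {
        my $n = 0;
        while (@patterns) {
            my $pattern = pop @patterns;   # removes the element for good
            $n++;
        }
        push @counts, $n;
    }
    return @counts;
}

# Same count when the list is only read, never modified.
sub checks_with_foreach {
    my @patterns = ('Rob@domain.com', 'Tania@domain.com');
    my @counts;
    for my $email (@emails) {
        my $n = 0;
        for my $pattern (@patterns) {
            $n++;
        }
        push @counts, $n;
    }
    return @counts;
}

print join(',', checks_with_pop()),     "\n";   # prints 2,0,0
print join(',', checks_with_foreach()), "\n";   # prints 2,2,2
```

With pop, only the first input line ever sees any patterns; a foreach loop over the array leaves it intact for every later line.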

    Also see quotemeta if you want to do literal matches and match dot as dot instead of a wildcard.
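
To illustrate the quotemeta point, here is a small sketch comparing an unescaped match with one wrapped in `\Q...\E` (the in-regex form of quotemeta). The helper names and the look-alike address are hypothetical and not taken from the original files.

```perl
use strict;
use warnings;

# True if $text contains $literal as a literal substring;
# \Q...\E escapes regex metacharacters such as "." in $literal.
sub contains_literal {
    my ($text, $literal) = @_;
    return $text =~ /\Q$literal\E/ ? 1 : 0;
}

# Unescaped version for comparison: "." matches any character.
sub contains_as_regex {
    my ($text, $pattern) = @_;
    return $text =~ /$pattern/ ? 1 : 0;
}

my $pattern   = 'David@domain.com';
my $lookalike = 'David@domainXcom';   # hypothetical near-miss address

print contains_as_regex($lookalike, $pattern), "\n";          # prints 1: "." matched "X"
print contains_literal($lookalike, $pattern), "\n";           # prints 0
print contains_literal('David@domain.com', $pattern), "\n";   # prints 1
```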

Re: Look for pattern in file. ????
by ikegami (Patriarch) on Jan 19, 2016 at 17:07 UTC

    while (<INPUT>) {
     ($userName,$emailAddress,$division,$fullName) = split(/\|/, $_);
     while (@lines) {
      my $pattern = pop @lines;
      $pattern=~s/\n//g;
      if ($emailAddress =~ /$pattern/) {
       print $pattern . " exists in " . $emailAddress."\n";
      } else {
       print $pattern . " no match " . $emailAddress."\n";
       last;
      }
     }
    }

    should be

    while (<INPUT>) {
      ($userName,$emailAddress,$division,$fullName) = split(/\|/, $_);
      my $exists = 0;                  # reset the flag for every input line
      for my $line (@lines) {          # read the patterns without emptying @lines
        (my $pattern = $line) =~ s/\n//g;
        if ($emailAddress =~ /$pattern/) {
          $exists = 1;
          last;
        }
      }
      if ($exists) {
        print "$emailAddress exists\n";
      } else {
        print "$emailAddress doesn't exist\n";
      }
    }

    because you only want to print if it doesn't exist after you've checked all of the patterns, not just the first.

    A hash would make more sense, though.

    sub main {
      my $input_qfn = "input.file";
      my $pattern_qfn = "pattern.file";

      my %patterns;
      {
        open(my $pattern_fh, '<', $pattern_qfn)
          or die("Can't open pattern file \"$pattern_qfn\": $!\n");
        while (<$pattern_fh>) {
          chomp;
          ++$patterns{$_};
        }
      }

      open(my $input_fh, '<', $input_qfn)
        or die("Can't open input file \"$input_qfn\": $!\n");
      while (<$input_fh>) {
        chomp;
        my ($userName, $emailAddress, $division, $fullName) = split /\|/;
        if ($patterns{$emailAddress}) {
          print "$emailAddress exists in the pattern file\n";
        } else {
          print "$emailAddress doesn't exist in the pattern file\n";
        }
      }
    }

    Output:

    David@domain.com exists in the pattern file
    Cory@domain.com doesn't exist in the pattern file
    Tania@domain.com exists in the pattern file
    Geoffrey@domain.com doesn't exist in the pattern file

    (Most people use four space indentations. One space is just not enough. You couldn't even tell that the indentations didn't line up correctly! As such, I doubled your indentation in my suggested solution.)

      Or maybe you want

      sub main {
        my $input_qfn = "input.file";
        my $pattern_qfn = "pattern.file";

        my %emailAddresses;
        {
          open(my $input_fh, '<', $input_qfn)
            or die("Can't open input file \"$input_qfn\": $!\n");
          while (<$input_fh>) {
            chomp;
            my ($userName, $emailAddress, $division, $fullName) = split /\|/;
            ++$emailAddresses{$emailAddress};
          }
        }

        open(my $pattern_fh, '<', $pattern_qfn)
          or die("Can't open pattern file \"$pattern_qfn\": $!\n");
        while (<$pattern_fh>) {
          chomp;
          if ($emailAddresses{$_}) {
            print "$_ exists in the input file\n";
          } else {
            print "$_ doesn't exist in the input file\n";
          }
        }
      }

      Output:

      Randall@domain.com doesn't exist in the input file
      David@domain.com exists in the input file
      Rob@domain.com doesn't exist in the input file
      Tania@domain.com exists in the input file

Re: Look for pattern in file. ????
by stevieb (Canon) on Jan 19, 2016 at 16:59 UTC

    I'd go a bit of a different way, and utilize grep:

    use warnings;
    use strict;

    my $pat_file = 'pat.txt';
    my $file = 'in.txt';

    open my $pat_fh, '<', $pat_file or die $!;
    my @patterns = <$pat_fh>;
    chomp @patterns;
    close $pat_fh;

    open my $fh, '<', $file or die $!;

    my @exists;

    while (my $line = <$fh>){
        my ($ret) = grep { $line =~ /$_/ } @patterns;
        push @exists, $ret if $ret;
    }

    print "$_\n" for @exists;

Re: Look for pattern in file. ????
by kcott (Archbishop) on Jan 19, 2016 at 21:40 UTC

    G'day davidinottawa,

    I'd approach this in a number of different ways to what you've posted:

    • Your input file appears to be pipe-separated CSV. Use Text::CSV: that's already done most of the work for you.
    • Instead of attempting to hand-craft I/O error messages, use the autodie pragma. Hand-crafting messages is error-prone: your first only says why, not what; your second only says what, not why.
    • Use the 3-argument form of open with lexical filehandles. You did this for your first open; however, the second uses a very generic package variable (i.e. INPUT) which could well be used elsewhere in your code (textually distant from the code you're looking at) and is therefore error-prone.
    • I'd use a hash for storing your match patterns.
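
As a rough illustration of the autodie point in the list above: with the pragma in effect, a failed open throws an exception whose text names both the file and the system error, so no hand-written message is needed. This is only a sketch; the helper `open_error` and the path are hypothetical.

```perl
use strict;
use warnings;

# Try to open a file with autodie in effect and return the error text
# (empty string on success). The path used below is hypothetical.
sub open_error {
    my ($path) = @_;
    use autodie;                      # lexically scoped to this sub
    my $ok = eval {
        open my $fh, '<', $path;      # dies automatically on failure
        close $fh;
        1;
    };
    return $ok ? '' : "$@";           # stringify the autodie exception
}

print open_error('/no/such/dir/pattern.file'), "\n";
```

The printed message includes the path that could not be opened along with the reason reported by the operating system.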

    Putting all that together (pm_1153098_match_csv_lines.pl):

    #!/usr/bin/env perl -l

    use strict;
    use warnings;
    use autodie;

    use Text::CSV;

    my $input_file = 'pm_1153098_match_csv_lines_input.csv';
    my $pattern_file = 'pm_1153098_match_csv_lines_pattern.txt';

    my %match_pattern;

    open my $pat_fh, '<', $pattern_file;
    while (<$pat_fh>) {
        chomp;
        ++$match_pattern{$_};
    }
    close $pat_fh;

    my $csv = Text::CSV::->new({sep_char => '|'});

    open my $in_fh, '<', $input_file;
    while (my $row = $csv->getline($in_fh)) {
        print $row->[1] if $match_pattern{$row->[1]};
    }
    close $in_fh;

    With these files:

    $ cat pm_1153098_match_csv_lines_input.csv
    David@domain.com|David@domain.com|J|ABBASS, DAVID JOHN|
    Cory@domain.com|Cory@domain.com|E|ABBOTT, CORY J|
    Tania@domain.com|Tania@domain.com|F|ABBOTT, TANIA LEE|
    Geoffrey@domain.com|Geoffrey@domain.com|N|ABBOTT, GEOFFREY BRYAN|

    $ cat pm_1153098_match_csv_lines_pattern.txt
    Randall@domain.com
    David@domain.com
    Rob@domain.com
    Tania@domain.com

    I get this output:

    $ pm_1153098_match_csv_lines.pl
    David@domain.com
    Tania@domain.com

    — Ken

Re: Look for pattern in file. ????
by davidinottawa (Initiate) on Jan 19, 2016 at 17:59 UTC
    Hi guys - thanks so much for the input, and for such a quick turnaround! It's greatly appreciated. PerlMonks rocks!