Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

hi, i'm new to perl....i have two files: file1=>
andrew 15 1.3
jason 18 1.2
john 23 2.1
patrick 21 2.2
and file2 =>
jeane 15 3.2
andrew 14 1.1
andrew 15 1.3
and I want to compare each name in the first column of file1 against all the names in file2. If I find a match, I then want to compare the second columns of the two lines with the same name, and if they are the same, print the line; otherwise go on to the next name. I'm new to Perl, so any tips on how to go about this would help a lot. Thanks.

Re: help with search and match
by John M. Dlugosz (Monsignor) on Aug 26, 2002 at 16:04 UTC
    Use split on each line of input. So, it will look something like this:
    while (my($name,$n1,$n2)=split(' ',<INFILE>)) {
        # do something...
    }
    Note that split with a pattern of a single space, as shown above, has a special meaning. Read about split in perlfunc.
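To make that special meaning concrete, here is a small sketch on a made-up input line (the line and the variable names are only for illustration). With ' ' as the pattern, split skips leading whitespace and splits on runs of whitespace, while the regex / / splits at every single space and so produces empty fields.

```perl
use strict;
use warnings;

# Made-up input line with leading spaces and a doubled space.
my $line = "  andrew  15 1.3\n";

# Special case: ' ' skips leading whitespace and splits on whitespace runs.
my @awk_style = split ' ', $line;

# A real single-space regex splits at every space, so empty fields appear
# and the trailing newline stays attached to the last field.
my @literal = split / /, $line;

print scalar(@awk_style), " fields with ' ': @awk_style\n";
print scalar(@literal), " fields with / /\n";
```

For whitespace-separated columns like the ones in the question, the ' ' form is usually what you want.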

    Prep your file2 by reading the whole thing into an array.

    open FILE, "< file2.txt" or die "can't open file2.txt: $!";
    my @file2= <FILE>;  # read the whole thing
    close (FILE);
    Now, use the grep function to check for matches against the array. Inside the "do something..." body of the loop above,
    foreach my $hit (grep /^\Q$name\E\s/, @file2) {
        my($name_2,$n1_2,$n2_2)=split(' ',$hit);  # same as you did before.
        print "Found $hit" if $n1 == $n1_2;       # $hit still ends with its newline
    }
    That's untested, but should give you some ideas. This introduces some "Perl ways" without being too thick to understand for a novice.

    Good luck.

    —John

Re: help with search and match
by fruiture (Curate) on Aug 26, 2002 at 16:03 UTC

    One of the main reasons for Perl's power is its hashes.

    #!/usr/bin/perl
    use strict;
    use warnings;

    @ARGV >= 2 or die "2 args mandatory";
    my ($base,$compare) = @ARGV;

    # always use smaller file as 'base':
    #UPDATE: this makes only sense if
    # the name field is unique in both files!
    #- thanks to [John M. Dlugosz]
    ## ($base,$compare) = ($compare,$base)
    ##     if -s $base > -s $compare;

    open my $basefh,'<',$base or die "opening '$base' failed: $!";
    my %base = ();
    while(<$basefh>){
        chomp;
        my ($name,@columns) = split;
        $base{$name} = \@columns;
    }
    close $basefh;

    open my $cmpfh,'<',$compare or die "opening '$compare' failed: $!";
    while(<$cmpfh>){
        chomp;
        my ($name,@columns) = split;
        next unless exists $base{$name};
        print "$_ [@{$base{$name}}]\n"
            if $base{$name}[0] == $columns[0];
    }
    close $cmpfh;
    --
    http://fruiture.de
      You're assuming the first field is unique. In his example for file2, they were not. That was also the "shorter" one.

        true. So if order matters, $base and $compare should not be switched. Anyway the first field must be unique in the $base file then, otherwise it wouldn't make much sense, so the code is still not that wrong. Thanks.
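If the name field is not unique in the base file either, one possible way around it is to keep every second-column value per name instead of only the last one. The following is just a sketch under that assumption: the helper name matching_rows is hypothetical, and the rows are the sample data from the question held in arrays rather than read from files.

```perl
use strict;
use warnings;

# Sample rows taken from the question; a real script would read them from files.
my @file1 = ("andrew 15 1.3", "jason 18 1.2", "john 23 2.1", "patrick 21 2.2");
my @file2 = ("jeane 15 3.2", "andrew 14 1.1", "andrew 15 1.3");

# matching_rows is a hypothetical helper for illustration only.
# It returns the rows of $compare whose name and second column
# appear together in at least one row of $base.
sub matching_rows {
    my ($base, $compare) = @_;

    # name => list of second-column values, so duplicate names are all kept
    my %seen;
    for my $row (@$base) {
        my ($name, $n1) = split ' ', $row;
        push @{ $seen{$name} }, $n1;
    }

    my @hits;
    for my $row (@$compare) {
        my ($name, $n1) = split ' ', $row;
        next unless exists $seen{$name};
        push @hits, $row if grep { $_ == $n1 } @{ $seen{$name} };
    }
    return @hits;
}

print "$_\n" for matching_rows(\@file1, \@file2);
```

Because each name maps to a list, it no longer matters which of the two files is used as the base.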

        --
        http://fruiture.de
Re: help with search and match
by Thelonius (Priest) on Aug 26, 2002 at 16:20 UTC
    My first choice would be to sort the two files and use the 'join' command instead of perl, e.g.:
    sort -o file1 file1
    sort -o file2 file2
    join file1 file2 | awk '$2 == $4'
Re: help with search and match (Homework?)
by talexb (Chancellor) on Aug 26, 2002 at 16:05 UTC
    This smells very strongly of homework.

    What have you tried? Show us your code.

    (Hint: Use a hash. Look it up in your Perl text, it should describe what it is and how it works.)

    --t. alex
    but my friends call me T.

      this is actually not school related....anyways, i tried this:
      open(FILE1, "$file1") || die "can't open file";
      open(FILE2, "$file2") || die "can't open file";
      while (<FILE2>) {
          chomp;
          @array2= split;
          while(<FILE1>) {
              chomp;
              @array1 = split;
          }
          if ($array1[0] eq $array2[0]) print "match\n";
          if ($array1[1] == $array[2]) {
              print $array[1];
          } else {
              print " no match";
          }
      }
        I can see a number of problems with your code ..
        • The first time through the FILE2 loop, you'll read all of FILE1 and it will come up empty after that.
        • You are reading all of FILE1 for each line of FILE2 -- that's not efficient.
        • I would use the array names array1 and array2 consistently (corresponding, of course, to FILE1 and FILE2); your code also refers to $array[2] and $array[1].
        Second to last, I would reformat the code as:
        open(FILE1, "$file1") || die "can't open file";
        open(FILE2, "$file2") || die "can't open file";
        while (<FILE2>) {
            chomp;
            @array2= split;
            while(<FILE1>) {    # still reads all of FILE1 on the first pass
                chomp;
                @array1 = split;
            }
            if ($array1[0] eq $array2[0]) {
                print "match\n";
                if ($array1[1] == $array2[1]) {    # compare the second columns
                    print $array1[1];
                } else {
                    print " no match";
                }
            }
        }
        Finally, check out the other good responses to the original question .. your approach should be abandoned.
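The first bullet above can be reproduced with a small sketch. It assumes in-memory filehandles (opened on scalar references) as stand-ins for the two files, and it only counts how many lines the inner loop gets to read on each pass of the outer loop; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# In-memory stand-ins for the two files (sample data from the question).
my $data1 = "andrew 15 1.3\njason 18 1.2\n";
my $data2 = "jeane 15 3.2\nandrew 14 1.1\nandrew 15 1.3\n";

open my $fh1, '<', \$data1 or die "can't open in-memory file1: $!";
open my $fh2, '<', \$data2 or die "can't open in-memory file2: $!";

my @inner_counts;    # lines the inner loop read on each outer pass
while (my $outer = <$fh2>) {
    my $count = 0;
    $count++ while <$fh1>;    # drains $fh1 to end-of-file
    push @inner_counts, $count;
}
close $fh1;
close $fh2;

print "inner loop read: @inner_counts\n";
```

Only the first pass sees any lines from the first file; every later pass finds the handle already at end-of-file. That is why reading one file into an array or a hash first is the better route.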

        --t. alex
        but my friends call me T.

        Also, use strict; use warnings; to catch some of your typos (and declare your variables).
        This looks a lot like a question from a few days ago. See my response in that thread.
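As a rough illustration of how use strict catches a typo like the $array[2] in the posted code, the sketch below compiles a tiny snippet at run time with a string eval, so the compile error can be printed instead of stopping the script. The snippet and its variable names are made up for the demonstration.

```perl
use strict;
use warnings;

# The snippet is compiled at run time so that the compile error can be
# caught and shown instead of aborting this script.
my $snippet = <<'END_SNIPPET';
use strict;
my @array1 = (1, 2);
my @array2 = (1, 2);
print "same\n" if $array1[1] == $array[2];   # typo: @array was never declared
END_SNIPPET

my $ok    = eval "$snippet; 1";
my $error = $ok ? '' : $@;
print "strict says: $error";
```

Without use strict, the same snippet compiles and quietly compares against an empty, undeclared array.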

        --

        There are 10 kinds of people -- those that understand binary, and those that don't.