Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I am having a problem matching the contents of one file against another.
The first file is a one-column text file containing about 500 names. The second file contains a name in the first column and its synonyms in the second column (there may be one, two, three or four of them).

What I did was build a hash from the first file, with the name as the key and the name itself as the value. Then I read the second file, tried to match its name column against the hash, and wrote two files: one for the names that matched and another for the names that did not.

The problem is that some of the names in the first file match a synonym rather than the name column. My script does not catch those, so they end up in the no-match list. Is there a way, when matching against the hash, to have the program check the synonyms as well and, when there is a match, return the name together with its synonyms? Here is my code:

#!/usr/bin/perl
use strict;
use warnings;

my %hash;
my($name,$val,@nam);
my ($cnt1,$cnt2) = 0;
open(WRITE1,">match_name");
open(WRITE2,">nomatch_name");
open(DATA1,"<name500") or die "Could not open the relevant file";
while(<DATA1>) {
    chomp;
    ($name,$val) = split(/\t/,$_);
    $hash{$name}=$name;
}
close(DATA1);
open(DATA2,"name_syn") or die "Check file";
while(<DATA2>) {
    chomp $_;
    if($_=~/^#/) {
        next;
    }
    @nam = split(/\t/,$_);
    if (exists $hash{$nam[0]}) {
        print WRITE1 "$hash{$nam[0]}\t$nam[1]\n";
        $cnt1 +=1;
    }
    else {
        print WRITE2 "$hash{$name}\n";
        $cnt2 +=1;
    }
}
print "$cnt1\t$cnt2\n";
close(WRITE1);
close(WRITE2);

When I ran this program I got 422 matches and 78 mismatches, but when I looked through the second file I found that some names matched a synonym and so had been written to the 'nomatch_name' file. How can I modify the code so that it matches either the name or the synonyms?

thanks...

Re: hash and matching synonyms
by Util (Priest) on Jun 03, 2007 at 03:18 UTC

    If I understand you correctly, you just need to change this line:

    if (exists $hash{$nam[0]})
    to this:
    if ( grep { exists $hash{$_} } @nam )

    If not, then trim both your input files to the bare minimum of data needed to demonstrate the failing behavior, and post them in this thread.
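
The suggested change can be tried in isolation with the minimal sketch below. It assumes that the name and every synonym are separate tab-delimited fields on each line of the second file, which the thread does not confirm. The sample names, the sample rows, and the helper row_matches are all hypothetical stand-ins for the real name500 and name_syn data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the name500 and name_syn files.
my @names    = qw(alpha beta gamma);
my @syn_rows = (
    "alpha\ta1\ta2",      # the name column matches directly
    "delta\tbeta\td2",    # only a synonym ("beta") matches
    "omega\to1",          # neither the name nor a synonym matches
);

# Lookup table of the wanted names.
my %wanted = map { $_ => 1 } @names;

# Hypothetical helper: true if the name or any synonym in a
# tab-separated row is a key of the lookup table.
sub row_matches {
    my ($row, $wanted) = @_;
    my @fields = split /\t/, $row;
    return scalar( grep { exists $wanted->{$_} } @fields );
}

my (@match, @nomatch);
for my $row (@syn_rows) {
    if ( row_matches($row, \%wanted) ) {
        push @match, $row;      # keeps the name together with its synonyms
    }
    else {
        push @nomatch, $row;
    }
}

print "match:   $_\n" for @match;
print "nomatch: $_\n" for @nomatch;
printf "%d\t%d\n", scalar @match, scalar @nomatch;
```

With this sample data the first two rows land in the match list and the third in the nomatch list. If the synonyms in the real file are packed into one column with some other separator, that column would need a further split before the grep.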

      The same problem is occurring again. I also trimmed the input files to a minimum.
        Try posting your revised code along with sample name500 and name_syn files, with just a few lines in each, and your expected and actual match and nomatch files.