Dear toolic,
This script works beautifully on my dummy data.
When I run it using my real files, I get a warning that repeats itself line after line until I stop the script:
Use of uninitialized value in string eq at HUGOID_extract.pl line 50, <$GENEFILE> line 1.
Line 50 is:
if ($genes[2] eq $hugo[$i])
I am confused as to why it would work on dummy data but not real.
My only thought is that in the IDs file (DUMMYHUGO) there are up to 28 columns with aliases for gene names, for which I want to return the HUGO ID (hopefully this is clear from my initial post). However, not all gene names have 28 aliases.
With the dummy data, there was always a match ('eq') for the gene name. In the real files there may not be. Is it possible that, if there is no match in the DUMMYHUGO file, the script doesn't know how to move on?
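To make that suspicion concrete, here is a minimal sketch, not the actual script. It assumes each DUMMYHUGO line is split on tabs into @hugo and that column 1 holds the HUGO ID, as in the print statement; the helper name find_hugo_id and the sample rows are made up for illustration. Since split drops trailing empty fields, a row with fewer aliases gives a shorter array, so $hugo[$i] is undef for any $i past the end of that row, and comparing undef with eq produces exactly the "uninitialized value in string eq" warning.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): return the HUGO ID
# if $name equals any column of the row, otherwise return nothing.
# Assumption: column 1 of the row holds the HUGO ID.
sub find_hugo_id {
    my ($name, @hugo) = @_;
    # Only visit columns that exist in this row ($#hugo is the last index),
    # instead of always counting up to a fixed 28.
    for my $i (0 .. $#hugo) {
        # Skip empty or missing alias cells so 'eq' never sees undef.
        next unless defined $hugo[$i] && length $hugo[$i];
        return $hugo[1] if $name eq $hugo[$i];
    }
    return;    # no match in this row
}

# Made-up rows: one with two aliases, one with none.
my @long_row  = split /\t/, "x\tID_A\tGENE_A\tALIAS_A1\tALIAS_A2";
my @short_row = split /\t/, "y\tID_B\tGENE_B";

for my $name ('ALIAS_A2', 'GENE_B', 'GENE_C') {
    my $id;
    for my $row (\@long_row, \@short_row) {
        $id = find_hugo_id($name, @$row);
        last if defined $id;    # stop at the first row that matches
    }
    print "$name\t", (defined $id ? $id : 'HUGO_notfound'), "\n";
}
```

The "not found" line is printed once per gene, after every row has been tried, rather than inside the comparison loop itself.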
I have added some code after your if block (below):
if ($genes[2] eq $hugo[$i]) {
    print $OUT "$genes[0]\t$genes[1]\t$genes[2]\t$genes[3]\t$hugo[1]\n";
}
My code is as follows (looks a little squiffy here but still):
# added by me
else {
    print $OUT "$genes[0]\t$genes[1]\t$genes[2]\t$genes[3]\tHUGO_notfound\n";
    $i++;
}
I still get the endlessly repeating warning, though.
Any further ideas that can help?
Thanks again.