
Dear Perl Monks,

First, let me say that I am new to Perl. I am using it for bioinformatics/genetic analysis.

What I want my script to do:
I have a tab-delimited IDs file. It looks like this (the text between slashes represents columns):

/Unwanted/ ID required/ Unwanted1/ Unwanted2/...Unwanted6/ ID_alias/ ID_alias1/ ... ID_alias36/

I also have a gene names file. For each gene name in it, I want to return the official ID. The gene name may be in any one of the alias columns.

The gene names file looks like this:

/Info/ Gene name/ Info1/ Info2/ ...

Should be simple, right?

What my script does:
It returns an empty output file.

I will post it here in the hope one of you learned people will be able to spot the mistakes. I am using dummy files to get it working (see below for examples).

Please excuse the extensive comments in my script; as I said, I am new to this.

Thank you in advance,
-Walking before I can run...

Script

################################################
# Declare an outfile to print to
my $outfile = "HUGO_dummyResults.txt";

# Open the outfile using a file handle
open( OUT, "> $outfile" ) or die "cannot create the output file";

#################################################

# Open file of list of neurotransmission genes where ENST has not been found

# FILENOTES::: File created in access using a query against approved HUGO name and gene name.
#Column 3 of file [2] is gene name col 4 [3] is the pathway gene is associated with

open (DUMMY_GENEFILE, 'DummyGenes.txt') or die "cannot open file containing genes";

#################################################

# 2- Open HUGO tabbed file
# FILENOTES::::    Approved gene name is in col 2 [1]

open (DUMMYHUGO,'DummyHugo.txt') or die "cannot open file containing HUGO IDs";

#################################################
#Operations
#################################################

#make array genes
#@genes = DUMMY_GENEFILE;        #No longer done here, see below

#make array HUGO
@hugo = DUMMYHUGO;

#for each line in genefile, try to match gene name [2] to one column of the columns [5]-[8] in the HUGO ID file. 

#check col 6, if found print, if not found, check next column. If never found, print "not found".

foreach (<DUMMY_GENEFILE>) #Changed from (<DUMMY_GENEFILE>)

{
    #make array genes
    @genes = DUMMY_GENEFILE;
    
    for ($i = 4; $i < @hugo; $i++)
    
    {
        if ($genes[2] eq $hugo[$i])

            #If found first print result
                
                {
                        
                    print OUT "$genes[0]\t$genes[1]\t$genes[2]\tgenes[3]\t$hugo[1]\n";
                    
                                
                }
            
            # HUGO ID not found, print
            print OUT "$genes[0]\t$genes[1]\t$genes[2]\tgenes[3]\tNo HUGO ID\n";
            
            
    }    
    
    
}
close (DUMMYHUGO);
close (DUMMY_GENEFILE);
close (OUT);
exit;

________________
Here are some dummy files to help demonstrate.

DummyHugo.txt

HGCNID:1 SKJ Info1 Info2 Info3 Sandra San Katey Jones
HGCNID:2 DJL Info1 Info2 Info3 Dave David James London
HGCNID:3 PKKJ INfo1 INfo2 INfo3 Paul Kevin Kean June
HGCNID:4 KJRJ INfo1 Info2 INfo3 Katie Joanna Rachel Jolie

DummyGenes.txt

ID1 Id2 Katie Path
ID1a Id2a Dave Path
ID1b Id2b Kean Path
ID1c Id2c Paul Path
ID1d Id2d Sandra Path
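
For reference, here is a minimal sketch of the lookup described above, not a corrected version of the posted script. It swaps the nested loops for a hash keyed on alias, which is a different technique from the one the script attempts. Several things are assumptions: the columns are tab-separated, the official ID is column 2 (index 1), the aliases start at index 5 as in the dummy file (the real file would need a different offset), and the sub names `build_alias_map` and `annotate_genes` are hypothetical. The data is held in arrays rather than read from files so the example runs on its own.

```perl
use strict;
use warnings;

# Build a lookup table mapping each alias to its official ID.
# Assumption: official ID is in column index 1; aliases run from
# $first_alias to the end of the line. Adjust for the real file.
sub build_alias_map {
    my ($hugo_lines, $first_alias) = @_;
    my %id_for;
    for my $line (@$hugo_lines) {
        chomp(my $copy = $line);
        my @cols = split /\t/, $copy;
        next unless @cols > $first_alias;       # skip lines with no aliases
        for my $alias (@cols[$first_alias .. $#cols]) {
            next if $alias eq '';
            $id_for{$alias} //= $cols[1];       # first ID seen for an alias wins
        }
    }
    return \%id_for;
}

# Return each gene line with the official ID (or "No HUGO ID") appended.
# Assumption: the gene name is in column index 2.
sub annotate_genes {
    my ($gene_lines, $id_for) = @_;
    my @out;
    for my $line (@$gene_lines) {
        chomp(my $copy = $line);
        my @cols = split /\t/, $copy;
        my $name = $cols[2] // '';
        my $id   = exists $id_for->{$name} ? $id_for->{$name} : 'No HUGO ID';
        push @out, join("\t", @cols, $id);
    }
    return @out;
}

# Dummy data mirroring the files above, written with explicit tabs.
my @hugo = (
    "HGCNID:1\tSKJ\tInfo1\tInfo2\tInfo3\tSandra\tSan\tKatey\tJones\n",
    "HGCNID:2\tDJL\tInfo1\tInfo2\tInfo3\tDave\tDavid\tJames\tLondon\n",
    "HGCNID:3\tPKKJ\tINfo1\tINfo2\tINfo3\tPaul\tKevin\tKean\tJune\n",
    "HGCNID:4\tKJRJ\tINfo1\tInfo2\tINfo3\tKatie\tJoanna\tRachel\tJolie\n",
);
my @genes = (
    "ID1\tId2\tKatie\tPath\n",
    "ID1a\tId2a\tDave\tPath\n",
    "ID1b\tId2b\tKean\tPath\n",
    "ID1c\tId2c\tPaul\tPath\n",
    "ID1d\tId2d\tSandra\tPath\n",
);

my $id_for = build_alias_map(\@hugo, 5);
print "$_\n" for annotate_genes(\@genes, $id_for);
```

With real files, the two arrays would be filled by reading each filehandle in list context, for example `my @hugo = <$hugo_fh>;` after a successful `open`.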
____________________________

In reply to (Failing) script to return an official ID by Anonymous Monk
