AWallBuilder has asked for the wisdom of the Perl Monks concerning the following question:

Dear all, I wrote a script that had many repeated lines of code, so I decided to move them into a subroutine (getDetails). However, the script now takes MUCH longer to run, and I am getting a warning when checking whether a hash key exists (the line producing the warning is marked by ***). At the moment I pass the hash to the subroutine as a reference, and maybe this is where my errors arise. Any help would be appreciated.

*** use of uninitialized value in exists

######## now read in taxid2locus file and print out taxids for host
## easier for trees and solves problem for wgs genomes
my $tax2locus_file="/g/bork6/waller/Viruses/prophage_data/tax2locus.dat";
my %tax2loc;
open (IN,$tax2locus_file) or die "cannot open $tax2locus_file\n";
while(<IN>){
    chomp;
    my($taxid,$locus)=split(/\t/,$_);
    $tax2loc{$locus}=$taxid;
}
close(IN);
print "there are\t".scalar(keys %tax2loc)."\tlocus_ids as key in hash\n";

########### subroutine get host and taxid ###
sub getDetails() {
    my ($proph,$lu)=@_;
    my $host;my $taxid;my $genNum_proph;
    # print "\t$proph\tproph inside sub";
    my @proph_columns=split(/\_/,$proph);
    # get host info
    if ($proph_columns[0] =~ /^FP92/){
        $host=$proph_columns[0];
    }
    #elsif ($proph_columns[0] =~ /^NZ/){
    #    $host=join("_",$proph_columns[0],$proph_columns[1]);
    #    $host=substr $host, 0, 7;
    #}
    else{
        $host=join("_",$proph_columns[0],$proph_columns[1]);
    }
    # print "\t$host\thost from inside sub";
    #get taxid
    my @matching_keys= grep { $_=~ /^$host/ } keys %$lu;
    my $matching_key=$matching_keys[0];
    if (exists $lu->{$matching_key}){ ***
        $taxid=$lu->{$matching_key};
        # print "\t$taxid\ttaxid from inside sub\t";
    }
    elsif (!exists $lu->{$matching_key}){
        $taxid="undef";
    }
    ## get number of proteins in the prophage
    if (exists $genesPerProphGen{$proph}){
        $genNum_proph=$genesPerProphGen{$proph};
        # print "\t$genNum_proph\tfrom inside sub\n"
    }
    else{
        print "$proph doesn't have number of genes\n"
    }
    return ($host,$taxid,$genNum_proph);
}

##### Now make table with numbers #####
my $shared_table="$mci_file.JaccHost.tabv2";
my $genNum_prophA;my $genNum_prophB;my $host_prophA;my $host_prophB;
open (OUTs,">$shared_table");
print OUTs "# This table was generated using $0 on\t".scalar(localtime(time))."\n";
print OUTs "#".join("\t",qw(Proph_genomeA Proph_genomeB hostA taxidA hostB taxidB Jacc Prots_Shared Protsin_prophA Protsin_prophB))."\n";
foreach my $prophA (keys %$overlap){
    print "$prophA\tfrom overlap\t";
    my($hostA,$taxidA,$genNum_prophA)=&getDetails($prophA,\%tax2loc);
    # print "from sub\t $hostA,$taxidA,$genNum_prophA\n";
    foreach my $prophB (keys %{$overlap->{$prophA}}){
        my($hostB,$taxidB,$genNum_prophB)=&getDetails($prophB,\%tax2loc);
        # print "from sub\t $hostB,$taxidB,$genNum_prophB\n";
        my $A_notShared=$genNum_prophA-$overlap->{$prophA}{$prophB};
        my $B_notShared=$genNum_prophB-$overlap->{$prophA}{$prophB};
        my $Jac_denom=$overlap->{$prophA}{$prophB}+$A_notShared+$B_notShared;
        my $Jac=$overlap->{$prophA}{$prophB}/$Jac_denom;
        print OUTs join("\t",$prophA,$prophB,$hostA,$taxidA,$hostB,$taxidB,$Jac,$overlap->{$prophA}{$prophB},$genNum_prophA,$genNum_prophB)."\n";
    }
}
close(OUTs);

Replies are listed 'Best First'.
Re: problem with subroutine and passing hash
by moritz (Cardinal) on Jul 05, 2012 at 08:22 UTC

    You have asked 15 questions here, and either don't know how to debug and fix your programs, or you're too lazy to do it. That needs to change. Learn it and practise it, it's an essential skill in programming.

    As a hint, the warning means that $matching_key is undef. So start to print interesting variables in the code, and compare them to what you would expect them to contain, until you find the root cause of your error.
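To make the hint concrete, here is a minimal sketch of how $matching_key can end up undef. The sample hash contents and the helper name first_matching_key are made up for illustration and are not part of the original script. When grep finds no key, the list is empty and its first element is undef, so checking defined before the lookup is one way to avoid the warning.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for %tax2loc (locus => taxid).
my %tax2loc = (
    'NC_000913_1' => 1001,
    'NC_002695_7' => 1002,
);

# Hypothetical helper: return the first key that starts with $host,
# or undef if no key matches.
sub first_matching_key {
    my ($host, $lu) = @_;
    my @matching_keys = grep { /^\Q$host\E/ } sort keys %$lu;
    return $matching_keys[0];    # undef when @matching_keys is empty
}

for my $host ('NC_000913', 'NC_999999') {
    my $key = first_matching_key($host, \%tax2loc);

    # Test defined-ness first: using an undef key in exists is what
    # produces the "uninitialized value" warning.
    my $taxid = defined $key ? $tax2loc{$key} : 'undef';
    print "$host\t$taxid\n";
}
```

The second host has no matching key, so the sketch falls back to the string 'undef' without ever touching the hash with an undefined key.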

    Data::Dumper can help you with printing your data structures.
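As a rough illustration of that suggestion, the sketch below dumps the kind of arguments getDetails receives. The values of $proph and %tax2loc here are invented placeholders; only Dumper and the $Data::Dumper::Sortkeys setting come from the Data::Dumper module itself.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so the dump looks the same on every run.
$Data::Dumper::Sortkeys = 1;

# Hypothetical stand-ins for the two arguments passed to getDetails.
my $proph   = 'NC_000913_prophage3';
my %tax2loc = ( 'NC_000913_1' => 1001 );

# Dump both arguments in one structure and send it to STDERR,
# so it does not get mixed into the regular output.
my $dump = Dumper( [ $proph, \%tax2loc ] );
print STDERR $dump;
```

Placing a similar print at the top of the subroutine shows exactly what it was called with, which can then be compared against what was expected.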

Re: problem with subroutine and passing hash
by Anonymous Monk on Jul 05, 2012 at 08:48 UTC

    You need more whitespace; use perltidy.

    Don't use prototypes; you don't need 'em (see Modern Perl about that). Instead of  sub getDetails() { ...  write  sub getDetails { ...
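A small sketch of what the empty parentheses actually do, using two throwaway subs (with_proto and without_proto are hypothetical names, not from the original script). The built-in prototype function reports the difference, and a call with a leading & skips prototype checking, which is presumably why the &getDetails(...) calls in the question were not rejected for passing arguments.

```perl
use strict;
use warnings;

sub with_proto()  { return scalar @_ }   # empty prototype: declared to take no arguments
sub without_proto { return scalar @_ }   # no prototype: accepts any argument list

# prototype() returns the prototype string, or undef if the sub has none.
my $p1 = prototype(\&with_proto);        # empty string
my $p2 = prototype(\&without_proto);     # undef

# A leading & bypasses prototype checking, so the arguments get through.
my $n_amp   = &with_proto('a', 'b');
my $n_plain = without_proto('a', 'b');

# Without the &, this call would be rejected at compile time
# ("Too many arguments for main::with_proto"):
# with_proto('a', 'b');

print "with_proto:    ", ( defined $p1 ? "prototype '$p1'" : 'no prototype' ), ", got $n_amp args\n";
print "without_proto: ", ( defined $p2 ? "prototype '$p2'" : 'no prototype' ), ", got $n_plain args\n";
```

Dropping the parentheses from the declaration removes the mismatch between what the sub claims to accept and how it is called.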

    You need more subroutines

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my( $file ) = @_;
        my $locomotion = SlurpLocomotion( $file );
        TableExpel( $outfile, $locomotion );
    }

    sub TableExpel {
        ...
        my ( $do, $re, $mi ) = getDetails( $locomotion );
        ...
    }

    You're using $host as a regular expression. It might not matter with your current data, but don't treat $var as a regular expression unless it is one. Escape it with quotemeta ( /^\Q$var\E/ ).
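To show why the escaping matters, here is a minimal sketch with an invented host prefix that contains a dot. The key names are made up; the point is only the difference between interpolating the variable raw and wrapping it in \Q...\E.

```perl
use strict;
use warnings;

# Hypothetical host prefix containing a regex metacharacter (the dot).
my $host = 'NZ_AB12.1';
my @keys = ( 'NZ_AB12.1_0001', 'NZ_AB12X1_0001' );

# Unescaped: "." matches any character, so both keys match.
my @loose = grep { /^$host/ } @keys;

# Escaped with \Q...\E: "." is taken literally, so only the first key matches.
my @exact = grep { /^\Q$host\E/ } @keys;

print scalar(@loose), " loose match(es), ", scalar(@exact), " exact match(es)\n";
```

With identifiers that contain only letters, digits and underscores the two forms behave the same, which is why the problem can stay hidden until the data changes.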

    Your program doesn't compile, it's incomplete, and you're missing sample input. In short, it's impossible to debug.

    If all you want to debug is getDetails, then, as Basic debugging checklist teaches, dump \@_ (some sample data) from inside getDetails and post a short representative portion (see How do I post a question effectively?).