Victory! I got the code working properly. However, I probably still have efficiency problems. I would appreciate your comments on where I can improve the code, in terms of both performance and style. Thanks guys, I could not have done it without your help!

Here is the final ugly code.

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

my %info = ();
my ($gi, $humangi, $accession);

my $data = '/DATA/proteinfile.txt';
open INFILE, '<', $data or die "Failed at opening $data!\n";

# Construct the hash with GIs as keys and sequences as values
while ( <INFILE> ) {
    my $line = $_;
    chomp($line);
    last if m!END!;
    if ($line =~ m/HUMAN/) {
        ($humangi)   = ($line =~ m/^\S+\|(\d+)/);
        ($accession) = ($line =~ m/^\S+\|\d+\|\w+\|(\S{6}?)/);
    }
    if ($line =~ m/^\S+\|(\d+)/) {
        if (defined($1)) {
            $gi = $1;
        }
    }
    else {
        $info{$gi} = $line;
    }
}
#print Dumper (\%info);
print "$humangi\n";
print "$accession\n";
close(INFILE);

my $data2 = '/DATA/variantlist.txt';
open INFILE2, '<', $data2 or die "Failed at opening $data2!\n";

my $data3 = '/DATA/VariantOutput.txt';
open OUTFILE, '>', $data3 or die "Failed at opening $data3!\n";

print OUTFILE "This is [GI: $humangi] and [Accession: $accession]\nVARIANT\t\tPOTENTIAL\t\tPD\n";

while ( <INFILE2> ) {
    # Grab a variant from the file (in this example: P82L)
    my $line2 = $_;
    chomp($line2);
    my $Variant = $line2;

    # Split the variant into three parts
    my ($source, $position, $sink) = split(/(\d+)(\w)/, $Variant);
    #print "$source\t$position\t$sink\n";

    # Check whether HS has the source (i.e., P) at the given position (i.e., 82)
    my $temp = $info{$humangi};
    #print "Temp contains $temp" . "\n";
    my @char = split //, $temp;
    #print "Now \@char contains: @char";
    #print "Inside the temp: $char[0] and $char[1]\n";
    my $target = $char[$position-1];
    #print "This is the target: $target" . "\n";
    if ( $target eq $source ) {
        print "Yep!\n";
    }

    my @VariantList = ();
    my @PDList = ();

    # Scan the rest of the sequences to check what amino acid they have at the given position
    for my $gi ( keys %info ) {
        my $value = $info{$gi};
        my @char2 = split //, $value;
        my $potential = $char2[$position-1];
        push (@VariantList, $potential);
        if ($potential eq $sink) { # Note the cases where we observe the sink (i.e., L) at this position
            my $pd = "$potential" . "{" . "$gi" . "}";
            push (@PDList, $pd)
            #print "A pathogenic deviation has been found at site $position - from $source to $sink !\n" . " And the corresponding gi for this deviation is: $gi\n";
        }
    }
    print OUTFILE "$Variant\t\t@VariantList\t\t@PDList\n";
}
close(INFILE2);

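One spot that looks like a candidate for tightening is the per-character split done for every sequence and every variant. The following is only a minimal sketch of an alternative, assuming every variant has the form letter, digits, letter (e.g. P82L). The helper names parse_variant and residue_at are hypothetical and not part of the script above, and the toy sequence is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a variant such as "P82L" into
# (source, position, sink). Returns an empty list when the string
# does not match the assumed letter-digits-letter format.
sub parse_variant {
    my ($variant) = @_;
    return $variant =~ /^([A-Za-z])(\d+)([A-Za-z])$/;
}

# Hypothetical helper: residue at a 1-based position, read with substr
# instead of splitting the whole sequence into an array.
# Returns undef when the position lies outside the sequence.
sub residue_at {
    my ($sequence, $position) = @_;
    return if $position < 1 || $position > length $sequence;
    return substr($sequence, $position - 1, 1);
}

# Toy data (assumption): a three-residue sequence and a variant at site 2.
my ($source, $position, $sink) = parse_variant('A2K');
my $residue = residue_at('MAK', $position);
print "Source matches\n" if defined $residue && $residue eq $source;
```

Checking that residue_at returned a defined value before comparing with eq would also guard against the "uninitialized value in string eq" warning when a sequence is shorter than the variant position.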

In reply to Re^4: Use of uninitialized value in string eq by sophix
in thread Use of uninitialized value in string eq by sophix
