
This is part of a project I am working on. Here is the main idea: I am running genomes through a BLAST program and then doing a reciprocal BLAST on the same genomes. So if I BLAST Genome A against Genome B, I also BLAST Genome B against Genome A. Basically, I am checking whether the Genome A entry in the first file matches the Genome A entry in the second file. Example: A.B.blast and B.A.blast, checking whether the A's are identical. If they are identical, both file names get recorded in a text file.

I got the script running, but it keeps stalling on a single file inside the two nested while loops, and I have to do this with about 50 other files. It has been really tricky to get this going, when all that really needs to happen is a simple comparison. Any advice would be greatly appreciated. If there is a simpler way of doing this, I am all for it. I am limited to using only strict and warnings, so I can't use any other modules. I am using Perl 5.

Update: here are sample inputs from the files.

Genome A to Genome B contents include:

gi|110123922|gb|EC817325.1|EC817325 gi|110095377|gb|EC788780.1|EC788780

gi|110123921|gb|EC817324.1|EC817324 gi|110105430|gb|EC798833.1|EC798833 6

gi|110123920|gb|EC817323.1|EC817323 gi|110106464|gb|EC799867.1|EC799867

In this file, the first gi number on each line represents Genome A, and the second gi number represents Genome B.

Genome B to Genome A contents include:

gi|110104773|gb|EC798176.1|EC798176 gi|110119622|gb|EC813025.1|EC813025

In this file the first gi number represents Genome B, and the second gi number represents Genome A.
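A minimal sketch of pulling the two gi numbers out of one line of either file, assuming the query and subject IDs are the first two whitespace-separated fields and any trailing fields can be ignored. The helper name `gi_pair` is hypothetical and not part of the script below.

```perl
use strict;
use warnings;

# Hypothetical helper: return the two gi numbers found on one line.
# Assumes the first two whitespace-separated fields both start with
# "gi|<digits>"; returns an empty list for lines that do not fit.
sub gi_pair {
    my ($line) = @_;
    my ($query, $subject) = split ' ', $line;
    return unless defined $query && defined $subject;
    my ($q_gi) = $query   =~ /^gi\|(\d+)/;
    my ($s_gi) = $subject =~ /^gi\|(\d+)/;
    return unless defined $q_gi && defined $s_gi;
    return ($q_gi, $s_gi);
}

# Example using one of the sample lines shown above.
my ($q, $s) = gi_pair("gi|110123921|gb|EC817324.1|EC817324 gi|110105430|gb|EC798833.1|EC798833 6");
print "$q\t$s\n";
```

Splitting on whitespace first keeps the pattern short, and the extra field at the end of some lines is simply dropped.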

This is where it really gets confusing:

#!/usr/bin/env perl
use strict;
use warnings;

my @a;
my $n = "\n";
@a = glob("*.Recip.blast.top");
my $t = "\t";
foreach my $a (@a){
#  print $a."\n";
  my @b = split(/[.]/,$a);
  #print @b.$n;
  #print $b[0].$n.$b[1].$n.$b[2].$n.$b[3].$n.$b[4].$n.$b[5].$n.$b[6].$n.$b[7];
  my $OrginalBlast;
  $OrginalBlast = $b[0].".".$b[1].".".$b[2].".".$b[3].".".$b[4].".".$b[5].".".$b[6].".".$b[7];
#  print $OrginalBlast;
#  print $n;

#####ORGN is = AB|||||||


  open(ORGN,"/home/ajl12013/Labwork/Dbfiles/results/$OrginalBlast") || die;

######RECI is = BA|||||||

  open(RECI,$a) || die;
  open(OUTM,">MatchingGI.txt" ) || die;
  open(OUTNoM,">NoMatchingGI.txt") || die;
  my $ORGN = <ORGN>;
  #The scalar format works here, but for some reason, the <RECI> doesn't print, this makes debugging the whole process very difficult and tedious.
  my $RECI = <RECI>;
  #print <ORGN>;
  #print "##############################################################################################";
  #print $RECI;
  #print <RECI>;
  my %GenomeTable;
my $Genome1A ;
  my $G1A;
  my $data;
  my $G2A;
  my $Genome2A;
  my $l;
  local $/;
  my $key;
$GenomeTable{$OrginalBlast} = $a ;

#print keys %GenomeTable;
#print $n;
#print values %GenomeTable;
#print $n;
#### First while statement for the AB file.
$GenomeTable{$n} = $n;

while(my $G1A = <ORGN>){
  # my @G1A = $G1A =~ m/^[gi]\d[|]$/g;
  #print $G1A;
  #print $G1A[0];
  #foreach my $q (@G1A){
  #  print $q
  #chomp($G1A);
  if ($G1A =~ m/^[gi|]\w/){
  ($Genome1A) = $G1A =~ m/^gi\|\d+/g;
  $GenomeTable{$Genome1A}  = [];
  $GenomeTable{$n};
  #print "These are the keys: ";
  #print  keys %GenomeTable ;
  #print $n;
#my $GenomeAB_ref;
#$GenomeAB_ref->$GenomeTable{$Genome1A};
#print $GenomeAB_ref;
  foreach my $keys (keys %GenomeTable){ print " Keys: $keys $n";}
}
 while($G2A = $RECI ){
    #print $G2A;
    #chomp($G2A);
    if($G2A =~ m/^[gi|]/){
      # print "Here";
       $G2A =~ m/^gi\|\w+\|\w+\|\w+\.\d\|\w+\s(gi\|\d+)/g;
      $Genome2A = $1;
       # ($l) = $G2A =~ m/^gi\|\d+/g;
     #$G2A =~ m/^gi\|\w+\|\w+\|\w+\.\d\|\w+\s(gi\|\d+)/g;
     #print $l;
     #$Genome2A = $1;
       #print $Genome2A;
    $GenomeTable{$Genome1A} = $Genome2A;
    #print "These are the valuesi: ";
    #print  values %GenomeTable   ;
    #print $n;
  foreach my $values (values %GenomeTable){print "Values: $values $n";}


  $Genome1A eq $Genome2A ? print OUTM $a.$n : print OUTNoM $a.$n;

  #exit;
}
#exit;
}
#exit;
}
}
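For comparison, here is a minimal sketch of the same check done with two hashes instead of nested while loops. It assumes a pair counts as reciprocal when the Genome B hit for a Genome A entry points back to that same Genome A entry in the B-to-A file. The sub names `load_hits` and `reciprocal_pairs` and the short gi numbers in the demo are hypothetical, made up for illustration. As far as I can tell, the stall in the script above comes from `while($G2A = $RECI)`, which assigns the same non-empty scalar on every pass and so never ends, and `local $/;` makes the later `<ORGN>` read return the rest of the file in one chunk rather than one line.

```perl
use strict;
use warnings;

# Read a top-hit file into a hash: query gi => subject gi.
# Assumes one hit per query; a later line for the same query
# overwrites an earlier one.
sub load_hits {
    my ($fh) = @_;
    my %hit;
    while (my $line = <$fh>) {
        next unless $line =~ /^gi\|(\d+)\S*\s+gi\|(\d+)/;
        $hit{$1} = $2;
    }
    return \%hit;
}

# Sort the A-to-B hits into reciprocal and non-reciprocal pairs.
sub reciprocal_pairs {
    my ($ab, $ba) = @_;
    my (@match, @nomatch);
    for my $a_gi (sort keys %$ab) {
        my $b_gi = $ab->{$a_gi};
        my $back = $ba->{$b_gi};    # which A entry does B point back to?
        if (defined $back && $back eq $a_gi) {
            push @match, "$a_gi\t$b_gi";
        }
        else {
            push @nomatch, "$a_gi\t$b_gi";
        }
    }
    return (\@match, \@nomatch);
}

# Demo on made-up data, using in-memory filehandles in place of the
# real A.B and B.A files.
my $ab_text = "gi|1|gb|AA1.1|AA1 gi|10|gb|BB1.1|BB1\n"
            . "gi|2|gb|AA2.1|AA2 gi|20|gb|BB2.1|BB2\n";
my $ba_text = "gi|10|gb|BB1.1|BB1 gi|1|gb|AA1.1|AA1\n"
            . "gi|20|gb|BB2.1|BB2 gi|3|gb|AA3.1|AA3\n";

open my $ab_fh, '<', \$ab_text or die "Cannot open A-to-B data: $!";
open my $ba_fh, '<', \$ba_text or die "Cannot open B-to-A data: $!";

my ($match, $nomatch) = reciprocal_pairs(load_hits($ab_fh), load_hits($ba_fh));
print "match:    $_\n" for @$match;
print "no match: $_\n" for @$nomatch;
```

Each file is read once and the comparison becomes a hash lookup, so it should scale to the 50 or so file pairs without rereading anything. For real use, the two in-memory handles would be replaced by `open` calls on the actual file pair, and the output files would be opened once, before looping over the files, so they are not truncated on every pass.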

In reply to How to do a reciprocal matching statement by ajl412860
