FarTech has asked for the wisdom of the Perl Monks concerning the following question:
In the following code,i call fasta34.exe which ask u to enter some values such as sequence file name, library file name etc(all this file should be stored in perl folder).It generates a output file with this input.
Parse the output file and get the sequence which satisfies the threshold value.
i m trying to store each sequences in an array with one index value, since the sequences contains multiple lines it stores only the last line of the sequences.
my main problem here is : to store each sequence in a temporary file and give it as a input to the fasta34.exe by recursively calling it and automate all the input values to fasta34.exe, since each time the same values are entered
can anybody help me in modifying the code.
Thanks Farid
********************************#!usr/bin/perl -w print"\n********Running Fasta34********** \n \n"; print"\n Please enter the file sequence and library present in your pa +th \n\n"; $result = system("c:/perl/sam/fasta34.exe"); print"\n\n*****enter the output file name you have given***\n"; $output = <STDIN>; open(FASTA,"c:/perl/sam/$output") or die "cant open the output file \n +"; #Make this a function #Function1 #open(FASTA,"c:/perl/sam/$output"); #@ tempseq = <FASTA>; #$flag=0; my $var; my $name; my $match; my $seqname; my @input; my $iteration; my $x = 0; print "Enter your cut off percentage"; $cut = <STDIN>; while (<FASTA>) { #compare the line of matching sequence if($name = m/>>(.{4,6})(.*)/g) { $seqname = $1; $laterhalf = $2; } #print "SKIP A LINE iF A MATCH \n"; next if /^ini/; if ($per = m/(\d+\.\d+)% identity/g) { $per = $1; #check if match is above the cutoff percentage if ($per > $cut) { print "\n\n$seqname$laterhalf \n";\ print "identity match $per % \n"; #store the first line of the input file $input[$x] = "$seqname$laterhalf"; while(<FASTA>) { if ($match = m/^($seqname)(.*)/) { #store the match sequence print "$2 \n"; $input[$x] = $2; } if(m/>>/g) { print @input[$x]; #open(OUT2IN,">c:/perl/sam/output1"); #print OUT2IN @input; #$iteration = system("OUT2IN"); $x = $x + 1; seek(FASTA,-100,1); last; } } } } } close (FASTA); print $result;
sample fasta output file
**********************************
********************************# c:\perl\sam\fasta34.exe FASTA searches a protein or DNA sequence data bank version 3.4t21 May 14, 2003 Please cite: W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448 hahu.aa, 141 aa vs mu.lib library 1776 residues in 7 sequences Altschul/Gish params: n0: 141 Lambda: 0.158 K: 0.019 H: 0.100 FASTA (3.45an0 Mar 2002) function [optimized, BL50 matrix (15:-5)] ktu +p: 2 join: 36, opt: 24, open/ext: -10/-2, width: 16 The best scores are: opt bits E( +7) GTM1_MOUSE GLUTATHIONE S-TRANSFERASE GT8.7 (EC 2. ( 217) 51 18.7 + 0.48 GTM1_RAT GLUTATHIONE S-TRANSFERASE YB1 (EC 2.5.1. ( 217) 42 16.7 + 1.8 GTMU_CRILO GLUTATHIONE S-TRANSFERASE Y1 (EC 2.5.1 ( 217) 41 16.4 + 2 GTMU_RABIT GLUTATHIONE S-TRANSFERASE MU 1 (EC 2.5 ( 217) 37 15.5 + 3.4 GTM1_HUMAN GLUTATHIONE S-TRANSFERASE MU 1 (EC 2.5 ( 217) 35 15.1 + 4.1 GLNA_ANASP GLUTAMINE SYNTHETASE (EC 6.3.1.2) (GLU ( 473) 35 15.0 + 6.1 GTM4_HUMAN GLUTATHIONE S-TRANSFERASE MU 4 (EC 2.5 ( 218) 28 13.5 + 6.5 >>GTM1_MOUSE GLUTATHIONE S-TRANSFERASE GT8.7 (EC 2.5.1.1 (217 aa) initn: 40 init1: 40 opt: 51 Z-score: 66.1 bits: 18.7 E(): 0.48 Smith-Waterman score: 51; 25.641% identity (27.027% ungapped) in 39 a +a overlap (35-72:176-213) 10 20 30 40 50 60 HAHU ADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFD-LSHGSAQVKGHGKKVA .::. . .. .:. :.. :: .:. .. .: GTM1_M WFAGDKVTYVDFLAYDILDQYRMFEPKCLDAFPNLRDFLARFEGLKKISAYMKS-SRYIA 150 160 170 180 190 200 70 80 90 100 110 120 HAHU DALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHA . . .:: GTM1_M TPIFSKMAHWSNK 210 >>GTM1_RAT GLUTATHIONE S-TRANSFERASE YB1 (EC 2.5.1.18) ( (217 aa) initn: 36 init1: 36 opt: 42 Z-score: 55.0 bits: 16.7 E(): 1.8 Smith-Waterman score: 42; 34.783% identity (36.364% ungapped) in 23 a +a overlap (35-56:176-198) 10 20 30 40 50 60 HAHU ADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFD-LSHGSAQVKGHGKKVA .::. : .. .:. :.. :: .:. GTM1_R WFAGDKVTYVDFLAYDILDQYHIFEPKCLDAFPNLKDFLARFEGLKKISAYMKSSRYLST 150 160 170 180 190 200 70 80 90 100 110 120 HAHU DALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHA GTM1_R PIFSKLAQWSNK 210 >>GTMU_CRILO GLUTATHIONE S-TRANSFERASE Y1 (EC 2.5.1.18) (217 aa) initn: 36 init1: 36 opt: 41 Z-score: 53.8 bits: 16.4 E(): 2 Smith-Waterman score: 41; 36.364% identity (38.095% ungapped) in 22 a +a overlap (36-56:177-198) 10 20 30 40 50 60 HAHU DKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFD-LSHGSAQVKGHGKKVAD ::. : .. .:. :.. :: .: GTMU_C FAGDKVTLCGFLAYDVLDQYQMFEPKCLDPFPNLKDFLARFEGLKKISAYMKTSRFLRRP 150 160 170 180 190 200 70 80 90 100 110 120 HAHU ALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHAS GTMU_C IFSKMAQWSNK 210 >>GTMU_RABIT GLUTATHIONE S-TRANSFERASE MU 1 (EC 2.5.1.18 (217 aa) initn: 32 init1: 32 opt: 37 Z-score: 48.8 bits: 15.5 E(): 3.4 Smith-Waterman score: 44; 29.508% identity (33.962% ungapped) in 61 a +a overlap (31-91:17-69) 10 20 30 40 50 60 HAHU VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGK ::.: . : : . . . :.: ..: GTMU_R PMTLGYWDVRGLALPIRMLLEY--TDTSYEEKKYTMGDAPNYDQSK 10 20 30 40 70 80 90 100 110 120 HAHU KVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPA ... .: .. :.:: : : : .::: GTMU_R WLSEKFTLGL----DFPN-LPYLID-GTHKLTQSNAILRYLARKHGLCGETEEERIRVDI 50 60 70 80 90 130 140 HAHU VHASLDKFLASVSTVLTSKYR GTMU_R LENQLMDNRFQLVNVCYSPDFEKLKPEYLKGLPEKLQLYSQFLGSLPWFAGDKITFADFL 100 110 120 130 140 150 >>GTM1_HUMAN GLUTATHIONE S-TRANSFERASE MU 1 (EC 2.5.1.18 (217 aa) initn: 28 init1: 28 opt: 35 Z-score: 46.4 bits: 15.1 E(): 4.1 Smith-Waterman score: 41; 30.769% identity (36.364% ungapped) in 39 a +a overlap (24-56:160-198) 10 20 30 40 HAHU VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFL-----SFPTTKTYFPHFD- : . :.:.: .::. : .. .:. GTM1_H LPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEG 130 140 150 160 170 180 50 60 70 80 90 100 HAHU LSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLV : . :: .:. GTM1_H LEKISAYMKSSRFLPRPVFSKMAVWGNK 190 200 210 >>GLNA_ANASP GLUTAMINE SYNTHETASE (EC 6.3.1.2) (GLUTAMAT (473 aa) initn: 31 init1: 31 opt: 35 Z-score: 39.7 bits: 15.0 E(): 6.1 Smith-Waterman score: 38; 30.435% identity (31.818% ungapped) in 23 a +a overlap (96-118:447-468) 70 80 90 100 110 120 HAHU LTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASL ...:: .. . : : : ::. GLNA_A SLELALEALENDHAFLTDTGVFTEDFIQNWIDYKLANEVKQMQLRPH-PYEFSIYYDV 420 430 440 450 460 470 130 140 HAHU DKFLASVSTVLTSKYR >>GTM4_HUMAN GLUTATHIONE S-TRANSFERASE MU 4 (EC 2.5.1.18 (218 aa) initn: 28 init1: 28 opt: 28 Z-score: 37.7 bits: 13.5 E(): 6.5 Smith-Waterman score: 41; 30.769% identity (36.364% ungapped) in 39 a +a overlap (24-56:161-199) 10 20 30 40 HAHU VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFL-----SFPTTKTYFPHFD- : . :.:.: .::. : .. .:. GTM4_H LPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPNLKDFISRFEG 140 150 160 170 180 190 50 60 70 80 90 100 HAHU LSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLV : . :: .:. GTM4_H LEKISAYMKSSRFLPKPLYTRVAVWGNK 200 210 141 residues in 1 query sequences 1776 residues in 7 library sequences Scomplib [34t21] start: Tue Jan 18 16:24:00 2005 done: Tue Jan 18 16:24:33 2005 Scan time: 0.000 Display time: 14.000 Function used was FASTA [version 3.4t21 May 14, 2003]
output of the code ***********************************
*****enter the output file name you have given*** g.fasta Enter your cut off percentage30.00 GTM1_RAT GLUTATHIONE S-TRANSFERASE YB1 (EC 2.5.1.18) ( (217 aa) identity match 34.783 % WFAGDKVTYVDFLAYDILDQYHIFEPKCLDAFPNLKDFLARFEGLKKISAYMKSSRYLST PIFSKLAQWSNK PIFSKLAQWSNK GTMU_CRILO GLUTATHIONE S-TRANSFERASE Y1 (EC 2.5.1.18) (217 aa) identity match 36.364 % FAGDKVTLCGFLAYDVLDQYQMFEPKCLDPFPNLKDFLARFEGLKKISAYMKTSRFLRRP IFSKMAQWSNK IFSKMAQWSNK GTM1_HUMAN GLUTATHIONE S-TRANSFERASE MU 1 (EC 2.5.1.18 (217 aa) identity match 30.769 % LPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEG LEKISAYMKSSRFLPRPVFSKMAVWGNK LEKISAYMKSSRFLPRPVFSKMAVWGNK GLNA_ANASP GLUTAMINE SYNTHETASE (EC 6.3.1.2) (GLUTAMAT (473 aa) identity match 30.435 % SLELALEALENDHAFLTDTGVFTEDFIQNWIDYKLANEVKQMQLRPH-PYEFSIYYDV SLELALEALENDHAFLTDTGVFTEDFIQNWIDYKLANEVKQMQLRPH-PYEFSIYYDV GTM4_HUMAN GLUTATHIONE S-TRANSFERASE MU 4 (EC 2.5.1.18 (218 aa) identity match 30.769 % LPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPNLKDFISRFEG LEKISAYMKSSRFLPKPLYTRVAVWGNK 0
Edit by castaway - added all sorts of formatting tags
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Automating the input to exe file
by renz (Scribe) on Jan 26, 2005 at 19:22 UTC | |
|
Re: Automating the input to exe file
by ikegami (Patriarch) on Jan 26, 2005 at 20:34 UTC | |
|
Command line option + Fasta
by FarTech (Novice) on Jan 27, 2005 at 01:24 UTC | |
|
Re: Automating the input to exe file
by Solo (Deacon) on Jan 26, 2005 at 20:25 UTC | |
|
Re: Automating the input to exe file
by injunjoel (Priest) on Jan 26, 2005 at 20:40 UTC | |
by ikegami (Patriarch) on Jan 26, 2005 at 20:42 UTC |