in reply to Re^2: Could not open
in thread Could not open
This is probably because your perl script is creating the data file that is being used as input in the second and consecutive runs of fasta34, and that data file is being formatted incorrectly.
In order to debug this, try modifying your script so that it does the first fast34 run, processes the output of that run, writes the input file for the next run, and then exits.
Run that version of the script, then compare the original fasta input file to your script's output file (the one that would be used as input for the next fast34 run). If there are differences in how the data are formatted, you need to figure out why your script is producing a different format, and fix that.
Try manually running fasta34 at the command line, using your script's output as the fasta input, and see what happens. Maybe it will give you a useful error message.
If your initial fasta output file is okay, and you still have trouble doing the second and subsequent fasta runs, then the problem is probably the flow-control in the later part of your script -- the looping and/or conditional statements that guide the later fasta34 runs might not be doing what you think/want. For that, use the perl debugger (start the script using "perl -d script_name arg1 arg2 arg3") and step through that part of the logic. If you haven't used the perl debugger, it's easy to learn and quite valuable to use; see "perldoc perldebug".
you also mentioned about hash, can u brief me out how we can store such file in hash and then compare similar files>GTM1_RAT GLUTATHIONE S-TRANSFERASE YB1 (EC 2.5.1.18) ( (217 aa) WFAGDKVTYVDFLAYDILDQYHIFEPKCLDAFPNLKDFLARFEGLKKISAYMKSSRYLST PIFSKLAQWSNK >GTMU_CRILO GLUTATHIONE S-TRANSFERASE Y1 (EC 2.5.1.18) (217 aa) FAGDKVTLCGFLAYDVLDQYQMFEPKCLDPFPNLKDFLARFEGLKKISAYMKTSRFLRRP IFSKMAQWSNK >GTM1_HUMAN GLUTATHIONE S-TRANSFERASE MU 1 (EC 2.5.1.18 (217 aa) LPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEG LEKISAYMKSSRFLPRPVFSKMAVWGNK >GLNA_ANASP GLUTAMINE SYNTHETASE (EC 6.3.1.2) (GLUTAMAT (473 aa) SLELALEALENDHAFLTDTGVFTEDFIQNWIDYKLANEVKQMQLRPH-PYEFSIYYDV >GTM4_HUMAN GLUTATHIONE S-TRANSFERASE MU 4 (EC 2.5.1.18 (218 aa) LPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPNLKDFISRFEG LEKISAYMKSSRFLPKPLYTRVAVWGNK
If this represents a valid data file for fasta34, then I would guess that the file format could be described as follows:
Here's an example of a script that would test my assumptions about the data format; it shows how to use a hash to keep track of records as they are read in; if your store this script in a file called "test-format.pl", and have a set of fasta data files named "*.fasta", you can run this script with the command line "perl test-format.pl *.fasta":
It prints warning messages to STDERR if my assumptions turn out to be wrong for a given set of input. Try it out in order to learn more about your data.#!/usr/bin/perl # Check fasta file format, report cases that don't meet expectations use strict; use warnings; my %records; $/ = ''; # special Perl rule: setting $/ to empty string # means: treat each sequence of one or more blank lines # as a record separator while (<>) { my @lines = split /[\r\n]+/; # split on line-breaks my $key = shift @lines; # first line should be the identifier if ( $key !~ /^>/ ) { warn "record $. does not start with '>':\n$_\n"; } my $data = ''; for my $line ( @lines ) { warn "record $. contains odd data:\n$_\n" unless ( $line =~ /^ + [-A-Z]+\s*$/ ); $data .= $line."\n"; # (remember to put linefeeds back) } if ( exists( $records{$key} )) # have we seen this identifyier b +efore? { warn "key $key found on different data:\n$data\n\n$records{$ke +y}\n\n" if ( $records{$key} ne $data ); } else # we haven't seen this value of $key before { $records{$key} = $data; } } printf STDERR "%d records checked\n", scalar keys %records; # just for fun, let's try to output some data: for my $chosen ( grep /^>GTM1_/, keys %records ) { print join "\n", $chosen, $records{$chosen}, ''; }
You can also use redirection on the command line to save the sample output data:
Then see if test.out works as input to fasta34; if it doesn't, then I expect the sample file data you provided won't work either, and you'll need to figure out why.perl test-format.pl *.fasta > test.out
|
|---|