Nicpetbio23! has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to do a tblastn search within a perl script.

 # one genome one query unix command
 my ($failcheck) = system(“tblastn -query -db -genome -out output.blast.tab”);
 die(“Error: could not execute system command\n”) if ($failcheck != 0);

My main question is how do I change my unix command so that I can go through a list of query and genome sequences and return the tblastn results for each?

Eventually I plan to look for sequences that I expect to find in the protein searches, which are similar to those found in close relatives of the genome I am searching. Although this may not be relevant to the question above.

Replies are listed 'Best First'.
Re: tblastn unix
by 1nickt (Canon) on Jun 27, 2017 at 20:09 UTC

    My main question is how do I change my unix command so that I can go through a list of query and genome sequences and return the tblastn results for each?

    You don't (assuming you are asking here because you want to learn how to use Perl). You call it in a loop.

    See for and while, foreach, perlsyn.

    Hope this helps!


    The way forward always starts with a minimal test.
Re: tblastn unix
by thanos1983 (Parson) on Jun 27, 2017 at 22:49 UTC

    Hello Nicpetbio23!,

    I would like to add a few points to your question My main question is how do I change my unix command so that I can go through a list of query and genome sequences and return the tblastn results for each?. I would start by saying do not use system, why? Simply written in the documentation, I quote:

    This is not what you want to use to capture the output from a command +; for that you should use merely backticks or qx//, as described in ` +STRING` in perlop.

    So back to your question, I assume that you want to capture the output of your command through a loop as the fellow monk 1nickt replied to your question already.

    So in conclusion, I spend a small amount of time to install the blast on my linuxOS following this great tutorial Bioinformatics: introduction to using BLAST with Ubuntu.

    After completing the installation and creating the db for testing purposes, I created a small script to demonstrate how to use backticks with tblastn.

    #!usr/bin/perl use say; use strict; use warnings; my $dir_before = `ls -la`; chomp $dir_before; say $dir_before; # execute your linux command with backticks `blastn -query example.fa -db /home/tinyos/blast_db/refseq_rna.00 -out + results.txt`; my $dir_after = `ls -la`; chomp $dir_after; say $dir_after; __END__ $ perl bio.pl total 20 drwxr-xr-x 2 tinyos tinyos 4096 Jun 28 00:30 . drwxr-xr-x 7 tinyos tinyos 4096 Jun 28 00:19 .. -rw-r--r-- 1 tinyos tinyos 279 Jun 28 00:30 bio.pl -rw-r--r-- 1 tinyos tinyos 0 Jun 28 00:21 bio.pl~ -rw-r--r-- 1 tinyos tinyos 459 Jun 28 00:04 example.fa -rw-r--r-- 1 tinyos tinyos 458 Jun 28 00:04 example.fa~ total 32 drwxr-xr-x 2 tinyos tinyos 4096 Jun 28 00:30 . drwxr-xr-x 7 tinyos tinyos 4096 Jun 28 00:19 .. -rw-r--r-- 1 tinyos tinyos 279 Jun 28 00:30 bio.pl -rw-r--r-- 1 tinyos tinyos 0 Jun 28 00:21 bio.pl~ -rw-r--r-- 1 tinyos tinyos 459 Jun 28 00:04 example.fa -rw-r--r-- 1 tinyos tinyos 458 Jun 28 00:04 example.fa~ -rw-r--r-- 1 tinyos tinyos 10739 Jun 28 00:30 results.txt

    Since the blastn does not return anything I check my directory again. I assume your command is returning something? If this is the case then backticks, if not create a loop as the fellow monk 1nickt pointed out and use system every time to check if your command was correct. Then find a way to open the output and do something with the file. So voila you are done :D

    Hope this helps, BR

    Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: tblastn unix
by kevbot (Vicar) on Jun 28, 2017 at 07:12 UTC
    Hello Nicpetbio23!,

    The following example uses a couple CPAN modules (Path::Tiny, and Capture::Tiny). The syntax of your tblast command does not seem to be quite right. Detailed descriptions for tblastn command line arguments can be obtained with tblastn -help. I modified your tblastn command based on my reading of the help output.

    This code assumes that there are fasta query files in the current directory with .fa file extensions. For each fasta file found, an output file will be generated (with a .txt extension).

    NOTE: I did not test this code thoroughly. I commented out the call to system to be sure that the intended $cmd string is generated, but I did not run the tblastn commands.

    #!/usr/bin/env perl use strict; use warnings; use Path::Tiny; use Capture::Tiny ':all'; my @paths = path('./')->children( qr/\.fa$/ ); foreach my $file_path (@paths){ my $query_file = $file_path->stringify; my $output_file = path( $file_path->parent, $file_path->basename( '.fa' ).'_output.txt' )->stringify; my $cmd = "tblastn -query $query_file -db genome.db". " -out $output_file"; print "Query file: $file_path\n"; print "\t $cmd\n"; my ($stdout, $stderr, $exit) = capture { system( $cmd ); }; if ($exit != 0){ die "Error: command failed"; } } exit;
Re: tblastn unix
by afoken (Chancellor) on Jun 28, 2017 at 17:43 UTC
    system(“tblastn -query -db -genome -out output.blast.tab”);

    Try to avoid using single-argument system:

    With your current arguments, it may be harmless. But as soon as you add variable interpolation, you will also want to add quoting to prevent the default shell to mess with your arguments. And at that point, you simply can no longer win the game. There is no single default shell for all environments, but each OS and each OS version has a different default shell with different rules for parsing and quoting. Whatever you do, your code will break on some systems. On other system, it may run just accidentally. The same is true for `` and its longer form, qx(). If you want more details, see Ssh and qx, Improve pipe open?, and maybe Re^3: why IPC::Open3 can't execute MS-DOS "dir" command?.

    To fix your code, use a string argument for the program name and every single argument:

    system('tblastn','-query','-db','-genome','-out','output.blast.tab');

    (And by the way: stupid quotes (“”) don't work in perl, you have to use straight double or single quotes ("" or '').)

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: tblastn unix
by kevbot (Vicar) on Jun 28, 2017 at 07:28 UTC
    Hello again Nicpetbio23!,

    You might also want to take a look at Bio::Tools::Run::StandAloneBlastPlus, with the caveat that the author mentions the module is experimental. From the documentation:

    NOTE: This module requires BLAST+ v. 2.2.24+ and higher. Until the API stabilizes for BLAST+, consider this module highly experimental.