monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have the following code that does multiple sequence alignment using T-Coffee/Clustalw via Bioperl. You can ignore the code *except* the last line. That is where the object header is produced; it basically prints the result to STDOUT.
#!/usr/bin/perl -w
use strict;
use Bio::AlignIO;
use Bio::Seq;    # needed for Bio::Seq->new below

BEGIN {
    $ENV{TCOFFEEDIR} = '/home/edward/MyBioTool/T-COFFEE_Ver1.37/bin/';
}
use Bio::Tools::Run::Alignment::TCoffee;

my @params  = ( '-outfile' => 'tcof.out', '-maxlen' => '100' );
my $factory = Bio::Tools::Run::Alignment::TCoffee->new(@params);

my @array = qw(
    GGGTGTTATTCAAGCAAAAAAA
    TTTGGAAGTCAATATTTTGTCG
    CCTTTTATCTGTTTTGACAGTC
    ACTGAAAAGCTTAGGAAATGGT
    TATTTGCAGTGATGTAATCAGC
);

my @seqs2;       # declared so the script compiles under 'use strict'
foreach ( 0 .. $#array ) {
    push @seqs2, Bio::Seq->new( -seq => $array[$_], -display_id => 'seq_' . $_ );
}

my $seq_array_ref = \@seqs2;
my $aln2 = $factory->align($seq_array_ref);    # final line that prints to STDOUT
__END__
The result looks like this:

....some text, followed by.....
T-COFFEE, Version_1.37 (Wed Jul 11 14:38:06 PDT 2001)
Notredame, Higgins, Heringa, JMB(302)pp205-217,2000
CPU 0 sec
SCORE 524    **Desired text to be captured
NSEQ 5
LEN 41
seq_1 ---TTTGGAAGTCAATATTTT------GTCG----------
seq_2 CCTTTTATCTG------TTTTGACA--GTC-----------
seq_3 -------ACTGAAAAGCTTAGGAAATGGT------------
seq_0 -------G-GG---------TGTTAT--TCAAGCAAAAAAA
seq_4 --TATTTGCAG---------TGATGTAATC-AGC-------
* *
Update
My questions are:
1. How can I *capture* this object header, so that I can parse the text it prints? I can't seem to find any existing Bioperl AlignIO method for this. Basically I just want to capture the line that begins with SCORE and take its value. Storing it in a scalar would do.
2. How can I suppress align() so that it won't *print*, but only stores its result in the object reference, so I can manipulate it?

Regards,
Edward

Replies are listed 'Best First'.
Re: Parsing Text from Object Header that prints to STDOUT
by davidrw (Prior) on Apr 20, 2005 at 17:25 UTC
Re: Parsing Text from Object Header that prints to STDOUT
by tlm (Prior) on Apr 20, 2005 at 17:12 UTC

    Does it explicitly print to STDOUT, or just to the currently selected handle?

    If the latter, and you have v5.8.x, you could do something like this (untested):

    my $buf;
    open( my $fh, '>', \$buf ) or die "Write to buffer failed\n";
    my $hold = select( $fh );
    my $aln2 = $factory->align($seq_array_ref);
    select $hold;

    # find desired line in $buf
    my ($score) = $buf =~ /SCORE (\d+)/;

    the lowliest monk

      It seems that it does *explicitly* print to STDOUT. I tried your snippet in my code, plus:
      print "SCORE I WANT = $score\n";
      It still prints all of that output, plus: SCORE I WANT = <blank>
      Regards,
      Edward


      PS. How did you change your name? I thought it was never possible ;-)

        I like davidrw's idea, though I have never used it myself. This also works, if you have v5.8.x:

        my ( $aln2, $buf );
        {
            local *STDOUT;
            open( STDOUT, '>', \$buf ) or die "Write to buffer failed\n";
            $aln2 = $factory->align($seq_array_ref);
        }

        # find desired line in $buf
        my ($score) = $buf =~ /SCORE (\d+)/;

        the lowliest monk

Re: Parsing Text from Object Header that prints to STDOUT
by PreferredUserName (Pilgrim) on Apr 20, 2005 at 17:06 UTC
    If that align() method doesn't have some hook to just get the string back, you could use a pipe, like so:
    ./the_script_you_posted | perl -lne 'print $1 if /^SCORE\s+(\S+)/'
      Thanks for your reply.
      But:
      1. AFAIK the align() method doesn't have a hook to get the string back.
      2. I need a variable for "score" inside the code, so I can use it for other purposes.

      Regards,
      Edward
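One way to reconcile the pipe suggestion with the need for a variable inside the program is to open the pipe from Perl itself rather than from the shell. The following is only a sketch under assumptions: `score_from_command` is a hypothetical helper, and a Perl one-liner stands in for `./the_script_you_posted`. It uses the list form of a piped open, which needs Perl 5.8 or later on a Unix-like system.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return the first SCORE value
# found in its standard output (undef if there is none).
sub score_from_command {
    my @cmd = @_;
    open( my $pipe, '-|', @cmd ) or die "Cannot run @cmd: $!";
    my $score;
    while ( my $line = <$pipe> ) {
        # Keep reading to the end so the child never writes to a closed pipe.
        next if defined $score;
        $score = $1 if $line =~ /^SCORE\s+(\S+)/;
    }
    close $pipe;
    return $score;
}

# Stand-in for "./the_script_you_posted": prints a few header-like lines.
my @cmd   = ( $^X, '-e', 'print "CPU 0 sec\nSCORE 524\nNSEQ 5\n"' );
my $score = score_from_command(@cmd);
print "score=$score\n";
```

The catch is the same as with any separate process: only the text comes back, so the alignment object itself would still have to be rebuilt or reread in the parent.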
Re: Parsing Text from Object Header that prints to STDOUT
by stajich (Chaplain) on Apr 27, 2005 at 11:37 UTC
    I thought we addressed this on the BP list?

    You can also pass a -QUIET flag to TCoffee when you run it to suppress printing to the screen. Read the whole module documentation at some point to get a sense of what all you can do.

    The score field would need to be captured in the module that parses the alignment output, Bio::AlignIO::clustalw. As I believe I already wrote to you, you will need to modify the module to parse the SCORE out of the header. It is pretty simple and you can store the score in the Bio::SimpleAlign object that is created. You just need to modify the regexp where the CLUSTAL header is parsed and grab the SCORE out.
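As a rough illustration of the regexp part only, here is a minimal sketch that pulls the SCORE value out of header text shaped like the output in the original post. `extract_score` is a hypothetical helper, not something that exists in Bio::AlignIO::clustalw, and how the value would be stored on the Bio::SimpleAlign object is left open.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the number on the
# SCORE line of a T-Coffee style header, or undef if there is no such line.
sub extract_score {
    my ($header) = @_;
    # /m lets ^ match at the start of any line in the header text.
    return $header =~ /^SCORE\s+(\d+)/m ? $1 : undef;
}

my $header = <<'END';
T-COFFEE, Version_1.37 (Wed Jul 11 14:38:06 PDT 2001)
Notredame, Higgins, Heringa, JMB(302)pp205-217,2000
CPU 0 sec
SCORE 524
NSEQ 5
LEN 41
END

my $score = extract_score($header);
print defined $score ? "score=$score\n" : "no SCORE line found\n";
```

Anchoring the pattern at the start of a line keeps it from matching the word SCORE if it happens to appear elsewhere in the output.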

Re: Parsing Text from Object Header that prints to STDOUT
by salva (Canon) on Apr 21, 2005 at 10:45 UTC
    if everything else fails, you can use a forking open call (see open docs on perlfunc):
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Carp;

    my $pid = open my $pipe, '-|';
    defined $pid or croak "unable to fork";

    if ($pid==0) {
        # child process
        print "SCORE=134";
        # my $aln2 = $factory->align($seq_array_ref);
        exit(0);
    }
    else {
        # parent process
        my $score;
        while(<$pipe>) {
            if (/SCORE=(\d+)/) {
                $score=$1;
                last;
            }
        }
        print "score=$score\n";
    }
    be aware that forking can cause problems if you have open connections to databases or some other resources.

    also, $aln2 will only have a value assigned on the child process.

Why redirecting to scalar doesn't work (Re: Parsing Text from Object Header that prints to STDOUT)
by salva (Canon) on Apr 21, 2005 at 14:27 UTC
    redirecting STDOUT to a scalar doesn't work (the OP just told me). I think it's because the output is generated from a C function that writes to stdout.

    When STDOUT is redirected in a perl script, the perl interpreter changes C/Unix stdout accordingly, but a scalar has no associated file descriptor so in that case, stdout just remains unchanged.

    On the other hand, redirecting STDOUT to a real file should work!
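A minimal sketch of that file-based redirection, assuming a Unix-like system: `capture_stdout_via_file` is a hypothetical helper, and a child Perl process run through `system` stands in for code that writes straight to file descriptor 1 (as the T-Coffee run appears to do). The original STDOUT is duplicated first so that it can be restored afterwards.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: run $code with STDOUT pointed at a temporary file,
# restore STDOUT, and return whatever was written to the file.
sub capture_stdout_via_file {
    my ($code) = @_;
    my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
    close $tmp_fh;

    STDOUT->flush;    # do not let already buffered output land in the file
    open( my $saved, '>&', \*STDOUT ) or die "Cannot dup STDOUT: $!";
    open( STDOUT, '>', $tmp_name )    or die "Cannot redirect STDOUT: $!";

    eval { $code->() };
    my $err = $@;

    close STDOUT;
    open( STDOUT, '>&', $saved ) or die "Cannot restore STDOUT: $!";
    die $err if $err;

    open( my $in, '<', $tmp_name ) or die "Cannot read $tmp_name: $!";
    local $/;         # slurp the whole file
    my $text = <$in>;
    close $in;
    return defined $text ? $text : '';
}

# Stand-in for $factory->align(...): a child process writing to stdout.
my $out = capture_stdout_via_file( sub {
    system( $^X, '-e', 'print "CPU 0 sec\nSCORE 524\nNSEQ 5\n"' ) == 0
        or die "child failed: $?";
} );

my ($score) = $out =~ /^SCORE\s+(\d+)/m;
print "score=$score\n";
```

Unlike the forking open, this runs everything in one process, so a value such as $aln2 assigned inside the code reference would remain available afterwards.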