in reply to Re: Converting Filehandle to Array call in Object Creation
in thread Converting Filehandle to Array call in Object Creation

Your tied class doesn't add ">Seq1", ">Seq2", etc to the stream. From what I saw, leaving those out causes the whole array to be treated as a single sequence (namely "AAAAAAAAAAAAAAAAAAAAAAGGAAACCA").

Re^3: Converting Filehandle to Array call in Object Creation
by holli (Abbot) on Apr 15, 2005 at 20:08 UTC
    I must admit I don't get what you're saying.
    use strict;
    use warnings;
    use Tie::Handle::FromArray;

    my $fh = Tie::Handle::FromArray->new( ["a", "b"] );
    while (<$fh>) {
        print "$_*";
    }
    prints
    a*b*
    just as I would expect. Can you clarify?


    holli, /regexed monk/

      The handle used by Bio::AlignIO (when 'fasta' is specified) must return ">Seq1\n", "AAAAAAAAAAAAAAA\n", ">Seq2\n", "AAAAAAAGGAAACCA\n" on successive line reads, but yours returns only "AAAAAAAAAAAAAAA" and "AAAAAAAGGAAACCA".

      I suppose you could change:

      my $fh = Tie::Handle::FromArray->new(\@array);

      to

      my $fh = Tie::Handle::FromArray->new([ map { ">Seq".$seq_id++."\n", "$_\n" } @array ]);

      but it would be nice if the tied class did that for you instead of creating a new array twice the size of the original one.
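
To make the expansion concrete, here is a minimal sketch of what the expanded array holds. It assumes `$seq_id` starts at 1 (the snippet above leaves its initial value unstated), and the two sequences are just the ones quoted in this thread. It does not use the tied class or Bio::AlignIO.

```perl
use strict;
use warnings;

# Input as in this thread: bare sequences without FASTA description lines.
my @array = ("AAAAAAAAAAAAAAA", "AAAAAAAGGAAACCA");

# Assumption: start the counter at 1 so the headers read >Seq1, >Seq2, ...
my $seq_id = 1;

# Each element becomes two lines: a ">SeqN" header and the sequence itself.
# The inner parentheses make sure perl parses the braces as a block.
my @fasta_lines = map { ( ">Seq" . $seq_id++ . "\n", "$_\n" ) } @array;

print @fasta_lines;
```

The result has twice as many elements as the input, which is the cost mentioned above.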

        Patch: use the last element of the array for a counter.
        sub TIEHANDLE($;$) {
            my $pkg = shift;
            my $ref = shift || [];
            push @{$ref}, 0;
            bless( $ref, $pkg );
        }

        sub READLINE {
            my $ref = shift;
            return @{$ref} > 1
                ? ">Seq" . ++$ref->[-1] . "\n" . shift( @{$ref} ) . "\n"
                : undef;
        }
        so
        use Tie::Handle::FromArray;

        my $fh = Tie::Handle::FromArray->new( ["AAAAAAAAAAAAAAA", "AAAAAAAGGAAACCA"] );
        while (<$fh>) {
            print;
        }
        prints
        >Seq1
        AAAAAAAAAAAAAAA
        >Seq2
        AAAAAAAGGAAACCA


        holli, /regexed monk/

      What ikegami is referring to, I believe, is that the method is expecting a file in FASTA format, which means that sequences are always preceded by a "description line" having the format

      >foo bar baz and whatever else
      This line holds identifying info for the sequence that follows it. Therefore, your tied handle class would have to emit some surrogate for this line before each array element (e.g. at minimum ">\n", or something more informative, such as ">$n\n", where $n is a class variable holding the current position in the input array).

      Another complication is that FASTA lines are conventionally kept to at most 80 characters, so a long sequence has to be wrapped across several lines. Your READLINE method would therefore have to do a bit more processing and bookkeeping than simply spitting out the next array element. No biggie.
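
The bookkeeping described above could look roughly like the sketch below. The class name `FastaFromArray`, the `>SeqN` header scheme and the `width` parameter are all assumptions made for illustration; this is not the Tie::Handle::FromArray from this thread, and it has not been tried against Bio::AlignIO. It returns one line per read, wraps each sequence at 80 characters, and leaves the input array untouched.

```perl
use strict;
use warnings;

package FastaFromArray;    # hypothetical name, for illustration only

sub TIEHANDLE {
    my ($class, $seqs, $width) = @_;
    my $self = {
        seqs    => $seqs || [],     # the caller's array, never modified
        index   => 0,               # position of the next sequence to emit
        pending => [],              # lines queued for the current sequence
        width   => $width || 80,    # maximum characters per sequence line
    };
    return bless $self, $class;
}

# Queue the header and the wrapped lines for the next sequence.
# Returns false once the input array is exhausted.
sub _refill {
    my $self = shift;
    return 0 if $self->{index} >= @{ $self->{seqs} };
    my $seq    = $self->{seqs}[ $self->{index}++ ];
    my $width  = $self->{width};
    my @chunks = $seq =~ /(.{1,$width})/g;
    push @{ $self->{pending} }, ">Seq$self->{index}\n", map { "$_\n" } @chunks;
    return 1;
}

sub READLINE {
    my $self = shift;
    if (wantarray) {
        # List context: queue everything that is left and return it all.
        1 while $self->_refill;
        return splice @{ $self->{pending} };
    }
    $self->_refill unless @{ $self->{pending} };
    return shift @{ $self->{pending} };    # undef signals end of input
}

package main;

use Symbol;    # core module; gensym gives us an anonymous glob to tie

my $fh = gensym;
tie *$fh, 'FastaFromArray', [ "A" x 100, "AAAAAAAGGAAACCA" ];

my @lines;
while ( defined( my $line = <$fh> ) ) {
    push @lines, $line;
}
print @lines;
```

Here the 100-character sequence comes out as a header followed by an 80-character line and a 20-character line, and the short sequence as a header followed by a single line. Keeping the counter and the queue in a hash avoids both doubling the array and storing the counter inside the caller's data.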

      Alternatively, the module in question may be able to handle some other file format that is more easily mimicked with a tied array than FASTA is.

      the lowliest monk