in reply to Re^2: RFC:Hacking Tie::File to read complex data
in thread RFC:Hacking Tie::File to read complex data

I must admit, the lazy side of me did not want to do the mental hacking that your module change suggested when the required solution seemed so simple in the first place. I can imagine that you have some rather large datasets that, for performance reasons, you would rather access via iteration (Tie) than by sucking everything into memory. That gives a little better justification for fooling around with Tie.
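For comparison, here is a minimal sketch of what the stock Tie::File already offers: array-style access that fetches records on demand rather than slurping the file. The temporary file and the sample records are stand-ins for a real dataset, and nothing here reflects the modified module under discussion.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Write a few sample records to a temporary file (stand-in for real data).
my ($tmp_fh, $tmp_name) = tempfile( UNLINK => 1 );
print {$tmp_fh} "$_\n"
    for 'seq_1 1 33 gene', 'seq_1 1 20 exon', 'seq_2 1 80 gene';
close $tmp_fh or die "Could not close '$tmp_name': $!\n";

# Tie the file to an array: by default each element is one line,
# with the record separator already removed (autochomp).
tie my @records, 'Tie::File', $tmp_name
    or die "Could not tie '$tmp_name': $!\n";

my $n      = @records;                  # number of records in the file
my $random = $records[ rand @records ]; # one record picked at random

untie @records;

print "$n records\n";
print "$random\n";
```

Note that asking for the array's length makes Tie::File scan the whole file once to find the record boundaries, so the random pick is not free, but the records themselves are not all held in memory.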

Where a module encapsulates several steps, and your inexperienced user knows exactly what to expect from said module, by all means use it. The trick comes when you change the rules (i.e. change the module) that the inexperienced user knows. By adapting the module for the special formats of the bioinformatics world, aren't you requiring an extra level of understanding? Whereas by being self-sufficient and learning the basics of Perl, "the 'official' language of bioinformatics", isn't the inexperienced user better positioned to handle whatever data processing need arises, or at least to have the minimal foundation needed to glue in an appropriate Bio module from CPAN?

For getting a random line, while it may be wasteful of computer resources, it's certainly straightforward to simply do:

# To read from a file instead of the DATA section, use these three lines:
#open INFILE, '<', 'outfile.txt' or die "Could not open 'outfile.txt': $!\n";
#my @seq_info = <INFILE>;
#close INFILE;
my @seq_info = <DATA>;
# rand @seq_info covers every index; rand $#seq_info would never pick the last line
print $seq_info[int rand @seq_info];
__DATA__
seq_1 1 33 gene
seq_1 1 20 exon
seq_1 21 27 exon
seq_1 28 33 exon
seq_2 1 80 gene
seq_2 1 80 exon
seq_3 1 55 gene
seq_3 1 30 exon
seq_3 31 50 exon
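If the waste of resources matters, a random line can also be picked in a single pass without holding the file in memory. The sketch below uses the well-known one-pass idiom (keep line N with probability 1/N); the `pick_random_line` name is made up for illustration, and the in-memory filehandle merely stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: return one line chosen uniformly at random from a
# filehandle, reading it once and keeping only a single line in memory.
sub pick_random_line {
    my ($fh) = @_;
    my ($pick, $count);
    while ( my $line = <$fh> ) {
        $count++;
        # Replace the current pick with probability 1/$count.
        $pick = $line if rand($count) < 1;
    }
    return $pick;    # undef if the filehandle had no lines
}

# Stand-in for a real file: an in-memory filehandle over sample records.
my $data = join '', map { "$_\n" }
    'seq_1 1 33 gene', 'seq_1 1 20 exon', 'seq_2 1 80 gene';
open my $in, '<', \$data or die "Could not open in-memory data: $!\n";
print pick_random_line($in);
close $in;
```

The trade-off is that every line is still read once, so this saves memory rather than I/O.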

Re^4: RFC:Hacking Tie::File to read complex data
by citromatik (Curate) on Jun 15, 2007 at 13:16 UTC

    <quote>By adapting the module for the special formats of the bioinformatics world, aren't you requiring an extra level of understanding?</quote>

    I don't think so. Using an array to interface a file by records is something that I find very easy to understand and to work with.

    <quote>Whereas by being self-sufficient and learning the basics of Perl, "the 'official' language of bioinformatics", isn't the inexperienced user better positioned to handle whatever data processing need arises?</quote>

    Sure, I totally agree, but you can (for example) use an object-oriented module without knowing anything about object orientation. That doesn't mean you won't do your work better if you know the basics of object orientation. What I mean is that I find this interface very simple to use (as is the Tie::File module itself), but that doesn't mean I don't have to learn other ways of doing it. BTW, I find that many people working in bioinformatics only want to learn enough Perl to make things work (my boss, for example :) ).

    Thanks for your comments!

    citromatik