
Well, if "any" comment is okay.... :-)

I hear that the need is for simplification, and the following strays from that need by going "native" and abandoning the interface concept. But the need as presented, and the output that seems acceptable, create an irresistible urge to write some throw-away code that can be adapted for the specific situation, ignoring Tie::File and its optimizations and ease of use...

How about restructuring the output just a bit? That is, is the term "Rec1" as important as the fact that you are showing a bunch of data about "seq_1"? And if so, wouldn't it be nice if "seq_1" were your header and you didn't have to print "seq_1" on each line? You could go on to sort the individual 'seq' records if that makes a difference.

seq_1:
    1    33     gene
    1    20     exon
   21    27     exon
   28    33     exon

seq_2:
    1    80     gene
    1    80     exon

seq_3:
    1    55     gene
    1    30     exon
   31    50     exon

That output comes from the following snippet. To read such data from a file rather than from __DATA__, put the proper filename in $in_file, uncomment the two $INFILE lines, and comment out the while ( <DATA> ) line. (Untested.)

#!/usr/bin/perl -w
use strict;

my %seq_info;
my $in_file = 'datafile.txt';
#open my $INFILE, '<', $in_file or die "Could not open '$in_file':  $!\n";

#while ( <$INFILE> ) 
while ( <DATA> ) 
{
    next unless /\S/;               # skip blank lines
    my @record = split ' ', $_;     # fields: seq id, start, end, type
    push @{ $seq_info{ $record[0] }  }, [ @record[1..3] ];
}

foreach ( sort keys %seq_info ) {
    print_seq_chunk( $_, $seq_info{$_} );
    print "\n";
}

sub print_seq_chunk {
    my ( 
        $seq_id,
        $seq_info_ar
            ) = @_;

    print $seq_id, ":\n";
    printf "  %3d   %3d   %6.6s\n", @$_
        foreach @$seq_info_ar;
}


__DATA__
seq_1    1    33    gene
seq_1    1    20    exon
seq_1    21    27    exon
seq_1    28    33    exon
seq_2    1    80    gene
seq_2    1    80    exon
seq_3    1    55    gene
seq_3    1    30    exon
seq_3    31    50    exon
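As for sorting the individual 'seq' records, here is a minimal sketch of one way it could go. It assumes each record is stored as [ start, end, type ], as in the snippet above. The helper name sort_seq_records and the out-of-order sample data are made up for illustration. It orders by start ascending and then by end descending, so a gene comes before the exons it contains.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, deliberately out of order.
# Each record is [ start, end, type ], as stored in %seq_info above.
my %seq_info = (
    seq_1 => [
        [ 28, 33, 'exon' ],
        [  1, 20, 'exon' ],
        [  1, 33, 'gene' ],
        [ 21, 27, 'exon' ],
    ],
);

# Return a new array ref with the records sorted by start (ascending),
# then by end (descending). The original array is left untouched.
sub sort_seq_records {
    my ($records_ar) = @_;
    return [
        sort { $a->[0] <=> $b->[0] or $b->[1] <=> $a->[1] } @$records_ar
    ];
}

foreach my $seq_id ( sort keys %seq_info ) {
    print "$seq_id:\n";
    printf "  %3d   %3d   %6.6s\n", @$_
        foreach @{ sort_seq_records( $seq_info{$seq_id} ) };
}
```

Records with identical start and end (like the gene and exon in seq_2) are not ordered relative to each other by this comparison; if that matters, add a third comparison on the type field.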

In reply to Re: RFC:Hacking Tie::File to read complex data by ff
in thread RFC:Hacking Tie::File to read complex data by citromatik
