in reply to Re^2: Statistics via hash- NCBI BLAST Tab Delimited file
in thread Statistics via hash- NCBI BLAST Tab Delimited file

Yes, you are correct, Chris. I was having difficulty keeping column 2 (accession) tied to column 3 (organism). I understand how the code pushes the values into the arrays, but I did not understand how to keep those two columns associated with each other. I assumed a hash would do it, but I was not sure how to implement it alongside the other hashes. I was also having a problem with the organism names being too long. I came up with a solution for both: I join the accession and the organism name (cut to 75 characters) into a single hash key. I would rather learn than be given a solution, so let me know if this is acceptable practice. Again, thanks for your help.

use strict;
use warnings;
use Acme::Tools;    # provides median()
use Text::Table;

my %data;    # $data{$key}{$col} = [ values seen in that column ]
my %freq;    # $freq{$key}       = number of hits for that key

my $ref_filelist = $ARGV[0];
open( my $blastfile, '<', $ref_filelist )
    or die "Could not open Reference filelist...($!)";

while (<$blastfile>) {
    chomp;
    my ( $accession, $organism, @vals ) = ( split /\t/ )[ 1 .. 5 ];
    my $organismcut = substr( $organism, 0, 75 );    # truncate long organism names
    my $tot = $accession . $organismcut;             # accession + organism as one key
    #print "$tot\n";
    $freq{$tot}++;
    my $col = 4;
    for my $val (@vals) {
        push @{ $data{$tot}{$col} }, $val;
        $col++;
    }
}
close $blastfile;

my @headers = qw/ Organism Freq Median_Eval Med_Contig_Length Med_Mapped_Length /;
my $tb = Text::Table->new( map {title => $_}, @headers );

# One row per key, most frequent first
for my $test ( sort { $freq{$b} <=> $freq{$a} } keys %freq ) {
    my @row = ( $test, $freq{$test} );
    for my $col ( sort keys %{ $data{$test} } ) {
        push @row, median( @{ $data{$test}{$col} } );
    }
    $tb->load( [@row] );
    #print "@row\n";
}
print $tb;
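
For comparison, another way to keep the accession and the organism tied together, without gluing them into one string, might be a nested hash keyed by accession. This is only a minimal sketch: it assumes one organism per accession, and the sample rows, the field names (organism, count, cols) and the simple_median helper are made up for illustration (simple_median stands in for median from Acme::Tools so the sketch needs no extra modules).

```perl
use strict;
use warnings;

# Hypothetical sample rows: query, accession, organism, e-value,
# contig length, mapped length (tab separated).
my @lines = (
    "q1\tACC001\tEscherichia coli\t1e-10\t500\t450",
    "q2\tACC001\tEscherichia coli\t1e-20\t700\t650",
    "q3\tACC002\tBacillus subtilis\t1e-5\t300\t250",
);

# $hit{$accession} = { organism => ..., count => ..., cols => [ [..], [..], [..] ] }
my %hit;

for my $line (@lines) {
    my ( $accession, $organism, @vals ) = ( split /\t/, $line )[ 1 .. 5 ];

    # Create the record on first sight; the organism name is stored once.
    my $rec = $hit{$accession} ||= { organism => $organism, count => 0, cols => [] };
    $rec->{count}++;

    # Collect each numeric column separately so a median can be taken later.
    push @{ $rec->{cols}[$_] }, $vals[$_] for 0 .. $#vals;
}

# Illustrative stand-in for Acme::Tools::median (assumption, not the module's code).
sub simple_median {
    my @s = sort { $a <=> $b } @_;
    return unless @s;
    my $mid = int( @s / 2 );
    return @s % 2 ? $s[$mid] : ( $s[ $mid - 1 ] + $s[$mid] ) / 2;
}

# Most frequent accession first; the organism is truncated only for display.
for my $accession ( sort { $hit{$b}{count} <=> $hit{$a}{count} } keys %hit ) {
    my $rec     = $hit{$accession};
    my @medians = map { simple_median(@$_) } @{ $rec->{cols} };
    print join( "\t", $accession, substr( $rec->{organism}, 0, 75 ),
        $rec->{count}, @medians ), "\n";
}
```

With this layout the full organism name is kept in the data and the 75-character cut happens only when printing, so the key is never affected by the truncation.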

Re^4: Statistics via hash- NCBI BLAST Tab Delimited file
by Cristoforo (Curate) on Dec 17, 2009 at 01:33 UTC
    I don't think Text::Table is the right tool for this application. It is only really useful for about a page's worth of reading, and your first column alone will be 75-plus characters wide. How do you plan to view the data? A couple of ideas that occurred to me would be to create a comma-separated values file or to use Perl6::Form (like I did in Re: Formatting text, eg long lines). (You could wrap that long first column so it wouldn't run across the page.)
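
    As a rough illustration of the wrapping idea, here is a sketch that uses the core Text::Wrap module rather than Perl6::Form. The 40-character width, the wrap_label helper and the sample label are assumptions made up for the example.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Split a long label into lines of at most $width characters.
sub wrap_label {
    my ( $label, $width ) = @_;

    # Text::Wrap produces lines of at most $columns - 1 characters.
    local $Text::Wrap::columns  = $width + 1;
    local $Text::Wrap::unexpand = 0;             # keep spaces, never convert to tabs
    return split /\n/, wrap( '', '', $label );
}

# Hypothetical first-column value and frequency.
my $label = 'ACC001 Escherichia coli str. K-12 substr. MG1655 chromosome, complete genome';
my $freq  = 12;

my @parts = wrap_label( $label, 40 );

# First line carries the other columns; continuation lines hold only the label.
printf "%-40s %5s\n", shift(@parts), $freq;
printf "%-40s\n", $_ for @parts;
```

    The same helper could feed any fixed-width report; only the first printed line of each record needs the numeric columns.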

    With a comma-separated values file, you could open the file in Excel (if you are on a Windows machine) and, if the dataset is large (more than a couple of pages), freeze the headers in Excel while still scrolling through your data.
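
    A minimal sketch of the CSV route, using only core Perl. The csv_field and csv_line helpers and the sample rows are hypothetical; for real data a CPAN module such as Text::CSV would likely be the more robust choice.

```perl
use strict;
use warnings;

# Quote one field for CSV: wrap it in double quotes when it contains a comma,
# a double quote or a line break, and double any embedded quotes.
sub csv_field {
    my ($field) = @_;
    $field = '' unless defined $field;
    if ( $field =~ /[",\r\n]/ ) {
        $field =~ s/"/""/g;
        return qq{"$field"};
    }
    return $field;
}

# Join a list of fields into one CSV line.
sub csv_line {
    return join( ',', map { csv_field($_) } @_ ) . "\n";
}

# Hypothetical rows in the same shape as the table columns.
my @headers = qw/ Organism Freq Median_Eval Med_Contig_Length Med_Mapped_Length /;
my @rows    = (
    [ 'ACC001 Escherichia coli, strain K-12', 2, '5e-11', 600, 550 ],
    [ 'ACC002 Bacillus subtilis',             1, '1e-05', 300, 250 ],
);

# Print to STDOUT; redirect to a file (e.g. > summary.csv) to open it in Excel.
print csv_line(@headers);
print csv_line(@$_) for @rows;
```

    Organism names containing commas are the main reason for the quoting step; without it such a name would spill into the next column.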

    If your output will run to many lines, with Perl6::Form you could arrange for the headers to print again after so many lines.

    I say that about Text::Table because I'm guessing your results will be more than a couple of pages. If not, Text::Table would be OK.

    If there are a lot of rows to print, you might want to print your header every 50 lines or so to aid your readers. One way to do it is below:

    Instead of print $tb at the end of your script, the following would repeat the header every 50 lines.

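    my $rows      = $tb->body_height();
    my $pagelines = 50;    # repeat the header every 50 body lines
    for my $i ( 0 .. $rows - 1 ) {
        print $tb->title() if $i % $pagelines == 0;          # header at the top of each page
        print $tb->body($i);
        print "\n" if $i % $pagelines == $pagelines - 1;     # blank line after each full page
    }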

    Chris

    Update: set $rows to the number of rows in the table.
    Previously I was getting the number of items from keys %freq.