Paragod28 has asked for the wisdom of the Perl Monks concerning the following question:
I am attempting to parse a tab-delimited text file that contains sequence data and find the median of each column per organism. I posted about this a few days ago (http://www.perlmonks.org/?node_id=812285) and have been working on an appropriate solution since. Toolic helped me last time, and I appreciate it, but I still seem to be lost: Toolic did create a working solution, but I also need the frequency of each organism and the output formatted per organism. Thanks again for any help.
Here is an example of my input data:
contig1 AC344 organism1 1e-1 122 45

The first two columns are correct but it breaks apart when I try to get the medians. Here is an example of my output so far:

Organism Frequency Median_Eval Median_Contig_Length Median_Mapped_Length

Here is my code so far:
use strict;
use warnings;
use Acme::Tools;

my (%count, %organisms, %med, %number);
my $ref_filelist = $ARGV[0];
my ($contig, $accession, $organism, $eval, $con_length, $map_length);

open(FILELIST, $ref_filelist) or die "Could not open Reference filelist...($!)";

print "Organism\tFrequency\tMedian_Eval\tMedian_Contig_Length\tMedian_Mapped_Length\n";

while (<FILELIST>) {
    ($contig, $accession, $organism, $eval, $con_length, $map_length) = split('\t',);
    #my $median = $eval[($#eval / 2)];
    my $med = median(@{$organisms{$organism}});
    $med{$organism} = $med;
    #$organisms{$organism} = $eval;
    my $number = ++$count{$organism};
    $number{$organism} = $number;
}

foreach $organism (sort {$number{$a} <=> $number{$b}} keys %organisms) {
    print "$organism:\t$number{$organism}\t$organisms{$organism}\t$med{$organism}\n";
}
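For comparison, here is a minimal sketch of one way the per-organism medians could be collected. It is not the approach from any reply in this thread. In the code above nothing is ever pushed onto the per-organism arrays (the only assignment is commented out), so median() always receives an empty list. The sketch instead stores every value per organism and column first and computes the medians after the whole file has been read. Assumptions: each line has the six tab-separated columns of the sample row; the sub name summarize is hypothetical; a hand-written median stands in for the one from Acme::Tools so the sketch runs without extra modules; every sample row except the first is made up for illustration.

```perl
use strict;
use warnings;

# Median of a list of numbers; undef for an empty list.
# Hand-written stand-in for Acme::Tools' median (an assumption of this sketch).
sub median {
    my @sorted = sort { $a <=> $b } @_;
    return undef unless @sorted;
    my $mid = int(@sorted / 2);
    return @sorted % 2
        ? $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Hypothetical helper: reads tab-delimited lines from a filehandle and returns
# a hash ref of organism => [frequency, median e-value,
#                            median contig length, median mapped length].
sub summarize {
    my ($fh) = @_;
    my %values;    # organism => [ [e-values], [contig lengths], [mapped lengths] ]
    while (my $line = <$fh>) {
        chomp $line;
        my (undef, undef, $organism, @numeric) = split /\t/, $line;
        next unless defined $organism && @numeric >= 3;    # skip short lines
        push @{ $values{$organism}[$_] }, $numeric[$_] for 0 .. 2;
    }
    my %summary;
    for my $organism (keys %values) {
        my $cols = $values{$organism};
        $summary{$organism} =
            [ scalar @{ $cols->[0] }, map { median(@$_) } @$cols ];
    }
    return \%summary;
}

# Sample rows: the first is from the question, the rest are made up.
my $sample = join '', map { join("\t", @$_) . "\n" } (
    [qw(contig1 AC344 organism1 1e-1 122 45)],
    [qw(contig2 AC345 organism1 1e-3 200 60)],
    [qw(contig3 AC346 organism1 1e-5 150 90)],
    [qw(contig4 AC400 organism2 1e-2 300 80)],
    [qw(contig5 AC401 organism2 1e-4 100 40)],
);

# An in-memory filehandle keeps the sketch self-contained.
open my $fh, '<', \$sample or die "Could not open sample data: $!";
my $summary = summarize($fh);
close $fh;

print join("\t", qw(Organism Frequency Median_Eval
                    Median_Contig_Length Median_Mapped_Length)), "\n";

# Sort by frequency (ascending, as in the question), then by name.
for my $organism (
    sort { $summary->{$a}[0] <=> $summary->{$b}[0] or $a cmp $b }
    keys %$summary
) {
    print join("\t", $organism, @{ $summary->{$organism} }), "\n";
}
```

To run this against a real file, the in-memory open could be replaced with a three-argument open on $ARGV[0]. Keeping all values in memory is the simplest route to an exact median; for very large BLAST outputs that trade-off may need revisiting.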
Re: Statistics via hash- NCBI BLAST Tab Delimited file
by bv (Friar) on Dec 14, 2009 at 23:20 UTC

Re: Statistics via hash- NCBI BLAST Tab Delimited file
by desemondo (Hermit) on Dec 15, 2009 at 01:45 UTC

Re: Statistics via hash- NCBI BLAST Tab Delimited file
by Anonymous Monk on Dec 15, 2009 at 02:12 UTC
by Paragod28 (Novice) on Dec 15, 2009 at 15:40 UTC
by Cristoforo (Curate) on Dec 15, 2009 at 20:28 UTC
by Paragod28 (Novice) on Dec 16, 2009 at 15:27 UTC
by Paragod28 (Novice) on Dec 16, 2009 at 20:37 UTC
by Cristoforo (Curate) on Dec 17, 2009 at 01:33 UTC