comment on

Thanks! I rewrote my code following your advice. I used Data::Printer to check it and I think I've finally got my data into the structure that I wanted.

In the meantime, I've been trying to work out how to process my data from the HoAoA. What I want to do is pass the first interior array (as a block) for every individual to the module methods, then pass the second array in the same way, and so on for each of the array positions in turn. The goal is to treat each array position as a separate window for the purposes of analysis. Just to make it clear, here is a simplified version of my structure:

my %hash;

$hash{key1} = [ [0 0, 0 1, 0 0, 1 1, 0 0, 0 1,],   # first block
                [0 0, 0 0, 0 0, 0 1, 0 1, 0 0,],   # second block
                [0 1, 0 0, 0 0, 0 1, 0 0, 0 0,],   # third block
               ];
$hash{key2} = [ [0 0, 0 0, 0 0, 0 1, 1 1, 0 0,],   # first block
                [0 0, 0 1, 0 0, 0 0, 0 0, 0 0,],   # second block
                [0 0, 1 1, 0 0, 1 1, 0 0, 1 0,],   # third block
               ];
[download]

In the above case, I'd want to pass to the module methods (shown in the code below) the arrays labeled '#first block' for both individuals first, then those labeled '#second block', and finally those labeled '#third block', so that I get separate statistics for each genomic region.

I have been trying to do this on my actual data. So far, I have produced the following (broken) code:

my $id;   # ??
foreach my $window ( @{ $data{$id} } ) {

  # only pass windows with > min number of stats (i.e. >1 for now), an
+d
  # of course we also ignore any 'undef' windows.
  next unless scalar @{ $data{$id} }[$window] >= 2;

    foreach my $individual ( keys(%data) ) {


    my $segsites = Bio::PopGen::Statistics->segregating_sites_count( \
+@{ $data{$id} }[$window] );
    my $singletons = Bio::PopGen::Statistics->singleton_count( \@{ $da
+ta{$id} }[$window] );
    my $pi = Bio::PopGen::Statistics->pi( \@{ $data{$id} }[$window] );
    my $theta = Bio::PopGen::Statistics->theta( \@{ $data{$id} }[$wind
+ow] );
    my $tajima_D = Bio::PopGen::Statistics->tajima_D( \@{ $data{$id} }
+[$window] );
    my $D_star = Bio::PopGen::Statistics->fu_and_li_D_star( \@{ $data{
+$id} }[$window] );
    my $F_star = Bio::PopGen::Statistics->fu_and_li_F_star( \@{ $data{
+$id} }[$window] );

    # output follows:
    say $out_file "Population: $cont\tChromosome: $chr_num";
    say $out_file "Seg sites\tSingletons\tPi\tTheta\tTajima's D\tFu & 
+Li F*\tFu & Li D*";

    # to account for invariant sites, divide stats by window size (i.e
+. 100_000)
    my @stats = qw($segsites $singletons $pi $theta $tajima_D $F_star 
+$D_star);
    foreach my $stat (@stats) {
        print $out_file $stat / 100_000, "\t"
        }
    print $out_file "\n";

    }
}
[download]

This code produces all manner of errors, and I don't really know if my entire approach is incorrect, or if it's just that I don't understand the Perl syntax for working with complex data structures. The various pages on perldoc haven't really helped much. I'd really appreciate some hints on how to proceed with this.

In reply to Re: Population of HoAoA based on file contents by iangibson
in thread Population of HoAoA based on file contents by iangibson

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.