Hello, I have been looking at this, and I believe I have a solution.
The main problem I faced was that the auto-vivified hashes/arrays initialise as undefined, so I could not tell which arrays needed filling out: every one of them would be undefined. I gather this is why you iterate through the snps arrays first and then try to map their indexes to the indexes of the snp_bins.
Since I could not decide by whether an array was defined or not, I slept on it, and in the morning decided that using your method to map the indexed arrays initially, and then retro-filling the undefined arrays with '1 1,' using an auto-decrement unless/until loop, might work.
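To illustrate the point about auto-vivified slots, here is a minimal sketch (not the full solution). It assumes a single made-up bin index of 3 and a fill width of 2 purely for readability. It shows that assigning to a high index leaves the lower slots undefined, and that testing the slot itself with defined is enough to decide what to retro-fill.

```perl
use strict;
use warnings;

my %data;

# Assigning to index 3 auto-vivifies the outer array;
# slots 0..2 now exist but hold undef.
@{ $data{HG00553}[3] } = ( '0 0', '1 1' );

my @filled = grep { defined $data{HG00553}[$_] } 0 .. $#{ $data{HG00553} };
print "filled slots: @filled\n";    # filled slots: 3

# Retro-fill the undefined slots below index 3, walking downwards.
# (Fill width 2 is an arbitrary choice for this illustration.)
for ( my $i = 2 ; $i >= 0 and not defined $data{HG00553}[$i] ; $i-- ) {
    $data{HG00553}[$i] = [ ('1 1') x 2 ];
}

print scalar( grep { defined } @{ $data{HG00553} } ), "\n";    # 4
```

Testing the element (defined $data{$id}[$i]) rather than the dereferenced array avoids the defined(@array) form, which newer perls reject.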
I had to hammer at the loops for a bit, and the flow control assumes that the SAMPLE line is always line '1' of the data file. But it does what I think you need it to do.
Updated with do-until loops; the code now works, with tests printing at the end. The tests use the second, slightly updated dnata file, which includes discontiguous oasis coordinates, as per the akme ordnance survey chart special duties to the pirates under the governance of Silverbeard the Undocumented. Addendum: this scroll is only to be seen by the scribe, on penalty of plank.
dnata file
SAMPLE,16287215,16287226,16287365,16287649,16287784,16287851,16287912 HG00553,0 0,0 0,0 0,0 0,0 0,0 0,0 0 HG00554,0 0,0 0,0 0,0 0,0 0,0 0,0 0 HG00637,0 0,0 0,0 0,0 0,0 0,0 0,0 0 HG00638,0 0,0 0,0 0,0 0,0 0,1 1,0 0 HG00640,0 0,0 0,0 0,0 0,0 0,1 1,0 0
SAMPLE,16287215,16287226,16287365,16287649,16287784,16887851,17187912 HG00553,0 0,0 0,0 0,0 0,0 0,0 0,0 0 HG00554,0 0,0 0,0 0,0 0,0 0,0 0,0 0 HG00637,0 0,0 0,0 0,0 0,0 0,0 0,0 0 HG00638,0 0,0 0,0 0,0 0,0 0,1 1,0 0 HG00640,0 0,0 0,0 0,0 0,0 0,1 1,0 0
solution
#!/usr/bin/perl -T
use strict;
use warnings;

my ( @snp_bins, %data );

open my $in_file, "<", "./dnata" or die "Cannot open ./dnata: $!";

while (<$in_file>) {
    chomp;
    if ( $. == 1 ) {    # line number: the SAMPLE header is assumed to be line 1
        my ( $placeholder, @coords ) = split /,/;
        @snp_bins = map int( $_ / 100_000 ), @coords;
        next;
    }
    if ( $. >= 2 ) {
        my ( $id, @snpspairs ) = split /,/;
        foreach my $oasis (@snp_bins) {
            my $os = $oasis;
            @{ $data{$id}[$os] } = @snpspairs;
            $os--;
            # Test the slot itself: defined(@array) is an error on newer perls.
            # The $os < 0 guard stops the walk cleanly at the bottom of the array.
            unless ( $os < 0 or defined $data{$id}[$os] ) {
                do {
                    @{ $data{$id}[$os] } = map( q(1 1), 0 .. 99 );    # do 0..9 for readability on output!!
                    $os--;
                } until ( $os < 0 or defined $data{$id}[$os] );
            }
        }
    }
}
close $in_file;

# Tests: these indexes assume the second (discontiguous) dnata file.
foreach my $k ( sort keys %data ) {
    print $k, " Ref: ",   $data{$k}->[0], $/;
    print $k, " 0: ",     @{ $data{$k}->[0] }, $/;
    print $k, " 161: ",   join( ',', @{ $data{$k}->[161] } ), $/;
    print $k, " 161:5 ",  $data{$k}->[161][5], $/;
    print $k, " 162: ",   join( ',', @{ $data{$k}->[162] } ), $/;
    print $k, " 162:5 ",  $data{$k}->[162][5], $/;
    print $k, " 165: ",   join( ',', @{ $data{$k}->[165] } ), $/;
    print $k, " 165:5 ",  $data{$k}->[165][5], $/;
    print $k, " 168: ",   join( ',', @{ $data{$k}->[168] } ), $/;
    print $k, " 168:5 ",  $data{$k}->[168][5], $/;    # was $data{$shark}, an undeclared variable
    print $k, " 170: ",   join( ',', @{ $data{$k}->[170] } ), $/;
    print $k, " 170:7 ",  $data{$k}->[170][7], $/;
    print $k, " 171: ",   join( ',', @{ $data{$k}->[171] } ), $/;
    print $k, " 171:7 ",  $data{$k}->[171][7], $/;    # only 7 pairs here, so index 7 is undef and warns
}
There may be better ways to control the flow re the line numbers, and of course the flow depends on whether /^SAMPLE/ lines occur more than once throughout the file. I also expect my loop exits could be improved upon. However, this should get the ball rolling. For a start, a bit of tweakery may be required when retro-filling from, say, coord 175 back down to 162, so that the loop does not continue through 162 to 0, but I'm hoping the until(defined) condition catches that.
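One possible way to handle the flow without relying on line numbers is sketched below. It is an assumption about what you might want, not something the code above does: it treats any line whose first field is SAMPLE as a new header and rebuilds @snp_bins from it. The in-memory data and the %bins_for hash are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical input with two SAMPLE header lines.
my $text = <<'END';
SAMPLE,16287215,16887851,17187912
HG00553,0 0,0 0,0 0
SAMPLE,100000,250000
HG00554,1 1,0 0
END

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my ( @snp_bins, %bins_for );
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $id, @fields ) = split /,/, $line;
    if ( $id eq 'SAMPLE' ) {
        # A header line: recompute the bins, wherever it occurs in the file.
        @snp_bins = map { int( $_ / 100_000 ) } @fields;
        next;
    }
    # Record which bins were in force for this sample.
    $bins_for{$id} = [@snp_bins];
}
close $fh;

print "$_: @{ $bins_for{$_} }\n" for sort keys %bins_for;
# HG00553: 162 168 171
# HG00554: 1 2
```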
Coyote
Updates: Discovered do until loops !!! Now this is working like a charm. Note, the output is controlling the comma placement for csv compatibility, hence the join function in the tests. Fixed: the '1 1' x 100 method was replaced with a map, as the x 100 method was returning one very long string stored in index 0, rather than a 100-element array with each element containing '1 1'. I also attempted to remove the oasis scalar but this broke the code so the island stays. :p
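For anyone bitten by the same thing, a small sketch of the difference, using a count of 3 instead of 100 to keep the output short. With a plain string on the left, x repeats the string. With a parenthesised list on the left, it repeats the list, which behaves like the map.

```perl
use strict;
use warnings;

my @as_string = '1 1' x 3;                 # string repetition: one long element
my @as_list   = ('1 1') x 3;               # list repetition: three elements
my @as_map    = map { '1 1' } 0 .. 2;      # the map approach: three elements

print scalar(@as_string), "\n";            # 1
print scalar(@as_list),   "\n";            # 3
print scalar(@as_map),    "\n";            # 3
print "$as_string[0]\n";                   # 1 11 11 1
```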
In reply to Re: Trying to edit HoAoA entries
by Don Coyote
in thread Trying to edit HoAoA entries
by iangibson