
So, I am basically here:
open IN2, $first_tmp;
while(<IN2>) {
    if($_=~/^(chr.*?)REC:(.*)/) {
        $respective_chrom=$1;
        $all_entries=$2;
        @split_entries=();
        @split_entries = split(/\#/, $all_entries);
        @split_sep_entries=();
        %collapsed_loci_HoA=();
        print ">".$respective_chrom."\n";
        foreach $sep_entry(@split_entries) {
            @split_sep_entries = split(/\t/, $sep_entry);
            $locus_to_use = $split_sep_entries[1];
            $rest_entry=$split_sep_entries[0]."\t".$split_sep_entries[2]."\t".
                        $split_sep_entries[3]."\t".$split_sep_entries[4]."\t".
                        $split_sep_entries[5]."\t".$split_sep_entries[6]."\t".
                        $split_sep_entries[7];
            push @{ $collapsed_loci_HoA{$locus_to_use} }, $rest_entry;
        }
        @array_of_loci = keys %collapsed_loci_HoA;
        for $b(sort { $b <=> $a } @array_of_loci) {
            $count_arr++;
        }
        print "//\n";
    }
}
close IN2;

and basically I am now getting my numbers sorted, as I posted above.
What I cannot do is exactly the binning you propose. My idea is to slice one element off the array each time and, if it is within the range, push it onto the sub-array of the element that created it, but I really can't see how to do that.
I am new to Perl and I am literally stuck.

Re^3: Stuck with manipulating an array
by Corion (Patriarch) on Aug 28, 2017 at 13:28 UTC

    My approach to binning would be simple. You look at the first element of the array @split_entries and at the index of the next potential candidate, and you increase that index until the candidate's distance from the first element is no longer smaller than your distance. All elements from the first element up to, but not including, that candidate then belong in one bin.

    An example, for a distance of 5:

    11 12 16 17 22 30

    First you look at the first position in your array (11). The next candidate is at the second position, and its value is 12. abs(12-11) < 5, so you increase the index of your candidate. The next candidate is at the third position, and its value is 16. abs(16-11) >= 5, so your first bin consists of the first and second entries in the array, 11 and 12.

    Now, you start the same thing over, as there are still elements in your array after removing 11 and 12 from it.

    You look at the first position in your array (16). The next candidate is at the second position, and its value is 17. abs(17-16) < 5, so you increase the index of your candidate. The next candidate is at the third position, and its value is 22. abs(22-16) >= 5, so your second bin consists of the first and second entries in the remaining array, 16 and 17.

    ... and so on.
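
    The walk-through above could be sketched roughly like this. It is only a minimal sketch: bin_positions is a hypothetical helper name, the positions are assumed to be sorted already, and the distance of 5 and the sample numbers are the ones from the example. The sub works on its own copy of the list and removes each finished bin from the front of it with splice.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): split a sorted list of
# positions into bins. A bin starts at the first remaining element and takes
# every following element whose distance from that first element is below
# $distance.
sub bin_positions {
    my ($distance, @positions) = @_;   # @positions is a copy, assumed sorted
    my @bins;
    while (@positions) {
        my $candidate = 1;             # index of the next potential bin member
        $candidate++
            while $candidate < @positions
              and abs($positions[$candidate] - $positions[0]) < $distance;
        # The first $candidate elements form one bin; remove them from the list.
        push @bins, [ splice @positions, 0, $candidate ];
    }
    return @bins;
}

my @bins = bin_positions(5, 11, 12, 16, 17, 22, 30);
print join(' ', @$_), "\n" for @bins;
```

    For the sample list this prints four lines: "11 12", "16 17", "22" and "30".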

      Fair enough, but what kind of data structures will I need? This I cannot seem to figure out...

        With the approach I outlined, you won't need any additional data structures beyond what you already have. You will be modifying your current list of items as you output bins though, as I already described.

        If you want to keep your unbinned array, either make a copy before binning or, instead of always starting at the first position in the array, start each pass at the last item considered as a candidate.
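
        A rough sketch of that second variant, under the same assumptions as the example (sorted positions, distance 5): nothing is removed from the array, and a start index is moved to the last candidate after each bin. The names bin_ranges and @loci are made up for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: same binning rule, but the input array is only read.
# Returns [first_index, last_index] pairs instead of copies of the values.
sub bin_ranges {
    my ($distance, $positions) = @_;   # $positions: reference to a sorted array
    my @ranges;
    my $start = 0;
    while ($start < @$positions) {
        my $candidate = $start + 1;
        $candidate++
            while $candidate < @$positions
              and abs($positions->[$candidate] - $positions->[$start]) < $distance;
        push @ranges, [ $start, $candidate - 1 ];
        $start = $candidate;           # the rejected candidate opens the next bin
    }
    return @ranges;
}

my @loci = (11, 12, 16, 17, 22, 30);
for my $range (bin_ranges(5, \@loci)) {
    my ($from, $to) = @$range;
    print join(' ', @loci[$from .. $to]), "\n";
}
print scalar(@loci), " elements still in \@loci\n";
```

        The bins come out the same as in the worked example, and @loci still holds all six numbers afterwards.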

        I highly recommend working through any algorithm on paper until you feel confident with how it works and what kind of data it accesses.

Re^3: Stuck with manipulating an array
by Anonymous Monk on Aug 28, 2017 at 13:25 UTC
    Sorry, wrong paste:
    open IN2, $first_tmp;
    while(<IN2>) {
        if($_=~/^(chr.*?)REC:(.*)/) {
            $respective_chrom=$1;
            $all_entries=$2;
            @split_entries=();
            @split_entries = split(/\#/, $all_entries);
            @split_sep_entries=();
            %collapsed_loci_HoA=();
            print ">".$respective_chrom."\n";
            foreach $sep_entry(@split_entries) {
                @split_sep_entries = split(/\t/, $sep_entry);
                $locus_to_use = $split_sep_entries[1];
                $rest_entry=$split_sep_entries[0]."\t".$split_sep_entries[2]."\t".
                            $split_sep_entries[3]."\t".$split_sep_entries[4]."\t".
                            $split_sep_entries[5]."\t".$split_sep_entries[6]."\t".
                            $split_sep_entries[7];
                #print $locus_to_use."##".$rest_entry;
                push @{ $collapsed_loci_HoA{$locus_to_use} }, $rest_entry;
            }
            $count_arr=0;
            @array_of_loci = keys %collapsed_loci_HoA;
            for $b(sort { $b <=> $a } @array_of_loci) {
                print "$b"."\n";
            }
            print "//\n";
        }
    }
    close IN2;

    Now it is printing the numbers sorted.