in reply to Re^8: an algorithm to randomly pick items that are present at different frequencies
in thread an algorithm to randomly pick items that are present at different frequencies

Is my understanding of the "return" block correct?

Yes. A quick read of your comments suggest you've got that bit perfectly.

However, as you are passing the values/odds into the sub via hash (instead of reading them from a file), then you'll need to show me your whole definition of genPickerConverted() so I can see what's going on before the bit you've posted.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
  • Comment on Re^9: an algorithm to randomly pick items that are present at different frequencies
  • Download Code

Replies are listed 'Best First'.
Re^10: an algorithm to randomly pick items that are present at different frequencies
by efoss (Acolyte) on Jun 04, 2015 at 21:34 UTC
    Here it is:
    sub genPickerConverted { my %kmer_prob = %{ $_[0] }; my @vals = keys %kmer_prob; my @odds = (); foreach my $val (@vals) { push (@odds, $kmer_prob{$val}); } my @order = sort{ $odds[ $a ] <=> $odds[ $b ] } 0 .. $#odds; @odds = @odds[ @order ]; @vals = @vals[ @order ]; my $t = 0; $t += $_ for @odds; $_ /= $t for @odds; $odds[ $_ + 1 ] += $odds[ $_ ] for 0 .. $#odds - 1; return sub { my $r = rand(); $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds; }; }

      Okay. I took your version of the sub -- which I don't see anything wrong with; though it could be simplified somewhat -- and plugged it into my testscript from above:

      #! perl -slw use strict; use Data::Dump qw[ pp ]; sub genPickerConverted { my %kmer_prob = %{ $_[0] }; my @vals = keys %kmer_prob; my @odds = (); foreach my $val (@vals) { push (@odds, $kmer_prob{$val}); } my @order = sort{ $odds[ $a ] <=> $odds[ $b ] } 0 .. $#odds; @odds = @odds[ @order ]; @vals = @vals[ @order ]; my $t = 0; $t += $_ for @odds; $_ /= $t for @odds; $odds[ $_ + 1 ] += $odds[ $_ ] for 0 .. $#odds - 1; return sub { my $r = rand(); $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds; }; } my %kmer_probe = map{ split( ' ' ) } <DATA>; pp \%kmer_probe; my $picker = genPickerConverted( \%kmer_probe ); my %tally; ++$tally{ $picker->() } for 1 .. 1e6; pp \%tally; __DATA__ A 1e-7 B 20e-7 C 10e-5

      And it produces:

      C:\test>junk997.pl { A => 1e-7, B => 20e-7, C => 10e-5 } { A => 971, B => 19509, C => 979520 }

      Which is exactly what I'd expect.

      So, as I don't really understand what you mean by:

      why does my code not get in there when I pass my vals and odds to the subroutine as a hash rather than a file handle?

      You're going to have to clarify what you mean by that.

      However, having re-read your prior post, I saw something that didn't mean anything at my first reading:" but then it never enters the "return" block."; and that maybe the clue to your confusion.

      The anonymous subroutine that is returned by the function will not be entered at that time. The return statement is return a reference to that anonymous subroutine, that gets assigned to the variable my $picker = genPickerConverted( \%kmer_probe ); in the main program.

      That subroutine doesn't get executed (entered) until you dereference the $picker variable by doing:  $picker->();. Only then does teh subroutine get run.

      Does that explain your problem?


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
      In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

        Yes, that was it. I'm new to the arrow notation, anonymous subroutines, etc., and I'm afraid I'm still a bit puzzled by this, but adding the second line here in the main body of the script makes things work:

        my $pick = genPickerConverted(\%kmer_prob); $pick = $pick -> ();

        (Before I had only the first line.) When I talk about my code never entering the return block in the anonymous subroutine, I just mean that I add a pause/block in the debugger I'm using (Komodo) and then step through the code, looking at the contents of the variables as I go. When I'm stepping through within the "genPickerConverted" subroutine, I see all the variables being created as expected, but then when it gets to the anonymous subroutine, it just jumps back to the main body of the script. Then looking in "$pick", it looks empty. However, I guess I should have gotten a clue because in the "type" column, all of my hashes are listed as "HASH", all of my arrays are listed as "ARRAY", and all of my scalars are listed as not having a "type", except for $pick, which has type "CODE", and if I print out pick before the dereference statement with the arrow, I get a memory location, like "CODE(0x7f9af358dcb8)".

        So what's going on? Does $pick contain a reference to the anonymous subroutine, and it also remembers what was in my %kmer_prob hash?

        Thanks so very much for the help. I really appreciate it, I've learned a lot, and I was truly stuck.

        Eric

        Why linear search? hehehe, snicker :)

        #!/usr/bin/perl # http://perlmonks.org/?node_id=1127420 use Search::Dict; use warnings; use strict; my $max = 0; my $file = join '', map s/(\S*)$/ $max += $1 /er, <DATA>; open my $fh, '<', \$file or die "open failure $!"; my $picker = sub { look $fh, rand $max, {comp => sub {shift =~ s/.* //r <=> shift}}; (split ' ', <$fh>)[0]; }; my %tally; # now test it $tally{$picker->()}++ for 1 .. 1e4; use YAML; print Dump \%tally; __DATA__ A 1e-7 B 20e-7 C 10e-5