
what exactly am I passing to "genPicker"? A file handle?

You answered your own question :) Yes, it's a file handle. *DATA is a (pseudo)filehandle that allows access to the 'file' after __DATA__.

So, if you had your input in a file called numbs.dat, you would do this:

    ...
    open my $fh, '<', 'numbs.dat' or die $!;
    my $pick = genPicker( $fh ); ## Reads the file and generates a picker subroutine according to its contents.
    close $fh;
    ### use $pick->() each time you want a new random number.
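
Since genPicker only reads lines from whatever handle it is given, any readable handle should work. The following is a minimal sketch under assumptions: the body of genPicker is a simplified reconstruction of the code quoted in this thread (the sort step is omitted and a fallback return is added), and an in-memory handle opened on the hypothetical variable $text stands in for numbs.dat so that the example runs on its own.

```perl
use strict;
use warnings;

# Simplified reconstruction of genPicker (an assumption, not the exact original).
# It reads "value odds" pairs, one pair per line, from the handle it is given.
sub genPicker {
    my $fh = shift;
    my( @vals, @odds );
    for my $line ( <$fh> ) {
        my( $val, $odd ) = split ' ', $line;   # awk-style whitespace split
        next unless defined $odd;              # skip blank or malformed lines
        push @vals, $val;
        push @odds, $odd;
    }

    ## Normalise the odds, then accumulate them into break points
    my $t = 0;
    $t += $_ for @odds;
    $_ /= $t for @odds;
    $odds[ $_ + 1 ] += $odds[ $_ ] for 0 .. $#odds - 1;

    return sub {
        my $r = rand();
        $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds;
        return $vals[ -1 ];    # added guard against floating-point shortfall
    };
}

# An in-memory handle stands in for a handle opened on numbs.dat;
# \*DATA would be passed to genPicker in exactly the same way.
my $text = "A 0.0001\nB 0.0004\nC 0.0008\n";
open my $fh, '<', \$text or die $!;
my $pick = genPicker( $fh );
close $fh;    # safe: the picker has already read everything it needs

print $pick->(), "\n" for 1 .. 5;
```

The handle can be closed straight away because the picker keeps its own copies of the values and break points.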

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Re^6: an algorithm to randomly pick items that are present at different frequencies
by efoss (Acolyte) on May 27, 2015 at 19:01 UTC

    Hi BrowserUk,

    I'm still running into trouble. Here is my code:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dump qw[ pp ];

    open my $fh, '<', 'numbs.dat' or die $!;
    my $pick = genPicker( $fh ); ## Reads the file and generates a picker

    sub genPicker {
        my $fh = shift;
        my( @vals, @odds );
        ( $vals[ @vals ], $odds[ @odds ] ) = split( ' +' ) for <$fh>;

        ## Sort if not sorted
        my @order = sort{ $odds[ $a ] <=> $odds[ $b ] } 0 .. $#odds;
        @odds = @odds[ @order ];
        @vals = @vals[ @order ];

        ## Calculate and accumulate break points
        my $t = 0;
        $t += $_ for @odds;
        $_ /= $t for @odds;
        $odds[ $_ + 1 ] += $odds[ $_ ] for 0 .. $#odds - 1;

        ## Generate a subroutine to do the picking
        return sub {
            my $r = rand();
            $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds;
        };
    }

    close $fh;

    Here is my numbs.dat file (I changed the numbers to simplify things):

    A 0.0001
    B 0.0004
    C 0.0008

    For what it's worth, there is a new line after each row except the last, and each letter is separated from the corresponding number by a single space. I think something is going wrong with my split, because if I pause the script and look into the variables, my @odds array has three slots, each of which is undefined, whereas my @vals array has three slots, each of which contains a letter, a space, a number and then for the first two slots a new line character.

    Do you see what's wrong?

    Best wishes,

    Eric

      Do you see what's wrong?

      Maybe. When you C&P'd the code, you did so without using the [download] link first.

      Because of that, one of the longer lines has been (moronically and pointlessly) "wrapped" by the PM website, and extra characters inserted. Hence the line that should be:

      my( @vals, @odds ); ( $vals[ @vals ], $odds[ @odds ] ) = split( ' ' ) for <$fh>;

      Has been autofcuked into:

      my( @vals, @odds ); ( $vals[ @vals ], $odds[ @odds ] ) = split( ' +' ) for <$fh>;

      Change that back and see how you get on. (Did you really get no warnings or errors?)
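
To see why a mangled split pattern would produce the symptoms described (whole lines in @vals, undef in @odds), here is a small sketch. It rests on two assumptions: that the intended pattern is the single-space string ' ', and that the wrapping put a line break and a '+' inside the quotes, giving a pattern of a space followed by one or more newlines.

```perl
use strict;
use warnings;

my $line = "A 0.0001\n";

# split ' ' is Perl's special awk-style mode: it splits on runs of
# whitespace and skips leading whitespace, so the newline is dropped too.
my @good = split ' ', $line;

# Assumed shape of the mangled pattern: a space, then one or more newlines.
# That sequence never occurs in the line, so split returns it whole.
my @bad = split " \n+", $line;

printf "good: %d fields, bad: %d fields\n", scalar @good, scalar @bad;
# prints: good: 2 fields, bad: 1 fields
```

With only one field coming back, the list assignment puts the entire line into the @vals slot and leaves the @odds slot undefined.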


        Hi BrowserUk,

        That was it. Thanks so much!!!

        Sorry - I didn't mean to imply that there weren't warnings and errors - there were. They led me to look into what was in @odds and @vals.

        Your code is eye-opening for me. I still haven't figured all of it out, but it's way more concise than what I write. You write in one simple line what would take me several more convoluted lines. I've got a lot to learn and may be back with more questions.

        Best wishes, Eric

        Hi BrowserUk,

        I have another question about your code, specifically about this piece:

        return sub {
            my $r = rand();
            $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds;
        };

        I described my problem as having a list of values - A, B, C, etc. - and relative odds corresponding to choosing those values. I have tried to incorporate your code into a script in which I pass a %vals_odds hash to your genPicker subroutine. Things seem to go well until the return statement, but then nothing is returned with this statement in the main body of my script:

        my $pick = genPickerConverted(\%kmer_prob);

        And if I step through the code, right before I would enter the "return" block above, my @odds array has cumulative relative odds in it (so it ends with a 1, as it should), but then it never enters the "return" block.

        My understanding of the return block (which looked fairly foreign to me when I saw it) is as follows:

        # return something that is going to come from ...
        # ... an unnamed subroutine (unnamed ...
        # ... because there's nothing between "sub" and "{")
        return sub {
            # r is a random number >= 0 < 1
            my $r = rand();
            # an implicit if statement:
            # if, when going through every value of odds from lowest ...
            # ... to highest, r is less than that value of odds, this ...
            # ... code will go on to the "and" statement, and otherwise...
            # ... it will go on to the next value in @odds
            # if it gets to the "and" statement, it will return the ...
            # ... corresponding value for @vals to the subroutine ...
            # ... call, which will, in turn, return that to the main ...
            # body of the script
            $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds;
        };

        Is my understanding of the "return" block correct? And why does my code not get in there when I pass my vals and odds to the subroutine as a hash rather than a file handle? (I've changed the code so that I can tell that my hash gets in there correctly and is converted to @vals and @odds as I expect.) I'd include more code except that it gets really long and, I think, just adds confusion to my question.

        Thanks.

        Eric
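
One thing that may be worth checking here: `return sub { ... }` only builds a code reference and hands it back, so the body between the braces does not run (and a debugger will not step into it) until that reference is called with `$pick->()`. The sketch below illustrates this with a hash-based variant. The name genPickerFromHash and the sample data are hypothetical, and the fallback return is an addition, so treat it as one possible shape and not as the code discussed above.

```perl
use strict;
use warnings;

# Hypothetical hash-based variant (illustrative name and data).
sub genPickerFromHash {
    my $oddsOf = shift;                 # hash ref: value => relative odds
    my @vals = sort keys %$oddsOf;      # fix an order for the keys
    my @odds = @{ $oddsOf }{ @vals };   # hash slice in the same order (copies)

    ## Normalise, then accumulate break points so the last one is ~1
    my $t = 0;
    $t += $_ for @odds;
    $_ /= $t for @odds;
    $odds[ $_ + 1 ] += $odds[ $_ ] for 0 .. $#odds - 1;

    # Nothing is picked here: this creates a closure over @vals and @odds
    # and returns a reference to it.
    return sub {
        my $r = rand();
        $r < $odds[ $_ ] and return $vals[ $_ ] for 0 .. $#odds;
        return $vals[ -1 ];             # added guard against rounding shortfall
    };
}

my %kmer_prob = ( A => 0.0001, B => 0.0004, C => 0.0008 );
my $pick = genPickerFromHash( \%kmer_prob );   # $pick holds a CODE ref only

my %count;
$count{ $pick->() }++ for 1 .. 10_000;         # each call runs the inner sub once
printf "%s: %d\n", $_, $count{ $_ } // 0 for sort keys %count;
```

Printing `ref $pick` should show CODE; the picked values only appear once `$pick->()` is invoked.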