ergowolf has asked for the wisdom of the Perl Monks concerning the following question:

I think this is an interesting problem. The New Jersey lottery this week is at $200 million. Since I am the only person who lives in Jersey, I was in charge of getting the tickets. Well, maybe I was the only person who would ADMIT they live in New Jersey. I lived in New York my whole life, so I had to take the cheap shot. I looked up their webpage at http://www.state.nj.us/lottery/bigrules.htm.

I would like to calculate the probability of winning the lottery. Here is a segment from the website:

The purchaser shall select or "Quick Pick" any five (5) numbers, from a range of consecutive numbers of 1 through 50 and one (1) number from a range of consecutive numbers of 1 through 36. Bet selections of less than or more than six (6) numbers will be impermissible.

I was never very good at math, and I was usually disrupting the class instead of paying attention anyway. I am probably ADD, but I digress. It's also scary how much this post looks like one of those dorky math problems I had to do, but perl makes it interesting. Here is what I remember about probability:

50 * 49 * 48 * 47 * 46 * 36 = $probability;

However, this is very boring and inflexible. I tried to write it in perl.

use strict;
use integer;

my $probability;
&permutation(50,5);
print $probability * 36;

sub permutation {
    my ($number, $times, $counter);
    ($number, $times) = @_;
    print "The times is $times and the number is $number.\n";
    my $counter = '1';
    while ($counter < $times) {
        $probability = $number * ($number - $counter);
        $counter++;
        $number--;
        print $number, "\n";
    }
    return $probability;
}

Unfortunately, $probability gets written over with each iteration.

If you want real bonus points calculate how much we could win AFTER taxes. I have 21 people signed up and we all put in $5.00. We are using quick pick and cash up front.

Ergowolf
Applying Perl to the Real World

Replies are listed 'Best First'.
Re: New Jersey Lottery Probability
by perlmonkey (Hermit) on May 04, 2000 at 07:33 UTC
    Okay, I have succumbed to my curiosity. Because this turned out to be a good example of our motto (TIMTOWTDI), I had to do a benchmark with all the possibilities:
    use strict;
    use Benchmark;

    timethese(1000000, {
        'p1' => sub { p1( 50, 5) },
        'p2' => sub { p2( 50, 5) },
        'p3' => sub { p3( 50, 5) },
        'p4' => sub { p4( 50, 5) },
        'p5' => sub { p5( 50, 5) },
        'p6' => sub { p6( 50, 5) },
        'p7' => sub { p7( 50, 5) },
    });

    #chromatic
    sub p1 {
        my $probability = 1;
        my ($number, $times) = @_;
        while ($times-- > 0) {
            $probability *= $number;
            $number--;
        }
        return $probability;
    }

    #btrott
    sub p2 {
        my($number, $times, $prob) = (@_, 1);
        $prob *= $number-- while $times--;
        $prob;
    }

    #perlmonkey
    sub p3 {
        my $number = shift;
        my $count = shift || return 1;
        return $number * p3($number-1,$count-1);
    }

    #perlmonkey
    sub p4 {
        return $_[1] ? $_[0] * p4($_[0]-1,$_[1]-1) : 1;
    }

    #chromatic
    sub p5 {
        my ($prob, $start) = (1, shift);
        $prob *= $start-- foreach (1 .. $_[0]);
        $prob;
    }

    #chromatic
    sub p6 {
        my $prob = 1;
        $prob *= $_ foreach ((($_[0]+1) - $_[1]) .. $_[0]);
        return $prob;
    }

    #chromatic
    sub p7 {
        my $prob = 1;
        return (map { $prob *= $_ } (($_[0] - $_[1] + 1) .. $_[0]))[-1];
    }

    And the results:
    Benchmark: timing 1000000 iterations of p1, p2, p3, p4, p5, p6, p7...
            p1:  61 wallclock secs ( 45.42 usr + 0.16 sys =  45.58 CPU)
            p2:  45 wallclock secs ( 35.92 usr + 0.11 sys =  36.03 CPU)
            p3: 127 wallclock secs (102.34 usr + 0.31 sys = 102.65 CPU)
            p4:  90 wallclock secs ( 71.84 usr + 0.31 sys =  72.15 CPU)
            p5:  58 wallclock secs ( 46.11 usr + 0.14 sys =  46.25 CPU)
            p6:  55 wallclock secs ( 43.51 usr + 0.15 sys =  43.66 CPU)
            p7:  85 wallclock secs ( 68.11 usr + 0.22 sys =  68.33 CPU)

    And the winner is ... btrott with the slimmed down version from chromatic.
    And the loser is... me as I suspected, recursion is slow.
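Before trusting timings like these, it can help to confirm that the variants agree on the answer. The following is a minimal sketch, not the benchmark actually run above: the sub names `p_loop`, `p_recursive`, and `p_range` are made up for illustration, and `cmpthese` from the core Benchmark module is swapped in for `timethese` because it prints a comparison table.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Loop version, in the style of btrott's p2.
sub p_loop {
    my ($number, $times, $prob) = (@_, 1);
    $prob *= $number-- while $times--;
    return $prob;
}

# Recursive version, in the style of perlmonkey's p4.
sub p_recursive {
    my ($number, $times) = @_;
    return $times ? $number * p_recursive($number - 1, $times - 1) : 1;
}

# Range version, in the style of chromatic's p6.
sub p_range {
    my ($number, $times) = @_;
    my $prob = 1;
    $prob *= $_ for ($number - $times + 1) .. $number;
    return $prob;
}

my %variant = (
    loop      => \&p_loop,
    recursive => \&p_recursive,
    range     => \&p_range,
);

# Check correctness first: 50 * 49 * 48 * 47 * 46 = 254251200.
for my $name (sort keys %variant) {
    my $got = $variant{$name}->(50, 5);
    die "$name returned $got\n" unless $got == 254251200;
}

# Then compare speed; the iteration count is an arbitrary choice.
my %timed = map { my $f = $variant{$_}; ($_ => sub { $f->(50, 5) }) } keys %variant;
cmpthese(200_000, \%timed);
```

The absolute numbers will differ from the 2000-era figures above; only the relative ordering is worth comparing.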
Re: New Jersey Lottery Probability
by chromatic (Archbishop) on May 03, 2000 at 23:31 UTC
    Here's my redesign. You're reassigning (50*49), (49*48), (48*47)... to $probability every time through the loop.
    #!/usr/bin/perl -w
    use strict;
    # use integer;

    my $probability = permutation(50,5);
    print ($probability * 36);

    sub permutation {
        my $probability = 1;
        my ($number, $times) = @_;
        print "The times is $times and the number is $number.\n";
        while ($times-- > 0) {
            $probability *= $number;
            $number--;
        }
        return $probability;
    }

    Commenting out 'use integer' does slow it down. In its defense, it does avoid a type overflow which gave me an incorrect result of 563108608.
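The incorrect figure is consistent with 32-bit wraparound: the exact product no longer fits in 32 bits, and reducing it modulo 2**32 gives the number quoted. Here is a minimal sketch of that check, assuming the overflow happened at 32 bits (the post does not say). Math::BigInt is used so the arithmetic is exact on any platform.

```perl
use strict;
use warnings;
use Math::BigInt;

# Exact value of 50 * 49 * 48 * 47 * 46 * 36.
my $exact = Math::BigInt->new(1);
$exact *= $_ for 46 .. 50;
$exact *= 36;

# What is left if only the low 32 bits survive (assumed overflow width).
my $wrapped = $exact->copy->bmod(Math::BigInt->new(2)->bpow(32));

print "exact:          $exact\n";
print "low 32 bits:    $wrapped\n";
```

On a Perl built with 64-bit integers the original script would not overflow at all, so the wrong result may be hard to reproduce today.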

    For what it's worth, I have a different type of lottery script generating data at my homepage. I can post it, as it might be interesting.

Re: New Jersey Lottery Probability
by btrott (Parson) on May 04, 2000 at 02:00 UTC
    Heh, here's an even shorter version of the permutation function, which is a lot like chromatic's, really, but just "compressed":
    sub permutation { my($number, $times, $prob) = (@_, 1); $prob *= $number-- while $times--; $prob; }
Re: New Jersey Lottery Probability
by perlmonkey (Hermit) on May 04, 2000 at 03:25 UTC
    Recursion anyone?
    #!/usr/bin/perl
    use strict;

    my $probability = permutation(50, 5);
    print $probability * 36;

    sub permutation {
        my $number = shift;
        my $count = shift || return 1;
        return $number * permutation($number-1,$count-1);
    }

    Or to do my best at obfuscation ... the ever popular one liner:
    sub permutation { return $_[1] ? $_[0] * permutation($_[0]-1,$_[1]-1) : 1; }
    Useless, but fun anyway.
      No need for recursion, here's my obfuscation:
      sub permutation { my ($prob, $start) = (1, shift); $prob *= $start-- foreach (1 .. $_[0]); $prob; }
      Another approach that requires thinking a bit backwards:
      sub permutation { my $prob = 1; $prob *= $_ foreach ((($_[0]+1) - $_[1]) .. $_[0]); return $prob; }
      Even shorter and harder on the eyes:
      sub permutation { my $prob = 1; return (map { $prob *= $_ } (($_[0] - $_[1] + 1) .. $_[0]))[-1]; }
      I stand by the first I posted, though. Anyone who can get this down to a single statement gets my respect.
        You can make the last one even more useless..
        sub p { return (map { $a ? ($a *= $_) : ($a = $_) } (($_[0] - $_[1] + 1) .. $_[0]))[-1]; }
        The little ?: operator is great for making code unreadable :)

        For those of you who are not familiar with ?: it is the 'conditional operator' and can be found in the perlop pages.
        I like the last one; map is always a good one to whip out. I bet it is the fastest one also.

        Of course it is true that recursion is not needed. It rarely is needed. And this was a simple loop, so in a normal application: use a simple loop, not recursion. I did it because I felt like being a smart-ass :) I would not recommend using recursion where you did not absolutely have to.

        Also recursion can be slower, nearly impossible to debug, and all around confusing ... but that is why it is fun for academic exercises!

        For 'real' code I would go with your code or btrott's slimmed down version.

        Just thought I would clarify for users that aren't hip to recursion.
RE: New Jersey Lottery Probability
by ergowolf (Monk) on May 04, 2000 at 13:44 UTC
    I liked chromatic's and btrott's code, and perlmonkey did some good benchmarking (I plan on using your example to test some of the perl blackjack code). Well, now I have about a hundred and fifty numbers in my bag to take to work. I am going to write another program to look through all the numbers for a match this morning. If anyone wants to take a stab at writing the program, I will post what I come up with later today. It would be really cool to compare the numbers and tell me which card had how many number matches. For example:

    card 2 had two numbers match: 34 45
    card 56 had three numbers match: 15 23 45
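
One way to produce a report like that is sketched below. Everything in it is hypothetical: the winning numbers, the ticket data, and the `matches` helper are made up for illustration, and the separate sixth number is ignored.

```perl
use strict;
use warnings;

# Hypothetical data: neither the draw nor the tickets are real.
my @winning = (15, 23, 34, 45, 50);
my %tickets = (
    2  => [ 3, 12, 34, 45, 49 ],
    56 => [ 15, 23, 45, 46, 47 ],
    57 => [ 1, 2, 3, 4, 5 ],
);

# Return the numbers a ticket shares with the draw, in ascending order.
sub matches {
    my ($ticket, $draw) = @_;
    my %drawn = map { $_ => 1 } @$draw;
    my @hit = sort { $a <=> $b } grep { $drawn{$_} } @$ticket;
    return @hit;
}

# Report every card with at least one match, in card-number order.
for my $card (sort { $a <=> $b } keys %tickets) {
    my @hit = matches($tickets{$card}, \@winning);
    next unless @hit;
    printf "card %d had %d number(s) match: %s\n", $card, scalar @hit, "@hit";
}
```

Building a hash from the draw keeps each lookup cheap, which matters little for 150 tickets but keeps the matching code short.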

    I am also planning on having it pull the winning numbers off the web and email my pager with the results (see the "Ebay, perl and my dad" post).

    Ergowolf Does code make a sound if no one is there to type it?
Re: New Jersey Lottery Probability
by ergowolf (Monk) on May 03, 2000 at 23:53 UTC
    Chromatic,
    I am impressed with how fast you came up with a solution. What was that, like five minutes? You will definitely get one of my votes tomorrow. I thought it was an interesting problem. I will look at your code. I am always interested in seeing how people write their programs. I like the simplicity of your code, too. If anyone is interested, they can still calculate how much I will win. My odds are 9153043200 to 1. Well, the office has about 150 tickets, so the odds go down a sliver. Wish me luck!

    Ergowolf Does code make a sound if no one is there to type it?
      Don't approach playing the lottery like this: "Gee, my odds are 9 billion to 1." That's just wrong. You can improve your chances by playing the odds. Sure, winning the lottery will always be a matter of luck, but with some thought you can IMPROVE your odds somewhat. Check out "The Law of Large Numbers" in Ars Conjectandi by Jacob Bernoulli (sound familiar?), published in 1713, eight years after his death.

      Just looking at the primary part of the game, the 5 numbers between 1..50: what are your chances of drawing an even number or an odd number as the first draw? Well, there are 25 evens and 25 odds, so it's 50% that you will draw an odd or even number. What are the chances you will draw ALL even or ALL odd? Very unlikely. You'll find that if you look at the history of that lottery game, 2 odd/3 even is the most prevalent pattern.

      What if you break it down more? What if you took the 50 numbers and broke them down into three groups: group A (17 balls, 1..17), group B (17 balls, 18..34), and group C (16 balls, 35..50)? Initially, the odds favor drawing a ball from group A or B, simply because there is one extra ball in each of those groups. Going against the odds, say a ball is drawn from group C. Now what are the odds of drawing a ball from group A or group B? Better, eh? By calculating how many number combinations are possible for each group, you can calculate the odds for each type being drawn on a given night. There are 21 different ways 5 numbers can be drawn from 3 different number groups.

      Moreover, you can time your bets. Going back to the odd/even theory, if you know that an all-odd or all-even drawing takes place on average about once every 1000 draws and that hasn't happened, the chance that this happens increases (even though it still remains small). The balls in the lottery drawings are physical things. When there are more odd, for example, these odd numbers bounce around and get in the way of the even numbers.

      Don't take my word for it. Examine as many previous drawings as possible and see how many all odd/even, 4 odd/1 even, and 3 odd/2 even drawings there are. See how many times all five numbers appeared in the group A numbers and how many times they were distributed across the three groups. You may improve your odds by doing so.
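
The counting described above can be sketched directly. This is a minimal example, assuming 5 balls drawn without replacement from 1..50 (25 even, 25 odd) as in the post; the helper names `choose` and `p_even_count` are made up for illustration. It prints the share of all possible draws with each even/odd split and counts the A-B-C group patterns.

```perl
use strict;
use warnings;

# Exact binomial coefficient C(n, k); every intermediate value is an integer.
sub choose {
    my ($n, $k) = @_;
    return 0 if $k < 0 || $k > $n;
    my $c = 1;
    $c = $c * ($n - $k + $_) / $_ for 1 .. $k;
    return $c;
}

# Share of all draws in which exactly $k of the $drawn balls are even.
sub p_even_count {
    my ($k, $even, $odd, $drawn) = @_;
    return choose($even, $k) * choose($odd, $drawn - $k)
         / choose($even + $odd, $drawn);
}

# Assumption from the post: numbers 1..50, five balls per drawing.
my ($even, $odd, $drawn) = (25, 25, 5);
for my $k (0 .. $drawn) {
    printf "%d even/%d odd: %.1f%%\n",
        $k, $drawn - $k, 100 * p_even_count($k, $even, $odd, $drawn);
}

# Every way to split the five balls across groups A, B and C.
my @patterns;
for my $in_a (0 .. $drawn) {
    for my $in_b (0 .. $drawn - $in_a) {
        push @patterns, join '-', $in_a, $in_b, $drawn - $in_a - $in_b;
    }
}
print scalar(@patterns), " group patterns\n";
```

Under these assumptions roughly 65% of all possible draws have a 2/3 or 3/2 even/odd split, simply because more combinations have that shape.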
        Sorry about the formatting.

        To illustrate, I slapped some extra code into that powerball program I found here.

        #!/usr/bin/perl -w
        
        use strict;
        use LWP::Simple;
        
        my (@numbers, %normals, %powers, %chiral );
        
        my $content;
        
        unless (defined ($content = get('http://www.powerball.com/results/pbhist.txt'))) {
            die "Cannot get PB history.\n";
        }
        
        @numbers = split /\n/, $content;
        
        my @data;
        
        foreach my $line (@numbers) {
            next if ($line =~ /^!/);
            @data = split(/\s/, $line);
            shift @data;        # throw away the date
        
            $powers{pop @data}++;
            
            my $drawing_odd  = 0;
            my $drawing_even = 0;
        
            my $group_a = 0;
            my $group_b = 0;
            my $group_c = 0;
        
        
            $chiral{total}++;
        
            $chiral{totalnums} += 5;
        
            foreach (@data) {
                if ($_ % 2 == 0) {
                    $chiral{even}++;
                    $drawing_even++;
        
                } else {
                    $chiral{odd}++;
                    $drawing_odd++;
        
                }
        
                if ($_ < 17) {
                    $chiral{group_a}++;
                    $group_a++;
        
                } elsif ($_ < 33) {
                    $chiral{group_b}++;
                    $group_b++;
        
                } else {
                    $chiral{group_c}++;
                    $group_c++;
        
                }
        
            }
        
            $chiral{"${drawing_even}_even"}++;
            $chiral{"$group_a-$group_b-$group_c"}++;
            
        
        
            foreach (@data) {
        #       print "Normal: $_\n";
                $normals{$_}++;
            }
        }
        
        print "Normal Pick Rate:\n\n";
        
        my @norm_sort = sort { $normals{$a} <=> $normals{$b} } keys %normals;
        
        foreach (@norm_sort) {
            print "$_ :\t($normals{$_})\t", "*" x $normals{$_}, "\n";
        }
        
        print "\nPower Pick Rate:\n\n";
        
        my @power_sort = sort { $powers{$a} <=> $powers{$b} } keys %powers;
        
        foreach (@power_sort) {
            print "$_ :\t($powers{$_})\t", "*" x $powers{$_}, "\n";
        }
        print "\nNormal Picks:\t";
        
        print join(" ", sort (@norm_sort[0 .. 11])), "\n";
                              
                              
        print "\nPower Picks:\t";
        
        print join(" ", sort (@power_sort[0 .. 3])), "\n";
        
        print "\nOdd/Even:\n";
        printf "\t ODD: %d (%0.1f%%)\n", $chiral{odd}, ($chiral{odd}/$chiral{totalnums})*100;
        printf "\tEVEN: %d (%0.1f%%)\n", $chiral{even}, ($chiral{even}/$chiral{totalnums})*100;
        
        printf "\n0 Even/5 Odd: %d (%0.1f%%)\n", 
            $chiral{"0_even"}, 
            ($chiral{"0_even"}/$chiral{total})*100;
        
        printf "1 Even/4 Odd: %d (%0.1f%%)\n", 
            $chiral{"1_even"}, 
            ($chiral{"1_even"}/$chiral{total})*100;
        
        printf "2 Even/3 Odd: %d (%0.1f%%)\n", 
            $chiral{"2_even"}, 
            ($chiral{"2_even"}/$chiral{total})*100;
        
        printf "3 Even/2 Odd: %d (%0.1f%%)\n", 
            $chiral{"3_even"}, 
            ($chiral{"3_even"}/$chiral{total})*100;
        
        printf "4 Even/1 Odd: %d (%0.1f%%)\n", 
            $chiral{"4_even"}, 
            ($chiral{"4_even"}/$chiral{total})*100;
        
        printf "5 Even/0 Odd: %d (%0.1f%%)\n", 
            $chiral{"5_even"}, 
            ($chiral{"5_even"}/$chiral{total})*100;
        
        printf "\n\nGroup A: %d (%0.1f%%)\n",
            $chiral{"group_a"},
            ($chiral{"group_a"}/$chiral{totalnums})*100;
        
        printf "\n\nGroup B: %d (%0.1f%%)\n",
            $chiral{"group_b"},
            ($chiral{"group_b"}/$chiral{totalnums})*100;
        
        printf "\n\nGroup C: %d (%0.1f%%)\n",
            $chiral{"group_c"},
            ($chiral{"group_c"}/$chiral{totalnums})*100;
        
        foreach my $k (sort {$chiral{$a} <=> $chiral{$b}} keys %chiral) {
            my $v = $chiral{$k};
        
            if ($k =~ /^\d\-/) {
                printf "\n\n%s: %d (%0.1f%%)\n",
                $k, $v, ($v/$chiral{total})*100;
            }
        }
        
        print "\nDisclaimer:\n\tThis is not statistically accurate, except in that the drawings are guaranteed.\n",
            "This is just a quick frequency analysis making no pretenses as to predictive accuracy.\n";
           
        


        Ok, it's not beautiful code but it works and illustrates my point. Let's examine the output:

        Odd/Even:
                 ODD: 677 (50.5%)
                EVEN: 663 (49.5%)
        
        0 Even/5 Odd: 7 (2.6%)
        1 Even/4 Odd: 38 (14.2%)
        2 Even/3 Odd: 96 (35.8%)
        3 Even/2 Odd: 81 (30.2%)
        4 Even/1 Odd: 40 (14.9%)
        5 Even/0 Odd: 6 (2.2%)
        
        
        Group A: 464 (34.6%)
        
        
        Group B: 417 (31.1%)
        
        
        Group C: 459 (34.3%)
        
        
        0-5-0: 1 (0.4%)
        
        
        0-4-1: 3 (1.1%)
        
        
        1-4-0: 3 (1.1%)
        
        
        0-1-4: 3 (1.1%)
        
        
        4-1-0: 4 (1.5%)
        
        
        4-0-1: 7 (2.6%)
        
        
        0-2-3: 8 (3.0%)
        
        
        3-2-0: 9 (3.4%)
        
        
        1-0-4: 9 (3.4%)
        
        
        0-3-2: 9 (3.4%)
        
        
        2-0-3: 11 (4.1%)
        
        
        2-3-0: 14 (5.2%)
        
        
        3-0-2: 16 (6.0%)
        
        
        3-1-1: 20 (7.5%)
        
        
        1-3-1: 20 (7.5%)
        
        
        1-1-3: 26 (9.7%)
        
        
        1-2-2: 33 (12.3%)
        
        
        2-2-1: 34 (12.7%)
        
        
        2-1-2: 38 (14.2%)
        
        
        

        As you can see, you increase your odds by playing 2 even/3 odd (the most frequent split in the output above) and playing 1-2-2, 2-2-1, or 2-1-2 (e.g. 1-2-2 means 1 ball from the first group, 2 balls from the second group, and 2 balls from the third group).


        What do you think of this? :)
RE: New Jersey Lottery Probability
by Anonymous Monk on May 04, 2000 at 02:19 UTC
    > 50 * 49 * 48 * 47 * 46 * 36 = $probability;

    That's not the likelihood of choosing the right 6; it's the likelihood of choosing the right 6 _in a particular order_.

    -- Malcolm
      You are of course correct: it's calculating permutations, not combinations. The following sorts that out. No, I'm not going to do all the obfuscation stuff on it :-0

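      #!/usr/bin/perl -w
      use strict;

      my ($number, $times) = (50, 5);

      # The sixth number is drawn from its own pool of 36, so its order never
      # mixes with the first five: multiply by 36, with no further division.
      print combination($number, $times) * 36, "\n";

      # Number of ways to choose $times items out of $number, ignoring order.
      sub combination {
          my ($prob, $number, $times) = (1, @_);
          while ($times) {
              $prob *= ($number - $times + 1) / $times--;
          }
          return $prob;
      }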
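
For reference, here is the arithmetic worked by hand, assuming the rules quoted in the original question (five numbers from 1 through 50 in any order, plus one number from 1 through 36). Dividing the ordered count by 5! removes the orderings of the first five numbers.

```latex
\[
\binom{50}{5} \times 36
  = \frac{50 \cdot 49 \cdot 48 \cdot 47 \cdot 46}{5!} \times 36
  = \frac{254{,}251{,}200}{120} \times 36
  = 2{,}118{,}760 \times 36
  = 76{,}275{,}360
\]
```

So a single ticket has a 1 in 76,275,360 chance of matching all six numbers, rather than 1 in 9,153,043,200.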