Warning: This is just for discussion and I am no mathematician. I know that I must have made some serious mistakes below, and I'm not trying to prove any point other than that the numbers in the news are usually BS.

I was reading through a /. article which links to a PDF file that purports to show that it is extremely improbable that Buchanan could have gotten the number of votes that he did. Basically, it compares the ratio of Buchanan/Bush votes (I suppose on the theory that the more conservatives in an area, the more likely both are to get votes) and claims that Palm Beach County is out of whack.

My boss, a die-hard Republican, sent me an e-mail where some "statistician proves" that it is mathematically impossible for Gore to win a recount (My rather long e-mail pointed out many flaws with the reasoning, starting with some basic addition errors that the "statistician" made).

Now, don't get me wrong, I think there were some problems with the vote. But I am getting tired of people just pulling numbers out of thin air. So to continue the trend, I decided to play amateur proctologist and find some numbers of my own with a flashlight and a pair of gloves (really, you don't want to visualize that).

What follows is a terribly written hack that is a quickie combination of two scripts that I wrote. It reads a list of Florida counties from the CNN Web site and writes them to a file. Then it reopens the file (I told you it was a bad hack!), reads the data back in, and uses the county list to pull the vote statistics from the CNN Web site. Then it uses the Statistics::Descriptive module to figure out the mean (average) of Buchanan's and Gore's percentages of the vote per county, along with the standard deviation of each candidate's percentages. From there, I compare these figures with the figures from the disputed Palm Beach County to see how many standard deviations from the mean it falls, and whether that suggests a voting discrepancy.
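Stripped of the screen-scraping, the statistical step is tiny. Here is a minimal sketch of just that part, run on made-up percentages rather than the CNN data. The sub names and the numbers are placeholders of mine, and it uses List::Util instead of Statistics::Descriptive so that it runs on a stock Perl. Treat it as an illustration of the arithmetic, nothing more.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw( sum );

# Arithmetic mean of a list of numbers.
sub mean {
    my @values = @_;
    return sum( @values ) / @values;
}

# Sample standard deviation (divides by n - 1), which is what
# Statistics::Descriptive's standard_deviation() reports.
sub sample_sdev {
    my @values = @_;
    my $mean   = mean( @values );
    my $sum_sq = sum( map { ( $_ - $mean ) ** 2 } @values );
    return sqrt( $sum_sq / ( @values - 1 ) );
}

# How many standard deviations $target sits from the mean of @values.
sub sdevs_from_mean {
    my ( $target, @values ) = @_;
    return ( $target - mean( @values ) ) / sample_sdev( @values );
}

# Hypothetical per-county vote shares, NOT real election data.
my @shares = ( 0.40, 0.45, 0.50, 0.55, 0.60 );
my $target = 0.60;

printf "mean: %.5f\n", mean( @shares );
printf "sdev: %.5f\n", sample_sdev( @shares );
printf "standard deviations from mean: %.5f\n",
    sdevs_from_mean( $target, @shares );
```

Note that this treats every county as one equally weighted data point, no matter how many people voted there, which is exactly what the full script does too.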

#!C:\perl\bin\perl.exe -w
use strict;
use LWP::Simple;
use Data::Dumper;
use Statistics::Descriptive;

my $buch_stat = Statistics::Descriptive::Full->new();
my $gore_stat = Statistics::Descriptive::Full->new();
my $counties  = "counties.dat";

use constant URL => 'http://www.cnn.com/ELECTION/2000/results/FL/';
my $url_data = get( URL );
my @data = split /\n/, $url_data;

open OUT, ">$counties" or die $!;
foreach my $line ( @data ) {
    print OUT "$1|$2\n" if $line =~ /<option value="(\d+)">([^<]+)<\/option>/;
}
close OUT;

use constant BASE_URL => 'http://www.cnn.com/ELECTION/2000/results/FL/';
my @candidates = qw( Gore Bush Nader Browne Buchanan Hagelin Phillips );

open COUNTIES, "<$counties" or die "Could not open $counties: $!";
my ( %votes, @gore_percent, @buchanan_percent );

while (<COUNTIES>) {
    my ( $number, $county ) = split /\|/;
    chomp $county;
    print "Processing $county:";
    my $url_data = get( BASE_URL . $number );
    my $county_total = 0;
    for my $candidate ( @candidates ) {
        if ( $url_data =~ /$candidate[^\d]+(\d+,?(?:\d+)?)/s ) {
            my $vote_count = $1;
            $vote_count =~ s/,//;
            $votes{ $candidate }{ $county } = $vote_count;
            $votes{ $candidate }{ 'Total' } += $vote_count;
            $county_total += $vote_count;
        }
    }
    my $buch_percent = $votes{ 'Buchanan' }{ $county } / $county_total;
    my $gore_percent = $votes{ 'Gore' }{ $county } / $county_total;
    push @buchanan_percent, $buch_percent;
    push @gore_percent, $gore_percent;
    print "\tGore: " . ( sprintf "%.3f", $gore_percent*100 ) . "%";
    print "\tBuch: " . ( sprintf "%.3f", $buch_percent*100 ) . "%\n";
}

$buch_stat->add_data( @buchanan_percent );
my $buch_mean = $buch_stat->mean();
my $buch_sdev = $buch_stat->standard_deviation();

$gore_stat->add_data( @gore_percent );
my $gore_mean = $gore_stat->mean();
my $gore_sdev = $gore_stat->standard_deviation();

my $target_county = "Palm Beach";
my $target_total  = 0;
foreach my $candidate ( @candidates ) {
    $target_total += $votes{ $candidate }{ $target_county };
}

my $target_buch = $votes{ 'Buchanan' }{ $target_county };
my $target_gore = $votes{ 'Gore' }{ $target_county };

my $target_buch_percent = $target_buch / $target_total;
my $target_gore_percent = $target_gore / $target_total;

my $buch_diff = $target_buch_percent - $buch_mean;
my $gore_diff = $target_gore_percent - $gore_mean;

print "\n$target_county\tTotal votes: $target_total\n\t" .
      "Buchanan votes: $target_buch\n\tGore votes: $target_gore\n";

print "Buchanan \% in $target_county: " . (sprintf "%.5f", $target_buch_percent*100) . "%\n" .
      "\tDifference between this percent and his mean: " . (sprintf "%.5f", $buch_diff*100) . "%\n" .
      "\tStandard deviations: " . (sprintf "%.5f", $buch_diff / $buch_sdev) . "\n";

print "Gore in \% $target_county: " . (sprintf "%.5f", $target_gore_percent*100) . "%\n" .
      "\tDifference between this percent and his mean: " . (sprintf "%.5f", $gore_diff*100) . "%\n" .
      "\tStandard deviations: " . (sprintf "%.5f", $gore_diff / $gore_sdev) . "\n";

The result?
Palm Beach      Total votes: 432695
        Buchanan votes: 3407
        Gore votes: 269696
Buchanan % in Palm Beach: 0.78739%
        Difference between this percent and his mean: 0.31976%
        Standard deviations: 0.99817
Gore in % Palm Beach: 62.32935%
        Difference between this percent and his mean: 19.70633%
        Standard deviations: 2.16736
Oh my goodness! Gore is more than two standard deviations from the mean! There's only about a 2% chance of that! Call the presses! Obviously Gore's the cheater!

That, of course, is total BS and these numbers are completely unscientific. But I'm willing to bet that a lot of my Republican friends would accept these numbers at face value because they fit what they want to believe!

Really, folks. Most of the numbers being tossed around (just like mine) are useless. Don't let 'em fool ya. And don't accept something just because it fits what you believe.

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just go to the link and check out our stats.


In reply to Perl vs. Buchanan by Ovid
