note
BrowserUk
<blockquote><i>... is there a reason you aren't using the Chi-square test?</i></blockquote>
<p>I'm not trying to solve the [http://itre.cis.upenn.edu/~myl/languagelog/archives/000651.html|sample] problem. That was just an example I found on the web and used as a test.
<p>I set out to solve the [481674|problem] of performing the FET using Perl. First I did it with [cpan://Math::Pari], but which gives a result of 8.070604647867604097576877668E-7030 in 26ms, but I was unsure about the accuracy. It also imposes a binary dependancy.
<code>
#! perl -slw
use strict;
use Benchmark::Timer;
use List::Util qw[ sum reduce ];
use Math::Pari qw[ factorial ];
$a=$b;
sub product{ reduce{ $a *= $b } 1, @_ }
sub FishersExactTest {
my @data = @_;
return unless @data == 4;
my @C = ( sum( @data[ 0, 2 ] ), sum( @data[ 1, 3 ] ) );
my @R = ( sum( @data[ 0, 1 ] ), sum( @data[ 2, 3 ] ) );
my $N = sum @C;
my $dividend = product map{ factorial $_ } grep $_, @R, @C;
my $divisor = product map{ factorial $_ } grep $_, $N, @data;
return $dividend / $divisor;
}
my $T = new Benchmark::Timer;
$T->start( '' );
print FishersExactTest 989, 9400, 43300, 2400;;
$T->stop( '' );
$T->report;
__END__
P:\test>MP-FET.pl
8.070604647867604097576877668E-7030
1 trial of _default ( 25.852ms total), 25.852ms/trial
</code>
<p>So, then I coded it using [cpan://Math::BigFloat]
<code>
#! perl -slw
use strict;
use Benchmark::Timer;
use List::Util qw[ reduce ];
use Math::BigFloat;
$a=$b;
sub product{ reduce{ $a *= $b } 1, @_ }
sub sum{ reduce{ $a += $b } 0, @_ }
sub FishersExactTest {
my @data = map{ Math::BigFloat->new( $_ ) } @_;
return unless @data == 4;
my @C = ( sum( @data[ 0, 2 ] ), sum( @data[ 1, 3 ] ) );
my @R = ( sum( @data[ 0, 1 ] ), sum( @data[ 2, 3 ] ) );
my $N = sum @C;
my $dividend = product map{ $_->bfac } grep $_, @R, @C;
my $divisor = product map{ $_->bfac } grep $_, $N, @data;
return $dividend / $divisor;
}
my $T = new Benchmark::Timer;
$T->start( '' );
print FishersExactTest 989, 9400, 43300, 2400;;
$T->stop( '' );
$T->report;
</code>
<p>But that ran for 20 minutes without producing any output before I killed it (I've set it running again now, and my machines fan has been thrashing at full speed for the last 25 minutes).
<p>Whilst I was waiting for the BigFloat version, I coded this version which attempts to reduce the size of the problem by eliminating (exactly common) factors:
<code>
sub FishersExactTest2 {
my @data = @_;
return unless @data == 4;
my @C = ( sum( @data[ 0, 2 ] ), sum( @data[ 1, 3 ] ) );
my @R = ( sum( @data[ 0, 1 ] ), sum( @data[ 2, 3 ] ) );
my $N = sum @C;
my %dividends; $dividends{ $_ }++ for map{ factors $_ } grep $_, @R, @C;
my %divisors; $divisors { $_ }++ for map{ factors $_ } grep $_, $N, @data;
for my $i ( keys %divisors ) {
if( exists $dividends{ $i } ) {
$divisors{ $i }--, $dividends{ $i }--
while $divisors{ $i } and $dividends{ $i };
delete $divisors { $i } unless $divisors { $i };
delete $dividends{ $i } unless $dividends{ $i };
}
}
my $dividend = product( map{ ( $_ ) x $dividends{ $_ } } keys %dividends );
my $divisor = product( map{ ( $_ ) x $divisors { $_ } } keys %divisors );
return $dividend / $divisor;
}
</code>
<p>This works well for values smallish values, but cannot handle the example I gave above (NV overflow).
<p>It was then I started thinking about how to eliminate more factors from the equation so as to reduce the size of the intermediate terms, and posted my SoPW. I think that [hv]'s solution of expanding all terms to their prime factorizations before performing the cancelling out will be a winner--but I haven't finished coding that yet.
<div class="pmsig"><div class="pmsig-171588">
<hr />
<font size=1 >
<div>Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.</div>
<div>Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?</div>
<div>"Science is about questioning the status quo. Questioning authority". </div>
<div>The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.</div>
</font>
</div></div>
481987
482064