cmr72 has asked for the wisdom of the Perl Monks concerning the following question:

Hello All, I would like to pipe the output of a random number generator into a script that will produce three types of kernel estimates: Uniform, Triangular, and Epanechnikov. I believe that there is a module on CPAN for Epanechnikov, but I would need to produce the other two manually. It seems that I am not only inferior in Perl but also in math, because I cannot for the life of me figure out how to replicate the formulas in Perl: http://en.wikipedia.org/wiki/Kernel_(statistics)

My code for the random number generator is below:
#!/usr/bin/perl -w
use strict ;
my $samplesize = $ARGV[0] ;
my $max = $ARGV[1] ;
chomp(my $prog = `basename $0`) ;
if ($#ARGV < 1) {
    print "\nUsage: ./randomnumber.pl <samplesize> <max num>\n" ;
    exit ;
}
for (my $i = 0; $i<$samplesize; $i++) {
    my $range = $max ;
    my $random_number = int(rand($range)) - $max/2;
    print "$random_number \n";
}
I do not have any real code for the kernel as I am somewhat lost in even starting out.
#!/usr/bin/perl -w
use strict ;
my $tria ;
my $epan ;
my $unif ;
while ( <STDIN> ) {
    my @line= split ;
    my $point = $line[0] ;
    $tria = 1-abs($point) ; #abs($point)<=1 ? $tria = 1-abs($point) : $tria = 0 ;
    $epan = .75*(1-$point*$point) ; #abs($point)<=1 ? $epan = .75*(1-$point*$point) : $epan = 0 ;
    $unif = .5 ; #abs($point)<=1 ? $unif = .5 : $unif = 0 ;
    print "$point $tria $epan $unif\n" ;
}
Thanks ahead for your help.

Replies are listed 'Best First'.
Re: Kernel Density Estimation
by Mr. Muskrat (Canon) on Aug 29, 2012 at 20:25 UTC
Re: Kernel Density Estimation
by philiprbrenan (Monk) on Aug 29, 2012 at 20:24 UTC

    Before you can get to coding some Perl, it's going to be necessary to clarify the problem to be solved.

    It seems that you have three kernel functions (let's pick one and call it K(u)) and some random data, which I assume you are going to use to simulate an arbitrary function F(u).

    Now you need to find out how you are going to convolve the two: are you going to integrate their product over all u, or perhaps F*F*K, or what?

    Once this part of the problem is clearer it will be relatively easy to simulate the necessary integrals via a finite summation - Perl is excellent for these kinds of tasks.

    So please clarify the computation required and we can go from there.
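For reference, the textbook kernel density estimate replaces the integral with a finite sum over the data: f(x) = (1/(n*h)) * sum of K((x - x_i)/h). The sketch below is one way to write that in Perl for the three kernels named in the question. It is not taken from any post in this thread; the sample values, the bandwidth `$h`, and the names `%kernel` and `kde` are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Kernel functions K(u); each one is zero outside |u| <= 1.
my %kernel = (
    uniform      => sub { my $u = shift; abs($u) <= 1 ? 0.5                  : 0 },
    triangular   => sub { my $u = shift; abs($u) <= 1 ? 1 - abs($u)          : 0 },
    epanechnikov => sub { my $u = shift; abs($u) <= 1 ? 0.75 * (1 - $u * $u) : 0 },
);

# Density estimate at $x: (1 / (n * h)) * sum over the data of K(($x - $xi) / $h).
# $h is the bandwidth, a smoothing parameter the caller has to choose.
sub kde {
    my ($k, $x, $h, $data) = @_;
    my $sum = 0;
    $sum += $k->(($x - $_) / $h) for @$data;
    return $sum / (scalar(@$data) * $h);
}

# Hypothetical sample and bandwidth, for illustration only.
my @data = (-2, -1, 0, 1, 3);
my $h    = 2;

# Print the estimated density at x = 0 for each kernel.
for my $name (sort keys %kernel) {
    printf "%-13s f(0) = %.4f\n", $name, kde($kernel{$name}, 0, $h, \@data);
}
```

To feed it the generator's output instead, the hard-coded `@data` could be replaced by values read from STDIN. The choice of `$h` strongly affects how smooth the estimate looks, so it is worth experimenting with.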

      Thank you. I am trying to mimic the output from this website: http://www.wessa.net/rwasp_density.wasp, where you can dump a set of data and it will calculate the kernel estimate. So I guess to answer the question, all of the data 'u' should pass through f(u).
      Hi. I looked at the CPAN module again and realized that the code was available for Epanechnikov:
      0.75*(1-((x-m)/s)**2)/s   if abs( (x-m)/s ) < 1
      0                         otherwise
      However, I don't see what 'm' and 's' are if my only input is 'x'.
        I took a stab at it, thinking that 's' was the sum and 'm' was the previous value for 'x'.

        I am pretty certain that this is incorrect.

        #!/usr/bin/perl -w
        use strict ;
        my %epan ;
        my @data ;
        my $sum ;
        while ( <STDIN> ) {
            my @line= split ;
            push (@data,$line[0]) ;
        }
        #Epanechnikov: 0.75*(1-((x-m)/s)**2)/s if abs( (x-m)/s ) < 1, otherwise 0
        for ( my $i = 0 ; $i <= $#data ; $i++ ) {
            $sum+=$data[$i] ;
            if ($sum !=0 && abs(($data[$i]-$data[$i-1])/$sum) < 1) {
                $epan{$i}=0.75*(1-(($data[$i]-$data[$i-1])/$sum)**2)/$sum ;
            } else {
                $epan{$i} = 0 ;
            }
        }
        for ( my $i = 0 ; $i <= $#data ; $i++ ) {
            print "$data[$i] $epan{$i}\n" ;
        }
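One plausible reading of the quoted formula, which nothing in this thread confirms against the module's documentation, is that 'm' is the location where a kernel is centered and 's' is its width (the bandwidth), rather than a running sum or a previous value. Under that assumption, each data point serves as a center 'm' in turn, and the estimate at 'x' is the average of the kernels. The sample data, the bandwidth, and the names `epan` and `density` below are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Epanechnikov kernel centered at $m with width $s, following the quoted formula.
sub epan {
    my ($x, $m, $s) = @_;
    my $u = ($x - $m) / $s;
    return abs($u) < 1 ? 0.75 * (1 - $u**2) / $s : 0;
}

# Assumption: every data point acts as a center $m, and $s is a
# bandwidth chosen by the user. The estimate is the mean of the kernels.
sub density {
    my ($x, $s, @data) = @_;
    my $sum = 0;
    $sum += epan($x, $_, $s) for @data;
    return $sum / scalar(@data);
}

my @data = (-2, -1, 0, 1, 3);   # hypothetical sample
my $s    = 2;                   # hypothetical bandwidth

# Evaluate the estimate on a small grid of x values.
for my $x (-3 .. 3) {
    printf "%2d %.4f\n", $x, density($x, $s, @data);
}
```

The grid of x values is independent of the data: the estimate can be evaluated anywhere, which is how a smooth density curve gets drawn from a handful of points.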
Re: Kernel Density Estimation
by flexvault (Monsignor) on Aug 29, 2012 at 20:35 UTC

    Welcome cmr72,

    Have you looked at the 'rand' function:

    perl -E 'say int(rand(100));'
    Works much better than your random number generator.

    By the time I could answer you, the posts above were already better than the one I was going to suggest.

    Good Luck!

    "Well done is better than well said." - Benjamin Franklin