I agree with Fletch:

This (math heavy statistics) is another of those heavy math areas where like the PDE question recently aren't really in perl's wheelhouse as far as off the shelf solutions go.

If this is usually done in R (and there is indeed an R package for this sort of thing), leave it there and write a Perl wrapper around it. In the past I tried to re-implement in Perl algorithms that already existed in R, and soon hit a dead end. My experience was that implementing one algorithm in Perl is fairly easy, but then what? You need to combine it with other metrics and algorithms. You need significance tests to verify your results. And, very importantly, you need to present the results graphically. R is very good at all of these, and most of them are implemented in C or Fortran for speed. Its graphics package, ggplot2, is one of the best around today. But it's like a lotus flower in a snake pit, the snake pit being the R programming environment.

What I ended up doing was to write bash wrappers: Unix-like utilities enclosing the procedures I needed at the time. (I can't remember why I didn't write Perl wrappers using Statistics::R. I would definitely write Perl wrappers today, though I have not tried Statistics::R extensively.) Each wrapper provided a standard CLI and created an R script on the fly (possibly from a template) to do what I wanted and produce some results: plots, CSV files, HTML tables. The generated script could be quite complex, in that it combined several of these algorithms in R and exchanged R data structures between them; the latter is so important.
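To make the idea concrete, here is a minimal sketch of such a wrapper in Perl rather than bash. It is not one of the original scripts. It assumes the `survival` package is installed in R, that `Rscript` is on the PATH, and that the input CSV has columns named `time` and `status`; the option names, the `{{...}}` placeholder syntax and the helper names are all made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# R template; {{input}} and {{output}} are placeholders filled in by Perl.
# Assumption: the input CSV has columns named 'time' and 'status'.
my $TEMPLATE = <<'END_R';
library(survival)
d   <- read.csv({{input}})
fit <- survfit(Surv(time, status) ~ 1, data = d)
s   <- summary(fit)
write.csv(data.frame(time = s$time, n.risk = s$n.risk,
                     n.event = s$n.event, survival = s$surv),
          file = {{output}}, row.names = FALSE)
END_R

# Turn a Perl string into an R double-quoted string literal.
sub r_quote {
    my ($s) = @_;
    $s =~ s/(["\\])/\\$1/g;    # escape double quotes and backslashes
    return qq{"$s"};
}

# Fill the template and return the R script as a string.
sub build_r_script {
    my (%opt) = @_;
    my %fill = map { $_ => r_quote($opt{$_}) } qw(input output);
    (my $script = $TEMPLATE) =~ s/\{\{(\w+)\}\}/$fill{$1}/g;
    return $script;
}

# Parse the CLI, write the generated script to a file and hand it to Rscript.
sub main {
    my @argv = @_;
    my %opt  = (output => 'km.csv');
    GetOptionsFromArray(\@argv, \%opt, 'input=s', 'output=s')
        or die "usage: $0 --input FILE [--output FILE]\n";
    die "--input is required\n" unless defined $opt{input};

    my $file = "km_$$.R";
    open my $fh, '>', $file or die "cannot write $file: $!";
    print {$fh} build_r_script(%opt);
    close $fh or die "cannot close $file: $!";

    system('Rscript', $file) == 0 or die "Rscript failed: $?";
    unlink $file;
}

# Run only when arguments are given, so the subs can also be loaded and tested.
main(@ARGV) if @ARGV;
```

Called as, say, `perl km.pl --input patients.csv --output km.csv`, this should leave the Kaplan-Meier table in `km.csv`. Keeping the R code in a template means the statistics stay in R, where the packages are, while Perl only handles the CLI and the plumbing.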

In doing the above I found the best of both worlds. Although I would love to have the wealth of CRAN in Perl, it's not "shameful" to use the right tool for the job. And always think big: the project may grow well beyond the language's limits.

15min Edit: an added bonus of working in R is that it has a quite good and easy framework for parallelising things. All my scripts have a -p num-threads CLI option.

bw, bliako

In reply to Re: Any perlish hints about Kaplan-Meier Estimator? by bliako
in thread Any perlish hints about Kaplan-Meier Estimator? by karlgoethebier
