PerlMonks  

Any perlish hints about Kaplan-Meier Estimator?

by karlgoethebier (Abbot)
on Nov 03, 2021 at 18:31 UTC (id://11138400)

karlgoethebier has asked for the wisdom of the Perl Monks concerning the following question:

Disclaimer: I have no idea.

The background is that my son is forced to observe how some parasitic wasps behave under the influence of some «chemical» insecticides vs. some «biological stuff». In other words, and very simplified: watch when they die and plot it.

The uninitiated may take a look at Kaplan-Meier Estimator.
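
For readers who do not want to follow the link, the estimator is usually written as below. The notation is an assumption made for this note, not something given in the post: the t_i are the distinct times at which at least one death was observed, d_i is the number of deaths at t_i, and n_i is the number of individuals still alive and under observation just before t_i.

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

Individuals that leave the experiment without a recorded death (censored observations) count towards n_i up to the time they leave, but never towards d_i.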

This is currently done more or less in R, which is considered a pain in the ass.

Thanks for any inspiration. Regards, Karl

Update: Thanks to all for the kind replies for my somehow undef question. See also:

«The Crux of the Biscuit is the Apostrophe»

Replies are listed 'Best First'.
Re: Any perlish hints about Kaplan-Meier Estimator?
by Fletch (Bishop) on Nov 03, 2021 at 19:06 UTC

    Had some of our R people use a (I believe slightly tweaked locally) Statistics::R for some things at one point, but that's still leaving the heavy lifting and coding in R (which of course is just Lisp reimplemented poorly by mathematicians and statisticians</sarcasm mode="slight">). There's also Math::GSL::Statistics which I think I might have installed at one time for said R people to play with but I have only a marginal recollection of that. It's sitting on top of the GNU statistics library so it should be fairly performant but I think it's also going to be lower-level stuff you'd have to build up to the types of things shown in your reference.

    This (math-heavy statistics) is another of those areas which, like the recent PDE question, aren't really in Perl's wheelhouse as far as off-the-shelf solutions go.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: Any perlish hints about Kaplan-Meier Estimator?
by bliako (Monsignor) on Nov 04, 2021 at 10:03 UTC

    I agree with Fletch:

    "This (math-heavy statistics) is another of those areas which, like the recent PDE question, aren't really in Perl's wheelhouse as far as off-the-shelf solutions go."

    If this thing is usually done in R (and there is indeed an R package for this sort of thing), leave it as it is and create a Perl wrapper around it. In the past I tried to re-implement in Perl some algorithms that already exist in R, and soon hit a dead end. My experience was that one can implement one algorithm fairly easily in Perl, but then what? You need to combine it with other metrics and algorithms. You will need to do some significance tests to verify your results. Very importantly, you will also need to present the results graphically. R is very good at all of these, and most of them are implemented in C or Fortran for speed. The plotting package ggplot2 is one of the best available today. But it's like a lotus flower in a snake pit -- the R programming environment.
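
    To make the "one algorithm is fairly easy" point concrete, here is a minimal sketch of the product-limit estimate in plain Perl. Everything in it is an assumption made for illustration: the sub name kaplan_meier, the [time, status] input format (status 1 = death observed, 0 = censored) and the made-up data. It produces no confidence intervals, no significance tests and no plot, which is exactly the "but then what?" part.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Kaplan-Meier (product-limit) estimate.
# Input:  list of [time, status] pairs; status 1 = death observed, 0 = censored.
# Output: list of [time, at_risk, deaths, survival] rows,
#         one row per distinct time at which a death was observed.
sub kaplan_meier {
    my @obs = @_;
    my (%deaths, %leaving);
    for my $o (@obs) {
        my ($t, $status) = @$o;
        $leaving{$t}++;               # everyone leaves the risk set at their time
        $deaths{$t}++ if $status;     # only observed deaths count as events
    }

    my $at_risk  = scalar @obs;
    my $survival = 1;
    my @curve;
    for my $t (sort { $a <=> $b } keys %leaving) {
        if (my $d = $deaths{$t}) {
            $survival *= 1 - $d / $at_risk;
            push @curve, [$t, $at_risk, $d, $survival];
        }
        # Deaths and censorings at time $t both leave the risk set afterwards.
        $at_risk -= $leaving{$t};
    }
    return @curve;
}

# Made-up observations, purely illustrative.
my @wasps = ([2, 1], [3, 1], [3, 0], [5, 1], [7, 0], [8, 1]);

printf "%5s %7s %6s %8s\n", 'time', 'at_risk', 'deaths', 'S(t)';
printf "%5g %7d %6d %8.4f\n", @$_ for kaplan_meier(@wasps);
```

    A censoring that shares its time with a death is treated as happening just after the death, which is the usual convention.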

    What I ended up doing was to create bash wrappers, like Unix utilities, around the procedures I needed at the time. (I can't remember why not Perl wrappers using Statistics::R; I would definitely write Perl wrappers today, but I have not tried Statistics::R extensively.) Each wrapper provided a standard CLI and created an R script on the fly (possibly from a template) to do what I wanted and produce some results: plots, CSV files, HTML tables. The generated script could be quite complex, in that it combined several of these algorithms in R and exchanged R data structures between them; the latter is very important.
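
    The wrapper idea could be sketched in Perl roughly as follows. This is not the original bash tooling. The option names, the @@NAME@@ placeholder syntax and the sub name render_r_script are all hypothetical, and the R template assumes a whitespace-separated input file with a header row naming the columns time, status and a grouping column (x by default, as in the R one-liner further down the thread).

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# R template; the @@NAME@@ placeholders are filled in by render_r_script().
my $TEMPLATE = <<'END_R';
library(survival)
d   <- read.table("@@INPUT@@", header = TRUE)
fit <- survfit(Surv(time, status) ~ @@GROUP@@, data = d)
png("@@PLOT@@")
plot(fit)
dev.off()
END_R

# Fill the template from a hash of options; dies on an unknown placeholder.
sub render_r_script {
    my (%opt) = @_;
    my $script = $TEMPLATE;
    $script =~ s{\@\@(\w+)\@\@}{ $opt{lc $1} // die "no value for $1\n" }ge;
    return $script;
}

# Standard CLI: parse options, print the generated R script to STDOUT.
sub main {
    my @argv = @_;
    my %opt  = (input => 'data.txt', group => 'x', plot => 'km.png');
    GetOptionsFromArray(\@argv, \%opt, 'input=s', 'group=s', 'plot=s')
        or die "usage: $0 [--input FILE] [--group COLUMN] [--plot FILE]\n";
    print render_r_script(%opt);
}

main(@ARGV) unless caller;
```

    Saved under a hypothetical name such as km_wrap.pl, it would be used as `perl km_wrap.pl --input wasps.txt > km.R` followed by `Rscript km.R`. The sketch does no quoting of the substituted values, so file names containing double quotes would break the generated script.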

    In doing the above I found the best of both worlds. Although I would love to have the wealth of CRAN in Perl, it's not "shameful" to use the right tool for the job. And always think big: the project may grow well beyond the language's limits.

    15min Edit: an added bonus of working in R is that there is a quite good and easy framework for parallelising things. All my scripts have a -p num-threads CLI option.

    bw, bliako

Re: Any perlish hints about Kaplan-Meier Estimator?
by aitap (Curate) on Nov 04, 2021 at 17:49 UTC
    "son is forced to observe"
    I'm sorry to hear that. What level of education is that establishment supposed to provide? What would the son prefer to do instead?
    "This is currently done more or less in R, which is considered a pain in the ass."

    There are probably faster ways of easing the pain in the ass than rewriting everything in Perl. After all, the math involved is complicated, so re-implementing it from scratch is likely to result in a lot of bugs, whereas it has already been done by a well-known member of the community and is relied upon by a lot of people. R is a bit confusing after Perl because of its pass-by-value nature and generics-based OOP, but it can be written in all kinds of paradigms, and the CRAN code quality is really high, especially for packages reviewed by ROpenSci.

    The end result is that the desired plot could be a one-liner in R, depending on the format of the available data and one package being installed: plot(survival::survfit(Surv(time, status) ~ x, data = read.table('data.txt')))

    Care to share your frustrations? Or would you like to send a mail to the R-help mailing list? It's mostly friendly, like PerlMonks.

      Thanks. Very nice. I guess I begin to see it clearly now. Probably I’m a helicopter father. Best regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

Re: Any perlish hints about Kaplan-Meier Estimator?
by perlfan (Vicar) on Nov 05, 2021 at 05:34 UTC
    I think it's cool to see such reaching questions, but in case you get no answers, there are 2 recommendations I have:

    • PDL mailing list (see users list) at http://pdl.perl.org/?page=mailing-lists
    • #pdl on irc.perl.org
    Gödel Luck!

      Thanks. BTW, this stuff is complicated. Holy Joe! I'm glad that I don't need to mess around with it. See here. Regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

Node Type: perlquestion [id://11138400]
Approved by Corion