http://qs1969.pair.com?node_id=741894

I am trying to combat an image problem -- in others' heads and, consequently, also in my own: using Perl for science.

Even with PDL and the tons of modules on CPAN that do science-y stuff, a very non-scientific scan of the net suggests that, besides C++ and Java, numpy and scipy get more mindshare. Heck, even PHP and Ruby are getting picked up.

I was recently talking to someone who works for a company making a triple-store database. The rdfstore comes with bindings for Java, Python, C#, Lisp and even Ruby. When I asked him about Perl, he said,

db-guy: "Perl... that would be weird."
me: "Why would that be weird?"
db-guy: "Well, Perl is used for text and strings... it is not really used for programming scientific applications"
me: "and, you think Python and Ruby are better at scientific applications?"
db-guy: "uh huh... I haven't really used Perl or Python or Ruby. It seems that Perl is used more for CGI"

and so it goes.

Recently I was following up on MachetEC2... http://forums.flowingdata.com/topic/machetec2-open-visualization-big-data-toolkit-on-amazon-ec2

once again, no mention of Perl.

As a Perl lover, I feel neglected, but realize that pouting is not useful. Unfortunately, I am not good enough with Perl to make my own bindings and release them. So, I am doing the least I can do -- come to the monks and kvetch.

Now I feel better, and am going back to learning PDL.

--

when small people start casting long shadows, it is time to go to bed

Re: Perl for science
by swampyankee (Parson) on Feb 06, 2009 at 15:09 UTC

    Kvetching can be useful for relieving one's stress ;-)

    Leaving out the comparison to C++ (and kvetching about your ignoring Fortran, which is still used for numerically-intensive programming), I suspect that Java benefits from its supposed (and non-existent) perfect portability and its general similarity to C++, while Ruby has benefited from The Art of Computational Science by Piet Hut and Jun Makino. Perl certainly does suffer because of its history of use in string manipulation and system administration.

    The bit about sysadmins is probably a larger factor than many scientific or engineering software developers will admit. In most of the engineering environments in which I've worked, the sysadmins have been felt to be most concerned with keeping people with domain knowledge away from the computers. Since sysadmins are viewed as obstructions, their tools are going to be deprecated.


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

Re: Perl for science
by kennethk (Abbot) on Feb 06, 2009 at 15:11 UTC

    Next time you talk to db-guy, you can point out bioperl. It is an open-source tool for bioinformatics and the source of many questions on PM. It leverages exactly Perl's strengths for processing genetic information - strings of [ATGC].

    On a more abstract level, I'm curious how you are defining "science", since almost all the apps I know in science are written in either Fortran or C. If you are doing any kind of serious numerical computation, you would be hard pressed to get the kind of speed you can get out of well-optimized Fortran code, plus there's the legacy effect. I personally am trying to leverage Perl's and Fortran's strengths (they are very complementary) toward my own work, though $work is getting in the way there.

      Responding to both swampyankee and kennethk: there is little point in talking to the db-guy about bioperl, and as for the art of computational science, we have the brilliant, if in need of an update, wolf book.

      I guess an image problem is countered typically with a campaign. A consolidated campaign of the http://perlbuzz.com kind but directed to science would be a good way.

      By the way, I am not attempting to define or pigeon-hole science here. In my world, science is image and geographic analysis and data visualization. Perl can and does help in both, and can work in conjunction with better tools such as Processing (for visualization). Actually, come to think of it, a Perl module that interfaces with Processing would really make me smile.

      Thanks for the responses though. I already feel a bit better.

      --

      when small people start casting long shadows, it is time to go to bed

      Speaking as a bio and Perl person, bioperl is usually an overengineered waste of time. As to your original question, the main factor is usually what your colleagues use. The second question is usually how easily the language interfaces with C, FORTRAN, and the shell. Biology is text-intensive, especially in its scripty parts, so Perl is a good language for it, but even with PDL, it can't compete with Octave for numerical stuff.
Re: Perl for science
by hda (Chaplain) on Feb 08, 2009 at 15:15 UTC
    Punkish, I fully understand your situation, and seemingly so do many others on this forum. I am a scientist and migrated from FORTRAN to Perl some years ago. I use Perl for almost everything, from rearranging text files to some heavy calculations, the latter almost exclusively with PDL. My work involves lots of cartography and geographic calculations (coordinates and so on). In my opinion, and without getting into useless comparisons with other languages, Perl offers an amazing variety of modules and programming structures that greatly facilitate scientific work in almost any aspect. Of course, this is only once you get used to the language, but this applies to any other language or tool, doesn't it?

    In my opinion, your despair has to do with the fact that it is always difficult, and sometimes futile, to combat misconceptions fixed in some people's minds. The biggest enemy of scientists is not ignorance, but narrow-mindedness. So, do not lose time and energy on those who are closed in their beliefs (this might also apply to other aspects of life).

    Go on with Perl; your example, if good, will suffice to show others the power of the camel!

      I'm a biologist and I use R and Perl for statistical computing. I think that R would have something to offer for people working with e.g. geographic calculations. Check www.r-project.org or http://addictedtor.free.fr/graphiques/allgraph.php?sort=votes (for graphs)
        There are Perl modules to assist in working with R, such as R::Writer.

        CountZero

        "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Perl for science
by dhruv (Initiate) on Feb 13, 2009 at 03:10 UTC

    I'm one of the creators of machetEC2, the Amazon Machine Image designed for working with data on Amazon's EC2 platform.

    machetEC2 comes with Perl but no Perl modules. The reason for this is simple: I'm not a Perl hacker and I didn't know which ones to include!

    @punkish (or anyone else), I'd LOVE to get an email with a list of the best packages/modules Perl has for doing science. We'll bundle them into the next version of machetEC2. (If you're feeling lazy, just leave a list of them in the comments to this node -- I'll check it later...)

    Even better: check out the build scripts for machetEC2 at the Infochimps GitHub page and add whatever packages you think are necessary to config/packages.yaml. Send Infochimps a pull request and we'll include your changes!

      These are the ones that I've used for science:

      PerlMol (Chemistry)
      BioPerl (Bioinformatics)
      PDL (Numerical computation)
      AI::Genetic::Pro (Genetic Algorithm)

      I haven't been able to build it yet, but I'm also looking forward to using Math::GSL, a Perl wrapper for the Gnu Scientific Library. It looks very promising.
      Update: I just did. Apparently, the module doesn't like libgsl 1.10 (a lower or higher version will do).

        @bruno, thanks! I'll put those into the next build of machetEC2.

        Any other suggestions?

Re: Perl for science
by why_bird (Pilgrim) on Feb 13, 2009 at 12:05 UTC

    I have to say, I am probably one of your 'target audience'. I'd like to think I'm open to change so I'd like to hear more specifics about why or perhaps where perl fits in as a scientific application.

    As a physics undergrad, I was taught Fortran (well, a bit) and supervisors, friends and profs used Fortran or C for numerical simulations. At $work, I learned perl, mostly because I was getting p****d off at shell scripting! But I'm glad I did because I came to like perl in its own right. However, when writing numerical simulations at work, I (have to) use C or C++.

    My boyfriend (working towards a PhD in image forensics) swears by Python for its numpy and scipy libraries, but again writes most of his processing-intensive code in C.

    So the perception is definitely there in my mind. I'm your perfect stereotype---Perl is great for its ease of use, flexibility and text processing capabilities, but C/C++ and python are the way to go for science, at least that's the impression I've always been given by the people around me. So what is it specifically that makes perl suitable/better for these applications? Or what needs doing to make perl better? Writing libraries? Or do you think it's only a matter of perception?

    why_bird
    ........
    Those are my principles. If you don't like them I have others.
    -- Groucho Marx
    .......

      I'd like to hear more specifics about why or perhaps where perl fits in as a scientific application

      In a nutshell, if you extend Perl (with XS) to incorporate that Fortran/C/C++ code (as do PDL, Math::Pari, Math::GSL and Math::GMP, for example), then you end up with something that's as easy to use as Perl, but performs its tasks as quickly as Fortran/C/C++.
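
      To make that concrete, here is a minimal sketch of my own (it is not taken from any of the modules named above). It uses List::Util, a core module whose sum() is normally provided by compiled XS code, and compares it with the same summation written as a plain Perl loop. The data and variable names are made up for illustration, and the timings will differ from machine to machine.

```perl
use strict;
use warnings;
use List::Util qw(sum);     # core module; sum() is normally backed by XS (C) code
use Time::HiRes qw(time);   # sub-second wall-clock timer, also core

# Illustrative data: one million numbers (0.5, 1.0, 1.5, ...).
my @data = map { $_ * 0.5 } 1 .. 1_000_000;

# Plain Perl loop: every addition is dispatched by the interpreter.
my $t0 = time;
my $loop_total = 0;
$loop_total += $_ for @data;
my $loop_secs = time - $t0;

# XS-backed call: still one line of Perl, but the loop itself runs in C.
$t0 = time;
my $xs_total = sum(@data);
my $xs_secs = time - $t0;

printf "loop: %.4fs  sum(): %.4fs  same result: %s\n",
    $loop_secs, $xs_secs, ($loop_total == $xs_total ? "yes" : "no");
```

      PDL, Math::GSL and friends apply the same idea on a larger scale: the Perl side stays short and readable while the inner loops live in compiled code.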

      Cheers,
      Rob

      So what is it specifically that makes perl suitable/better for these applications?

      Libraries. Modules. CPAN. I'm not going to say that Perl is better than any other niche language in science. But I am willing to argue that it is no worse than python/java/ruby/whatnot for general-purpose scientific computing.

      The proof for this is my personal experience. I am not married to Perl in any way. My data is more important than my favorite language. So I said to myself: the day I feel held back by Perl, I'll leave it (albeit temporarily) for the next better tool. And even though I've only been coding for a couple of years, that day has yet to come, and I've used a wide variety of data-crunching modules.