Devshed has published an interesting article written by a fellow monk.


The article definitely has a PerlMonks flavor (it is somewhat rooted in the site).
It seems a perfect illustration of how to use the right tools for the job: plenty of module usage, with examples.
I enjoyed it, as I am sure many of you will.

Update:
I was so pleased to see this abundance of modules that I failed to notice that use strict was missing!
My editor adds it automatically in my new scripts, but it would be better to have it in the first place. Thanks broquaint.

Re: article on 'data mining with Perl'
by broquaint (Abbot) on Mar 07, 2002 at 12:09 UTC
    However, the concluding code
    a) does not use strict, and
    b) does not use warnings or even -w.
    While it does make great use of modules (yay!), its coding practice leaves a little to be desired :-/
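
To make the point concrete, here is a minimal sketch (not taken from the article) of the kind of mistake strict catches at compile time. The helper name strict_error and both snippets are made up for illustration; only core Perl is used.

```perl
use strict;
use warnings;

# Compile a snippet under strict and warnings.
# Returns the compile error, or an empty string if the snippet is clean.
# (strict_error is a hypothetical helper written for this example.)
sub strict_error {
    my ($code) = @_;
    my $ok = eval "use strict; use warnings; $code; 1";
    return $ok ? '' : $@;
}

# Same two statements; the first misspells the variable on assignment.
my $typo = 'my $count = 1; $cuont = 2;';
my $fine = 'my $count = 1; $count = 2;';

print "typo: ", ( strict_error($typo) ? "rejected" : "accepted" ), "\n";
print "fine: ", ( strict_error($fine) ? "rejected" : "accepted" ), "\n";
```

Without strict, the misspelled assignment would quietly create a new global variable and the script would carry on with the wrong value.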

    broquaint

Re: article on 'data mining with Perl'
by jlf (Scribe) on Mar 07, 2002 at 21:16 UTC
    I have used LWP::Simple to get and parse web pages in the past, but I haven't yet learned what tools would be appropriate to access some of the more interesting (to me) data out there. I'm referring to web sites that require login before the data of interest is available. For example: real-time stock quotes from brokerage accounts, automated retrieval of web-based email, automated retrieval of online banking transactions, etc.

    The answer to this is undoubtedly RTFM, but any pointers to particular documentation would be great.

    Josh

      Read lwpcook, which comes with perl 5.6.1 or can be found at the end of a Google search. It has examples of various neato things you can do with LWP::UserAgent and its related friends.
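
As a rough sketch of the login case asked about above, assuming the site uses a plain HTML form and a session cookie: the helper names make_agent and fetch_after_login, and whatever URLs and form fields you pass in, are hypothetical. Sites that log in through JavaScript or need HTTPS support installed separately will need more than this.

```perl
use strict;
use warnings;

# Build a user agent that keeps cookies in memory between requests.
# (make_agent is a hypothetical helper; LWP::UserAgent is from libwww-perl.)
sub make_agent {
    require LWP::UserAgent;
    return LWP::UserAgent->new( cookie_jar => {} );
}

# Log in with a form POST, then fetch a page that needs the session cookie.
# $form is a hash ref of form field names and values for the login form.
sub fetch_after_login {
    my ( $ua, $login_url, $form, $data_url ) = @_;

    my $login = $ua->post( $login_url, $form );

    # Many sites answer a successful login with a redirect (3xx), and
    # LWP does not follow redirects for POST requests by default.
    die "login failed: ", $login->status_line, "\n"
        unless $login->is_success || $login->is_redirect;

    # The cookie set during login is sent automatically on this request.
    my $page = $ua->get($data_url);
    die "fetch failed: ", $page->status_line, "\n"
        unless $page->is_success;

    return $page->decoded_content;
}
```

A call would look like fetch_after_login( make_agent(), $login_url, { user => $user, pass => $pass }, $data_url ), with the field names taken from the site's actual login form.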

      -Any sufficiently advanced technology is
      indistinguishable from doubletalk.

        Thanks HZ -- I bet that an evening spent with lwpcook and the HTTP RFC will be just the ticket.

        Josh

Re: article on 'data mining with Perl'
by Anonymous Monk on Mar 07, 2002 at 18:44 UTC
    The title is misleading. It should be "web mining" and not "data mining". Data mining usually refers to the practice of extracting meaningful information from data stored in databases (like customers' buying patterns, etc.). This article, on the other hand, talks about downloading and extracting content from web pages. On a superficial level they might be similar, but the intent is different.

Re: article on 'data mining with Perl'
by mojotoad (Monsignor) on Mar 05, 2003 at 19:31 UTC
    The author fails to mention one of the more useful features of HTML::TableExtract -- extracting by column headers, as opposed to depths and counts. Flexibility good, brittle bad. (not to mention, headers are the main reason gridmapping is valuable)
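
A small sketch of what extracting by headers looks like, assuming HTML::TableExtract is installed from CPAN. The helper rows_by_headers and the sample tables are made up for illustration; the two tables hold the same quote with the columns in a different order, which is the case where depth and count alone would hand back the wrong cells.

```perl
use strict;
use warnings;
use HTML::TableExtract;    # CPAN module, not core

# Return the rows of every table whose header row matches all of @headers,
# keeping only those columns, in the order the headers were given.
# (rows_by_headers is a hypothetical helper written for this example.)
sub rows_by_headers {
    my ( $html, @headers ) = @_;
    my $te = HTML::TableExtract->new( headers => \@headers );
    $te->parse($html);
    return map { $_->rows } $te->tables;
}

# Made-up sample data: the same quote before and after a site redesign
# that moved the columns around and added one.
my $old_layout = '<table><tr><th>Symbol</th><th>Date</th><th>Price</th></tr>'
               . '<tr><td>ACME</td><td>2002-03-07</td><td>12.50</td></tr></table>';
my $new_layout = '<table><tr><th>Price</th><th>Volume</th><th>Date</th></tr>'
               . '<tr><td>12.50</td><td>9000</td><td>2002-03-07</td></tr></table>';

for my $html ( $old_layout, $new_layout ) {
    for my $row ( rows_by_headers( $html, 'Date', 'Price' ) ) {
        print join( ', ', map { defined $_ ? $_ : '' } @$row ), "\n";
    }
}
```

The same call works on both layouts because the columns are picked by what their headers say, not by where they sit.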

    ;)

    Matt