Shweta has asked for the wisdom of the Perl Monks concerning the following question:

I want to extract a large amount of data from Gramene, a public website containing information on grains. The data sits in fixed fields spread across multiple pages. Can I do this programmatically?

Replies are listed 'Best First'.
Re: extract data from website
by moritz (Cardinal) on Sep 01, 2008 at 10:05 UTC
    Why scrape when you can simply download it?

    (If you want to extract it anyway, unpack will help you with fixed-width data.)
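
    A minimal sketch of the unpack approach, assuming a hypothetical record layout (10-character ID, 20-character name, 5-character chromosome). The widths, field names and sample record are made up for illustration; the real layout has to come from the file you actually download.

    ```perl
    use strict;
    use warnings;

    # Hypothetical layout: 10-char ID, 20-char name, 5-char chromosome.
    # 'A' reads a space-padded string and strips the trailing padding.
    my $template = 'A10 A20 A5';

    # Split one fixed-width line into a hash of named fields.
    sub parse_record {
        my ($line) = @_;
        my ($id, $name, $chromosome) = unpack $template, $line;
        return { id => $id, name => $name, chromosome => $chromosome };
    }

    # Build a sample line with the same widths (invented data).
    my $line = sprintf '%-10s%-20s%-5s', 'GR0001', 'Oryza sativa', '3';
    my $rec  = parse_record($line);
    print "$rec->{id} | $rec->{name} | $rec->{chromosome}\n";
    ```

    In a real script you would call parse_record on each line read from the downloaded file instead of on a sample string.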

      Good idea, but I cannot seem to get the FTP site working through Firefox.

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        It responds a bit slowly for me, but works fine with both Firefox and wget.

        Maybe your settings for anonymous ftp login are incompatible with that site?

Re: extract data from website
by CountZero (Bishop) on Sep 01, 2008 at 12:10 UTC
    Could you give the URL of (one of) the page(s) you are trying to extract data from? It would help us give you some useful pointers on how to tackle this problem.

    CountZero

Re: extract data from website
by Anonymous Monk on Sep 01, 2008 at 10:05 UTC

    Yes, you can.

Re: extract data from website
by shekarkcb (Beadle) on Sep 01, 2008 at 12:02 UTC

    I think a suitable module is WWW::Mechanize.
    With it you can scrape the pages and get your job done easily.
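
    A rough sketch of how that might look, not a tested Gramene scraper. The assumptions are mine: the page URLs are passed on the command line, each page shows its fields as "Label: value" lines once the markup is stripped, and HTML::TreeBuilder is installed (WWW::Mechanize needs it for the text-only view). The parse_fields helper is a hypothetical name.

    ```perl
    use strict;
    use warnings;

    # Hypothetical input: the page URLs to fetch, given on the command line.
    my @urls = @ARGV;

    # Collect "Label: value" lines from plain text into a hash.
    # The line format is an assumption about how the pages render.
    sub parse_fields {
        my ($text) = @_;
        my %record;
        while ($text =~ /^[ \t]*([\w ]+?)[ \t]*:[ \t]*(.+?)[ \t]*$/mg) {
            $record{$1} = $2;
        }
        return \%record;
    }

    if (@urls) {
        # Loaded only when there is something to fetch.
        require WWW::Mechanize;
        my $mech = WWW::Mechanize->new( autocheck => 1 );  # die on HTTP errors

        for my $url (@urls) {
            $mech->get($url);
            # Text-only view of the page; needs HTML::TreeBuilder.
            my $fields = parse_fields( $mech->content( format => 'text' ) );
            print join( "\t", $url,
                map { "$_=$fields->{$_}" } sort keys %$fields ), "\n";
        }
    }
    ```

    If the fields live in HTML tables rather than labelled lines, the regex would need to be replaced by something that understands the table markup.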

    Thanks
    ShekarKCB