Shuzaku has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks. I am using WWW::Mechanize to fetch a web page with $m->get($url). Everything works well with test pages (I can get content from this server). But when I try to fetch a 10MB text file of around 150,000 lines from the same server, the script never finishes (there is no error message) and I have to restart my console. (When I put the 10MB text file on my local machine, everything works correctly.) How can I get these 150,000 lines? In fact, I only need the last 10,000 lines of this file, which is updated in real time. Thank you!

Re: Get content of a big text file on a server
by BrowserUk (Patriarch) on Aug 11, 2009 at 16:34 UTC

    If the server is configured for range requests, and if you can estimate where the 10,000 lines you want will start, then you can avoid downloading the whole file and fetch just the part you need, which will be much faster.
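    One way to check the first condition is a HEAD request that looks for an Accept-Ranges header. This is only a rough sketch: the helper name supports_ranges is made up for illustration, the URL is taken from the command line, and some servers honour ranges without advertising them, so a negative answer is not conclusive.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: true if the response advertises byte-range support.
sub supports_ranges {
    my ($res) = @_;
    my $accept = $res->header('Accept-Ranges') // '';
    return $accept =~ /\bbytes\b/i ? 1 : 0;
}

# Usage: perl check_ranges.pl <url>
if (@ARGV) {
    my $ua  = LWP::UserAgent->new( timeout => 30 );
    my $res = $ua->head( $ARGV[0] );    # headers only, no body is downloaded
    die 'HEAD failed: ' . $res->status_line unless $res->is_success;

    printf "Size: %s bytes, range requests: %s\n",
        $res->header('Content-Length') // 'unknown',
        supports_ranges($res) ? 'advertised' : 'not advertised';
}
```

    The Content-Length it prints also gives you the file size to base your estimate on.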

    For example, you say that the file is 10MB and 150,000 lines, and you want just the last 10,000. Assuming reasonably equal-length lines, they will average about 70 chars per line, so all you want is the last 700KB or so, which is just 7% of the whole thing.

    Using LWP::UserAgent you can do a range request something like this:

    $ua->get( $url, Range => 'bytes=-1000000' );

    That will fetch the last million bytes of the file, which you can read backward to get your 10,000 lines.
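    Put together, it might look like the sketch below. The sub names last_lines and fetch_tail, the 60-second timeout and the command-line URL are assumptions for illustration, not anything from the original script. It drops the first line of the chunk when the range started mid-file, since that line is probably cut in half, and it checks for a 206 status because a server that ignores the Range header will answer 200 and send the whole file.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Return the last $want lines of $chunk. If the chunk began mid-file,
# its first line is probably incomplete, so discard it.
sub last_lines {
    my ( $chunk, $want, $partial_start ) = @_;
    my @lines = split /\n/, $chunk;
    shift @lines if $partial_start && @lines;
    splice @lines, 0, @lines - $want if @lines > $want;
    return @lines;
}

# Fetch the final $bytes bytes of $url and return its last $want lines.
sub fetch_tail {
    my ( $url, $bytes, $want ) = @_;
    my $ua  = LWP::UserAgent->new( timeout => 60 );
    my $res = $ua->get( $url, Range => "bytes=-$bytes" );
    die 'Request failed: ' . $res->status_line unless $res->is_success;

    # 206 with a non-zero start offset means we received a slice that
    # begins somewhere inside the file; 200 means the whole file came back.
    my ($start) = ( $res->header('Content-Range') // '' ) =~ /^bytes (\d+)-/;
    my $partial = $res->code == 206 && $start;

    return last_lines( $res->content, $want, $partial );
}

# Usage: perl tail_remote.pl <url>
if (@ARGV) {
    print "$_\n" for fetch_tail( $ARGV[0], 1_000_000, 10_000 );
}
```

    If fewer than 10,000 lines come back, the lines are longer than estimated and the byte count needs raising.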


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Awesome! This is very fast and efficient! Thanks a lot.
Re: Get content of a big text file on a server
by mzedeler (Pilgrim) on Aug 11, 2009 at 14:36 UTC
      Exactly the same problem when using that (it works with a small text file but not with the big one). Could it be because the big text file doesn't have the ".txt" extension (it has no extension)?
        Correction to my last post: it works, after an execution time of 5 minutes! I will see what I can do to improve this execution time. Thanks for everything!