This is really a question about web servers, but it relates to Perl so I know you'll forgive me.

I have a Perl script which goes off to various websites and aggregates content, suitably HTML-munged, for the Palm Pilot.

Another script, over which I have no control, then hits on the page created and downloads it to the Palm (AvantGo, for those who know).

I was doing it in this way (pseudocode):

for each page-I-want-to-get {
    get it using LWP::Simple, or skip if error
    munge the HTML
    write it out to a file
}
then write out a new "index.html" file with links to all the files written in the previous step

I would manually run the script that did all that, and then the other script would come by and fetch the "index.html" file it created, which is inefficient. So I rewrote it to get the HTML, write the pages and output the links to them, all in one go.

Then the other script started saying that my script was "idle too long" and timing out when accessing it.

So I wrote it another way:

for each page-I-want-to-get {
    check it's up using LWP::Simple's head(), or skip if error
}
output a new "index.cgi" page with links to all the files which are *about* to be created from the pages which passed the test above
then go and actually get the pages' HTML, munge it and write the files
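In rough Perl, that second version might look something like the sketch below. It is not my actual script: build_digest and munge are made-up names, the page file names are invented, and the two code refs stand in for LWP::Simple's head() and get() so that it runs without a network connection.

```perl
use strict;
use warnings;

# Hypothetical helper. $is_up and $fetch are code refs standing in for
# LWP::Simple's head() and get(); $out is the handle the index goes to.
sub build_digest {
    my ($urls, $is_up, $fetch, $out) = @_;

    # Pass 1: cheap availability check; skip anything that fails.
    my @live = grep { $is_up->($_) } @$urls;

    # Emit the index right away, linking to files that do not exist yet.
    print {$out} "Content-type: text/html\n\n<html><body>\n";
    for my $i (0 .. $#live) {
        print {$out} qq{<a href="page$i.html">$live[$i]</a><br>\n};
    }
    print {$out} "</body></html>\n";

    # Pass 2: the slow part, done only after the index has been printed.
    my %pages;
    for my $i (0 .. $#live) {
        my $html = $fetch->($live[$i]);
        next unless defined $html;    # get() returns undef on failure
        $pages{"page$i.html"} = munge($html);
    }
    return \%pages;    # the real script would write these out to files
}

# Placeholder for the real HTML munging; this only strips tags crudely.
sub munge {
    my ($html) = @_;
    $html =~ s/<[^>]+>//g;
    return $html;
}

# Stand-in data instead of real websites (assumed URLs, not real targets).
my %fake_site = (
    'http://example.com/a' => '<p>Page A</p>',
    'http://example.com/b' => undef,    # simulates a site that is down
);
my $pages = build_digest(
    [ sort keys %fake_site ],
    sub { defined $fake_site{ $_[0] } },
    sub { $fake_site{ $_[0] } },
    \*STDOUT,
);
```

Note that the index can end up linking to a page whose later fetch fails, since the links are printed before any content has been retrieved.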

Now the other script was happy and my script wasn't timing out on it.

Then I thought -- that's not logical. My script takes the same time to run, in fact more. Just because it stops outputting text/html to the other script at an earlier point, doesn't mean it actually takes less time to run.

How does a web server "know" when a CGI script is finished, in other words? It surely isn't just when it sends

print '</body></html>';

And what's the effect of putting $| = 0; at the top of my script? I thought that would have it output the header and top of the page, just to "keep the other script interested" kind of thing, before it output the main parts of the content, but that didn't seem to help either.
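For reference, here is a minimal sketch of what $| controls, going by the documented behaviour: 0 is the default (output is buffered), and a non-zero value forces a flush after every print on the currently selected handle. The helper name bytes_on_disk_after_print is made up for illustration, and a temp file stands in for STDOUT so that the effect can be measured.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Handle;

# Hypothetical helper: returns how many bytes have actually reached the
# file after one small print, while the handle is still open.
sub bytes_on_disk_after_print {
    my ($autoflush) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    $fh->autoflush($autoflush);    # per-handle equivalent of setting $|
    print {$fh} "Content-type: text/html\n\n";
    my $size = (-s $path) || 0;    # size as the OS sees it right now
    close $fh;
    return $size;
}

printf "autoflush off (like \$| = 0): %d bytes visible\n",
    bytes_on_disk_after_print(0);
printf "autoflush on  (like \$| = 1): %d bytes visible\n",
    bytes_on_disk_after_print(1);
```

If that is right, $| = 0 at the top of a script changes nothing, and it would take $| = 1 to push the header out before the slow part begins.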

--
Weaselling out of things is important. It's what separates us from the animals ... except the weasel.


In reply to Perl Output/Web Server Question by Cody Pendant
