On Win32 I use LWP::Simple to fetch files from the web, but sometimes the file is too big and causes my script to run forever. If I cancel the script before it's finished, the temp file I am writing gets write-protected. If I then run the script again on a smaller file, it uses the data from the last session plus the new page's data. Does anyone have a solution for checking the file size first, before grabbing the file and putting it into a variable? Or, if the script is taking too long to process the data, a way to quit and unlink the temp file? I AM NEW TO PERL, so don't everybody laugh at one time. Here is my code:
use LWP::Simple;
if ($ENV{'QUERY_STRING'}) { $buffer = $ENV{'QUERY_STRING'}; }
else { read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'}); }
@pairs = split(/&/, $buffer);
foreach $pair (@pairs) {
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
$value =~ s/\n/ /g;
$FORM{$name} = $value;
}
print "Content-type: text/html\n\n";
$URL = $FORM{'url'};
$page = get($URL);
$page =~ s/\s+/ /g;
# I do more with the $page variable later, but I break it off here.
# Strip most of the HTML, <script>, <style> and punctuation.
# I think it's greedy. Any help? I'd prefer a solution without a Perl module.
$break = $page;
$break =~ tr/A-Z/a-z/;
$break =~ s/&nbsp\;/ /g;
$break =~ s/<s.*?<\/s.*?>//igs;
$break =~ s/&lt\;//igs;
$break =~ s/&gt\;//igs;
if(!($break)) { &error; &print_footer; exit; }
&print_main_header;
($text = $break) =~ s/<(\/|!)?[-.a-zA-Z0-9]*.*?>//g;
$text =~ s/[,.?':!"@#\$\%&*()_|\/\-=+\^~`\{\}\[\]\\]//g;
$text =~ s/\s+/ /g;
@text = split(/\s/, $text);
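One possible approach to the size and time limits asked about above, as a minimal sketch rather than a definitive fix: LWP::UserAgent (part of the same libwww-perl distribution as LWP::Simple) accepts `timeout` and `max_size` options, and a HEAD request can report the Content-Length before the body is downloaded. The names `$MAX_BYTES`, `$TIMEOUT`, `size_is_acceptable` and `fetch_limited` are hypothetical, and the limit values are assumptions to adjust.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $MAX_BYTES = 500_000;   # assumed upper limit on page size, in bytes
my $TIMEOUT   = 30;        # assumed inactivity timeout, in seconds

# Return true if the reported length is within the limit.
# An unknown length (no Content-Length header) is allowed here,
# because max_size still caps the actual download.
sub size_is_acceptable {
    my ($length, $max) = @_;
    return 1 unless defined $length && length $length;
    return $length <= $max ? 1 : 0;
}

# Fetch a URL, returning its content, or undef if the page is
# too big, truncated, or could not be fetched.
sub fetch_limited {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(
        timeout  => $TIMEOUT,
        max_size => $MAX_BYTES,
    );

    # Ask for headers only, and check the size the server reports.
    my $head = $ua->head($url);
    if ($head->is_success) {
        my $length = $head->header('Content-Length');
        return undef unless size_is_acceptable($length, $MAX_BYTES);
    }

    my $res = $ua->get($url);
    return undef unless $res->is_success;

    # LWP sets a Client-Aborted header when the body was cut off at max_size.
    return undef if $res->header('Client-Aborted');

    return $res->content;
}
```

In the script above, `$page = get($URL);` could then become `$page = fetch_limited($URL);`, and the existing `if(!($break))` check would catch the undef case. Note that `timeout` limits inactivity on the connection, not the total run time, and not every server sends a Content-Length, which is why the sketch also relies on `max_size`.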