moked has asked for the wisdom of the Perl Monks concerning the following question:
Hi Monks,
I'm using the following code to put a file on a remote server
that indicates whether I have new mail on my Exchange server. The script does exactly what it should, but it has one flaw: after running for about a day it consumes about 400MB of memory.
I've been trying to solve this issue, but so far without success.
The script runs on Windows XP SP3.
#!/usr/bin/perl
use WWW::Mechanize;
use HTTP::Cookies;
use Stream::Reader;
$url="https://mymail.company.com";
my $username = "XXXXXXXXXX";
my $password = "xxxxxxxxxx";
my $mechanize = WWW::Mechanize->new(autocheck => 1);
CHK_STRT:
$mechanize->cookie_jar(HTTP::Cookies->new());
$mechanize->credentials($username,$password);
$mechanize->get($url);
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/?Cmd=contents&Page=1");
my $page = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/Sub_dir_1/?Cmd=contents&Page=1");
my $Sub_1 = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/Sub_dir_2/?Cmd=contents&Page=1");
my $Sub_2 = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/sub_dir_3/?Cmd=contents&Page=1");
my $Sub_3 = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/sub_dir_4/?Cmd=contents&Page=1");
my $Sub_4 = $mechanize->content();
open(FH, ">school.txt") or die " Can't open school file\n";
binmode FH, ':utf8';
print FH $page;
print FH $Sub_1;
print FH $Sub_2;
print FH $Sub_3;
print FH $Sub_4;
close(FH);
my @substrings = (
'icon-msg-unread.gif'
);
my $handler;
open( $handler,'<','school.txt' ) or die "can't Reopen the file\n";
my $stream = Stream::Reader->new( $handler );
my $result = $stream->readto(\@substrings, {Mode => 'E'}); #This mode returns false
$emails = 1;
close $handler;
open(WR, ">announce.txt");
if( $result ) {
print WR "new\n"; $emails++;
} elsif( $stream->{Error} ) {
die "Fatal error during reading file!\n";
} else {
print WR "old\n";
}
close WR;
unlink('C:/MC/school.txt');
system 'ftp -s:ftpc ftp.server > Log.log';
unlink 'Log.log';
sleep(60);
goto CHK_STRT;
Thanks ahead,
Moked
Re: memory consumption
by bangers (Pilgrim) on Jul 07, 2009 at 09:47 UTC
As a suggestion, I'd scope your variables and create $mechanize on each iteration. This may help garbage collection free up the memory.
#!/usr/bin/perl
use WWW::Mechanize;
use HTTP::Cookies;
use Stream::Reader;
$url="https://mymail.company.com";
my $username = "XXXXXXXXXX";
my $password = "xxxxxxxxxx";
CHK_STRT:
{
my $mechanize = WWW::Mechanize->new(autocheck => 1);
$mechanize->cookie_jar(HTTP::Cookies->new());
$mechanize->credentials($username,$password);
$mechanize->get($url);
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/?Cmd=contents&Page=1");
my $page = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/Sub_dir_1/?Cmd=contents&Page=1");
my $Sub_1 = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/Sub_dir_2/?Cmd=contents&Page=1");
my $Sub_2 = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/sub_dir_3/?Cmd=contents&Page=1");
my $Sub_3 = $mechanize->content();
$mechanize->get("https://mymail.company.com/exchange/User/Inbox/sub_dir_4/?Cmd=contents&Page=1");
my $Sub_4 = $mechanize->content();
open(FH, ">school.txt") or die " Can't open school file\n";
binmode FH, ':utf8';
print FH $page;
print FH $Sub_1;
print FH $Sub_2;
print FH $Sub_3;
print FH $Sub_4;
close(FH);
my @substrings = (
'icon-msg-unread.gif'
);
my $handler;
open( $handler,'<','school.txt' ) or die "can't Reopen the file\n";
my $stream = Stream::Reader->new( $handler );
my $result = $stream->readto(\@substrings, {Mode => 'E'}); #This mode returns false
$emails = 1;
close $handler;
open(WR, ">announce.txt");
if( $result ) {
print WR "new\n"; $emails++;
} elsif( $stream->{Error} ) {
die "Fatal error during reading file!\n";
} else {
print WR "old\n";
}
close WR;
}
unlink('C:/MC/school.txt');
system 'ftp -s:ftpc ftp.server > Log.log';
unlink 'Log.log';
sleep(60);
goto CHK_STRT;
I haven't tried this code, but it may make a starting point.
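The scoping idea above can be checked in isolation. The following is a minimal sketch, not the poster's code: `Tracker` is a hypothetical stand-in for a heavyweight object such as a WWW::Mechanize instance, and `@alive_after_pass` is an illustrative log. It only shows that a lexical declared inside a block is destroyed when the block ends, on every pass.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a heavyweight object such as a
# WWW::Mechanize instance; it only counts how many objects are alive.
package Tracker;
our $alive = 0;
sub new     { my ($class) = @_; $alive++; return bless {}, $class }
sub DESTROY { $alive-- }

package main;

our @alive_after_pass;    # illustrative log, one entry per pass

for my $pass (1 .. 3) {
    {
        my $obj = Tracker->new;    # lexical: exists only inside this block
        # ... fetch pages, write files ...
    }
    # The block has ended, so the last reference to $obj is gone
    # and DESTROY has already run.
    push @alive_after_pass, $Tracker::alive;
}

print "live objects after each pass: @alive_after_pass\n";
```

Note that Perl generally returns freed memory to its own pool for reuse rather than to the operating system, so this should stop growth without necessarily shrinking the process.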
|
You can probably also use the back() method after/before each request. As far as I can tell, that should erase the last visited page from memory.
Re: memory consumption
by JavaFan (Canon) on Jul 07, 2009 at 10:37 UTC
Instead of sleeping, you may be better off running the script from cron (or whatever they have on Windows). That has the advantage that if the script dies (and you do have a couple of die statements - your program is clearly not written with durability in mind), a minute later, another instance tries again.
As for the memory leak, a classical solution for such daemons is to have them exec() themselves every once in a while. This doesn't work (easily) if the daemon needs to keep state, but your program doesn't. So for instance, you could keep a counter which you increment each loop, and once the counter goes over 500 (or some other number), instead of the goto, you do an exec $0;
Still, I'd go for the crontab solution.
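The counter-and-exec idea could look roughly like the sketch below. It rests on several assumptions: `check_mail`, `restart_due`, `run_passes` and the `--daemon` switch are hypothetical names, and it re-executes through `$^X` (the running perl binary) instead of a bare `exec $0`, so it does not depend on the script being directly executable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $MAX_PASSES = 500;    # restart threshold; the number is arbitrary

# Hypothetical placeholder for one mail check (fetch pages, write
# announce.txt, upload by FTP).
sub check_mail { return }

# True once the process has done enough passes to warrant a restart.
sub restart_due {
    my ($passes, $limit) = @_;
    return $passes >= $limit;
}

# Runs passes until a restart is due, then returns the pass count.
sub run_passes {
    my ($limit, $pause_seconds) = @_;
    my $passes = 0;
    until (restart_due($passes, $limit)) {
        check_mail();
        $passes++;
        sleep $pause_seconds if $pause_seconds;
    }
    return $passes;
}

# Only loop and re-exec when started with --daemon, so the file can
# also be loaded without side effects.
if (@ARGV && $ARGV[0] eq '--daemon') {
    run_passes($MAX_PASSES, 60);
    # Replace this process with a fresh copy of the same script,
    # keeping the original arguments.
    exec($^X, $0, @ARGV) or die "Cannot re-exec $0: $!\n";
}
```

Because the restarted process begins with a clean heap, whatever leaked during the previous 500 passes is discarded.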
|
Some notes:
Often, it is bad when two instances of a program run at the same time. So, if you run the script using "Scheduled Tasks" and this is a problem, check that the current instance is the only instance (which is another common problem).
Windows has no exec() system call, so that trick won't work on Windows. (Windows also lacks fork(). Another reason to stay away from Windows. ;-) Recent perl versions try to emulate both, but the emulation is far from complete, simply because Windows has no equivalent of those API calls.)
exec $0 removes all command line arguments. Often, you don't want that. exec($0,@ARGV) keeps them.
In any case, exec() removes all context your program had, it literally starts from the beginning. If you need some state information, you have to keep it outside of the process, e.g. in a file or in an environment variable.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
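The last two notes can be combined in a short sketch: save the state to a file, then restart with the original arguments. All names here (`mailcheck.state`, `save_state`, `load_state`, `restart_self`, `$emails_seen`) are hypothetical, and `$^X` is used to start the new perl explicitly; treat it as an illustration under those assumptions, not a tested recipe for Windows.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical state file; any path the script can write to will do.
my $STATE_FILE = 'mailcheck.state';

# Save a single line of state before the process replaces itself.
sub save_state {
    my ($file, $value) = @_;
    open my $out, '>', $file or die "Cannot write $file: $!\n";
    print {$out} "$value\n";
    close $out or die "Cannot close $file: $!\n";
    return;
}

# Load the saved state, or return a default on the first run.
sub load_state {
    my ($file, $default) = @_;
    open my $in, '<', $file or return $default;
    my $value = <$in>;
    close $in;
    return $default unless defined $value;
    chomp $value;
    return $value;
}

# Restart the script, keeping its command line arguments.
sub restart_self {
    my ($file, $value) = @_;
    save_state($file, $value);
    exec($^X, $0, @ARGV) or die "Cannot re-exec $0: $!\n";
}

my $emails_seen = load_state($STATE_FILE, 0);
print "starting with emails_seen=$emails_seen\n";
```

A restart would then be `restart_self($STATE_FILE, $emails_seen);` at the point where the loop decides it has run long enough.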
|
use Fcntl ':flock';
open my $me, $0 or die;
flock $me, LOCK_EX|LOCK_NB or exit;
near the beginning of your program usually does the trick.
Re: memory consumption
by Your Mother (Archbishop) on Jul 07, 2009 at 16:38 UTC
Extending what bangers and Corion said already, you can ask Mech not to keep a history if you need or prefer to have a single object (this was documented incorrectly in earlier versions, but I think it has worked for a long time):
$mech->stack_depth(0)
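For context, here is a minimal sketch of a single long-lived object with the history switched off at construction time. `stack_depth` is a documented WWW::Mechanize constructor option and accessor; `has_unread` is a hypothetical helper that stands in for the original file-and-Stream::Reader search, and no URLs are fetched unless you pass some in.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# stack_depth => 0 asks Mechanize to keep no page history, so earlier
# responses are not retained as new pages are fetched.
my $mech = WWW::Mechanize->new(
    autocheck   => 1,
    stack_depth => 0,
);

# Hypothetical helper: fetch each URL and report whether any page
# contains the marker for an unread message.
sub has_unread {
    my ($agent, @urls) = @_;
    for my $url (@urls) {
        $agent->get($url);
        return 1 if index($agent->content, 'icon-msg-unread.gif') >= 0;
    }
    return 0;
}

printf "page stack depth: %d\n", $mech->stack_depth;
```

With this in place the object can be created once outside the loop, and each pass only needs to call something like `has_unread($mech, @inbox_urls)`.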
Re: memory consumption
by missingthepoint (Friar) on Jul 08, 2009 at 10:06 UTC
This seems to be the most common problem with WWW::Mechanize... people leaving it running for long periods and running out of memory. I'm writing a tutorial for Mechanize which I'll post on PM soon - next few days hopefully. I'll include this in the 'troubleshooting' bit.
The zeroeth step in writing a module is to make sure that there isn't already a decent one in CPAN. (-- Pod::Simple::Subclassing)