Monkless has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I come seeking wisdom :) I have created the below script to essentially, pull some DB data and using that data make API calls to a webservice -- It seems this script exhausts memory/swap after running for 5 - 10 minutes. The DB call returns 10,000 devices that need to be deleted... I can't seem to find where the possible memory leak exists, could you tell me if this a job for Threads or if I should be looking else where(currently reveiwing Fork::ParallelManager)? I feel, this script should be able to knock this out no problem, but alas, this is not the case... It doesn't seem to matter how many threads I configure for MAX, 2-25 I have tried so far and each time this script ends with "Killed" Watching TOP on the system running, with each iteration I can see MEM usage creep up to 80-85% and SWAP nearly exhausted... As you can see, Im logging my work and I can see that Im able to get through roughly 1,000 deletes before the "Killed" message.. any ideas?!

#!/usr/bin/perl use strict; use warnings; use DBI; use Data::Dumper; use Getopt::Long; use threads; use LWP::UserAgent; my $version = '3.0'; my $build = '07302015'; my $debug = ""; my $maxThreads = '25'; my $countLimit = '0'; my $deleteLimit = '10000'; my $environment = "dev"; my $report_filenme = "deletedDevices"; my @row; my $deviceLogDIR; chomp (my $reportDate = `date +\%Y\%m\%d`); my @deletedDevices; my %active_threads = (); #Config Output Options my $output = CWM_configDelete->get_output(); my $date = "$output->{date}"; my $out_dir = "$output->{output_dir}"; chomp ($date); #Config Curl Options(Pulling these from a config.pm file I created as +we have multiple environments) my $api_user = CWM_configDelete->get_api(); my $api_creds = "$api_user->{user}:$api_user->{pass}"; my $get_env = "get_$environment"; my $cwm_cfg = CWM_configDelete->$get_env(); my $cwm_url = $cwm_cfg->{url}; my $cwm_host = $cwm_cfg->{host}; my $cwm_user = $cwm_cfg->{user}; my $cwm_pass = $cwm_cfg->{pass}; my $cwm_db = $cwm_cfg->{DB}; #CMD Line Options GetOptions ("l:s" => \$deleteLimit, "e:s" => \$environment, "t:s" => \$maxThreads, 'debug' => \$debug); #####MAIN##### #Connect to the DB my $dbh = DBI->connect("DBI:mysql:database=$cwm_db;host=$cwm_host","$c +wm_user","$cwm_pass", {'RaiseError' => 1}); #Prepare SELECT statement my $sth = $dbh->prepare("SELECT cmt.serialNumber FROM Device d JOIN Cw +mpManagedTrait cmt on (d.cwmpManagedTrait_id=cmt.id) WHERE cmt.lastin +form < 1000*UNIX_TIMESTAMP(DATE_SUB(now(),INTERVAL 90 DAY)) limit $de +leteLimit"); #Execute Select $sth->execute(); # THREADED ###Parse Fetch and execute DELETE while (@row = $sth->fetchrow_array ) { ###Create and Run threads $active_threads{@row} = threads->create (\&deleteDevice, @row); ###Limit running threads while(scalar threads->list(threads::running) >= $maxThreads) { foreach my $th_key(keys %active_threads) { next unless($active_threads{$th_key}->is_joinable()); $active_threads{$th_key}->join(); delete($active_threads{$th_key}); } sleep(1); } } ###Thread cleanup foreach (threads->list()) { $_->join(); } ### SUBS ### sub deleteDevice($) { my $deviceSN = shift; my $full_url = "https://$api_creds\@$cwm_url$deviceSN"; &debug("CMD: $full_url"); my $threadCount = scalar threads->list(threads::running); #print "Current Thread Count: $threadCount\r"; ## Get current subscriber info my $ua = LWP::UserAgent->new( #ssl_opts => { verify_hostname => 0, SSL_verify_mode => 0x +00 } ); $ua->agent("Device Deletion"); my $req = HTTP::Request->new(DELETE => $full_url); $req->content_type('application/x-www-form-urlencoded'); my $res = $ua->request($req); if($res->is_success) { &logDeviceDetails("$deviceSN -- Delete Successful"); } else { &logDeviceDetails("$deviceSN -- Delete Failed: " . $res->statu +s_line . " - " . $res->content); } } sub logDeviceDetails(@) { # Log Script actions my @deviceLogs = @_; chomp (my $TS = `date +\%H:\%M:\%S`); my $logoutFile = '/var/tmp/CWM/del_reports'; # Append File name $deviceLogDIR = "$logoutFile" . "/subDELreport" . '_' . "$reportDa +te"; # Create log file open FILE, ">>", "$deviceLogDIR" or die "Failed to open $deviceLo +gDIR: $!"; print FILE for "$TS: @deviceLogs\n"; #update file with edits made close FILE; } sub debug(@)#Spit Info { my @pass_data = @_; chomp (my $TS = `date +\%H:\%M:\%S`); if ($debug) { print for "DEBUG:$TS @pass_data\n"; } }

Replies are listed 'Best First'.
Re: Perl Threads
by BrowserUk (Patriarch) on Aug 13, 2015 at 20:01 UTC

    The first problem with your code is that it does not compile. You have an unbalanced paren in line 64.

    And you are calling a subroutine debug in line 47 which doesn't exist.

    And the hash $active_threads{@row} is never declared....along with:

    Global symbol "$deleteLimit" requires explicit package name at C:\test +\1138490.pl line 12. Global symbol "$environment" requires explicit package name at C:\test +\1138490.pl line 12. Global symbol "$maxThreads" requires explicit package name at C:\test\ +1138490.pl line 12. Global symbol "$debug" requires explicit package name at C:\test\11384 +90.pl line 12. Global symbol "$cwm_db" requires explicit package name at C:\test\1138 +490.pl line 15. Global symbol "$cwm_host" requires explicit package name at C:\test\11 +38490.pl line 15. Global symbol "$cwm_user" requires explicit package name at C:\test\11 +38490.pl line 15. Global symbol "$cwm_pass" requires explicit package name at C:\test\11 +38490.pl line 15. Global symbol "$deleteLimit" requires explicit package name at C:\test +\1138490.pl line 18. Global symbol "@row" requires explicit package name at C:\test\1138490 +.pl line 26. Global symbol "%active_threads" requires explicit package name at C:\t +est\1138490.pl line 28. Global symbol "@row" requires explicit package name at C:\test\1138490 +.pl line 28. Global symbol "@row" requires explicit package name at C:\test\1138490 +.pl line 28. Global symbol "$maxThreads" requires explicit package name at C:\test\ +1138490.pl line 31. Global symbol "%active_threads" requires explicit package name at C:\t +est\1138490.pl line 32. Global symbol "%active_threads" requires explicit package name at C:\t +est\1138490.pl line 33. Global symbol "%active_threads" requires explicit package name at C:\t +est\1138490.pl line 34. Global symbol "%active_threads" requires explicit package name at C:\t +est\1138490.pl line 35. Global symbol "$api_creds" requires explicit package name at C:\test\1 +138490.pl line 46. Global symbol "$cwm_url" requires explicit package name at C:\test\113 +8490.pl line 46. Global symbol "$ua" requires explicit package name at C:\test\1138490. +pl line 50. Global symbol "$ua" requires explicit package name at C:\test\1138490. +pl line 55. Global symbol "$deviceLogDIR" requires explicit package name at C:\tes +t\1138490.pl line 73. Global symbol "$reportDate" requires explicit package name at C:\test\ +1138490.pl line 73. Global symbol "$deviceLogDIR" requires explicit package name at C:\tes +t\1138490.pl line 76. Global symbol "$deviceLogDIR" requires explicit package name at C:\tes +t\1138490.pl line 76. Execution of C:\test\1138490.pl aborted due to compilation errors.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

      Hi Yes, I tried to trim it down and leave the relevant parts of the script, I will update with the missing pieces...

Re: Perl Threads
by BrowserUk (Patriarch) on Aug 13, 2015 at 23:33 UTC

    The problem is that you are creating all your threads; then joining them. Although you are limiting the number of running threads; because you aren't joining them; they are all hanging around in their non-running, joinable state; and that it what is consuming your memory.

    Given that you aren't retrieving anything from the threads when you join them, the simple solution is to avoid having to join them, by detaching them at the point of creation.

    Instead of this: $active_threads{ @row } = threads->create ( \&deleteDevice, @row );

    Do this: threads->create( \&deleteDevice, @row )->detach;

    That way the threads will clean themselves up automatically on termination.

    However, there is a problem with this; namely that once your main loop creating the threads finished, you can no longer use:

    foreach( threads->list() ) { $_->join(); }

    to ensure that the main thread doesn't end before the worker threads complete; and unfortunately, the authors of threads didn't provide a mechanism for waiting until all detached threads complete; nor even a way of checking if any detached threads currently exist.

    There are several approaches to dealing with this; some more complicated than others and most have their own sets of problems. The simplest way I know is to use my count-them-out-and-count-them-back method:

    use threads::shared; ## at the top ... my $running : shared = 0; ## before the main loop ... sleep 1 while $running; ## after the main loop in-place of the join lo +op sub deleteDevice($) { ++$running; .... # as now --$running; }

    Add that lot together and your memory problems should go away.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
    I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!
      Hmm, main thread is detached?
      $ perl -Mthreads -e " threads->detach " Thread already detached at -e line 1. $ perl -Mthreads -e " threads->self->join " Cannot join a detached thread at -e line 1.
        Hmm, main thread is detached?

        How could it be otherwise? What part of the process could join the main thread?

        The main thread, is the process. Its return value can only be retrieved by another process; in which context thread->join doesn't make any sense.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
        In the absence of evidence, opinion is indistinguishable from prejudice.
        I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

      So taking your suggestions above, I have updated my scipt as seen below and this does seem to resolve my memory issue - MUCH APPRECIATED!

      my $running : shared = 0; ###THREADED### ##Parse Fetch and execute DELETE while (@row = $sth->fetchrow_array ) { ##Create and Run threads ##Limit running threads sleep 2 while $running >= $maxThreads; threads->create (\&deleteDevice, @row)->detach; } ##Thread cleanup sleep 1 while $running > 0; sub deleteDevice($) { ++$running; my $deviceSN = shift; my $full_url = "https://$api_creds\@$cwm_url$deviceSN"; &debug("CMD: $full_url"); #my $threadCount = scalar threads->list(threads::running); print "Current Thread Count: $running\r"; # Get current subscriber info my $ua = LWP::UserAgent->new( ssl_opts => { verify_hostname => 0, SSL_verify_mode => 0x0 +0 } ); $ua->agent("Device Deletion"); my $req = HTTP::Request->new(DELETE => $full_url); $req->content_type('application/x-www-form-urlencoded'); my $res = $ua->request($req); if($res->is_success) { &logDeviceDetails("$deviceSN -- Delete Successful"); } else { &logDeviceDetails("$deviceSN -- Delete Failed: " . $res->statu +s_line . " - " . $res->content) } --$running; }