in reply to Re: Threads problem.
in thread Threads problem.

None the above. And a massive apology if I wasn’t clear.

In trying to find away to improve the speed of sizing the shares - On some servers I have it could take days – I was hoping to find away to at least fire few instances to speed things up. This is, rather than garbing one share at a time, and only move on to the next when the one in hand is done. Beside the size of the folder/share, I needed the number of files and folders also.

This process took twice as long as doing the same operation manually (right clicking the folder and selecting all and the properties). So I thought threads might come in handy.
I attempted to do some thing about it few weeks’ back but hit a brick wall, because of the nature of OLE that doesn’t allow re-entry and as many Monks here suggested, I abandoned the whole thing.

Then, I had an idea, instead of having the OLE operations performed within the main script (no re-entry!), then by calling another scripts where the OLE operations can be performed, I can over come this limitation. It worked. I managed to get at least 7 folders to be processed concurrently – huge improvement in comparison, the time it took to do all seven shares was as long as only doing the largest one out of those seven folders.

However, where the script failed is when I attempted to include the number of files and folder. Monk Particle in response to my other post suggested that I drop OLE and use File::Find::Rule, and thankfully a code was also provided.
I bench marked both approaches and they both about the same speed, but I feel using file::find::rule is little bit more reliable, and contains less code.

This about sums up my reasons for these attempts with multithreading techniques. I will try to thread more than one folder using file::find::rule to obtain folder stats, I hope this would improve things ,…even a little bit speed improvements would be great aid.

You also mention that you see few things that might improve the speed my the above code, I would be very grateful for you comments, Kind Sir.

Thanks

Replies are listed 'Best First'.
Re: Re: Re: Threads problem.
by BrowserUk (Patriarch) on Oct 07, 2003 at 01:03 UTC

    Try this.

    #! perl -slw use strict; use threads qw[ yield ]; use threads::shared; use Thread::Queue; use Win32::OLE qw[ in with ]; my $Qwork = new Thread::Queue; my $Qresults = new Thread::Queue; sub worker { my $fso = Win32::OLE->new( 'Scripting.FileSystemObject' ); Win32::OLE->Option( Warn => 0 ); while( my $folder = $Qwork->dequeue ) { last if $folder eq 'DIE_NOW'; my @folders = $fso->GetFolder( $folder ); my $size = $folders[0]->size() || 'Permission denied'; my( $cFiles, $cFolders ) = (0) x 2; while( my $sub = pop @folders ) { $cFiles += $sub->Files->Count || 0; $cFolders += $sub->SubFolders->Count || 0; for my $subsub ( in $sub->SubFolders ) { $cFolders += $subsub->Files->Count || 0; push @folders, $_ for in $subsub->SubFolders; $cFolders += $subsub->SubFolders->Count || 0; } } $Qresults->enqueue( "$folder - size: $size Files: $cFiles Fold +ers: $cFolders" ); yield unless $Qwork->pending; } undef $fso; return 1; } my @threads = map{ threads->new( \&worker ) } 1 .. @ARGV; $Qwork->enqueue( @ARGV ); yield until $Qresults->pending(); for( 1 .. @threads ) { print $Qresults->dequeue; } $Qwork->enqueue( 'DIE_NOW' ) for 1 .. @threads; $_->join for @threads;

    Supply the directorys/shares to be sized on the command line and it will print out lines like

    P:\test>296404 M:\ S:\ C:\test C:\test - size: 22582263 Files: 7611 Folders: 352 M:\ - size: 47353164 Files: 995 Folders: 1421 S:\ - size: 99468262 Files: 1994 Folders: 3437

    Not very neat, but all the info is there.

    Although you cannot share an instance of Win32::OLE between threads, you can safely use a separate instance on each thread.

    You might also look up the Win32::OLE->Option() class method and the Warn => n setting. This allowed me to do away with the eval wrappers you had around the OLE stuff which probably wasn;t helping your performance. Though the cost of the eval itself was probably minimal, avoiding the calls to Carp/Caller/stack trace generation is worth having.

    What it does is spawns a separate thread for each share to be scanned, posts the targets on a the work queue and waits for the results to come back via the results queue and then prints them out. This allows the scans of the separate machines (and the associated IOWaits) to be overlapped.

    The process consumes around .7MB of ram for each path with a 4 MB startup cost, so you should be good for a hundred scans simulataneaously. If you have more than that, it is an easy change to queue the first 100 and then queue another as each completes until your done.

    I wouldn't normally advocate creating anything like this number of threads, but in this case, as each thread is essentially just sitting in a IO block waiting on other machines to respond, it makes sense to overlap as much of that waiting as possible.

    Let me know how you get on please...


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

      Thanks for your reply,

      I am sure I am doing something silly here, but when I run the script and supplying c:\ as the only argument, I get this error in return;
      C:\Scripts>pm_threads_ex4.pl c:\ Free to wrong pool 1659fb0 not 15e02e0 during global destruction.
      I am not sure yet on how to proceed from this point on.

      Cheers.

        I think that was a bug in 5.8.0 that was fixed for 5.8.1. I will take a look later and try and find the reference at perl.org (I have to go now for an hour or so).

        If memory serves me correctly, the error only occurs as the program terminates and cleans up the threads, and doesn't actually effect the runtime operation of the program. I seem to recall resorting to adding POSIX::_exit() at the end of the program to avoid the problem occuring.

        I'll take another look later and try to confirm my memory.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail

        Despite my best efforts, I can't locate the patch that fixed the "Free to wrong pool" error during thread cleanup. I found this reference plus some others, but not the patch. I find it pretty difficult to find anything on the myriad perl.org lists, archives, news feeds and RT system. Maybe I just go about it the wrong way?

        Anyway, you can work around the problem by replacing the line

        $_->join for @threads;

        with

        use POSIX qw[_exit]; _exit 0;

        This simply bypasses the thread cleanup code in perl and leaves the OS to clean up the mess, which it does without the associated message and segfault.

        You could also upgrade to 5.8.1 but that means building perl, plus Win32::OLE yourself which is non-trivial, and the _exit() "fix" is simple and effective.

        HTH.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail

      Dear Friends,

      Ok this what I have done; it seems to work up to the point were I engage the threading routines. I get a hell load of Tk related errors - although, the TK side of thing works fine without the threading being involved!

      All my troubles are in the size_info sub - the Tk progress bit at the beginning, which also doesn’t work here but works fine on its own in another example script – therefore your help, enlightening comments (or code) and divine guidance are highly appreciated.



      Thanks in advance
      use strict; use warnings "all"; use diagnostics; use Win32::Lanman; use Win32::OLE qw[ in with ];; use Win32; use Tk; use Tk::ProgressBar; use threads qw[ yield ]; use threads::shared; use Thread::Queue; use POSIX qw[_exit]; use vars qw/%Data %Tk %App %threads @ShareLst/; my $Qwork = new Thread::Queue; my $Qresults = new Thread::Queue; my $max = $ARGV[0] || 20; sub syntax { $Data{script} = $0; my ( $line ) = "^" x length ( $Data{script} ); system ("CLS"); print <<EOT; User requested help or invalid operating option(s) were detected. $Data{script} $line syntax......lablabla EOT exit( ); } sub get_shares { $Data{count1} = 0; print "\n.....Please wait\n"; if (Win32::Lanman::NetShareEnum($Data{server}, \ my @EnumLst)) { for my $Share (@EnumLst) { next if ( ($Share->{path}) =~ /^c|^[d-z]:\\$/i); next if ( ($Share->{netname})=~ /ipc\$|rpc\$|netlogon\$|ad +min\$|^CDROM|^\w\$/i); $~ = "F1"; write; push (@ShareLst, $Share); format F1 = @<<<<<< @<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< +<<<<<<<<<<< ++$Data{count},$Share->{netname},$Share->{path} . } print "\nTotal number of shares = $Data{count} (out of $#EnumL +st)\n"; } else { Win32::MsgBox(Win32::Lanman::GetLastError(),31,"Share Access E +rror"); syntax(); } } sub prep_OLE { $App{row} =1; $App{col} = 1; $App{name} = 'Excel.Application'; $App{xls} = Win32::OLE -> GetActiveObject ( $App{name} ) || Win32: +:OLE -> new ( $App{name} ); die "\nCannot start $App{name}! : $!\n" unless $App{xls}; $App{xls}->{Visible} = 1; $App{book} = $App{xls} -> Workbooks -> Add ( ); $App{sheet} = $App{xls}-> Worksheets ( "Sheet1" ); $App{sheet}-> Columns ("A:A")->{ColumnWidth} = "20"; $App{sheet}-> Cells(1,1)->Font->{FontStyle}='bold'; $App{sheet}-> Cells(1,1)->{value}='Share Net Name'; $App{sheet}-> Columns ("B:B")->{ColumnWidth} = "35"; $App{sheet}-> Cells(1,2)->Font->{FontStyle}='bold'; $App{sheet}-> Cells(1,2)->{value}='Path'; $App{sheet}-> Columns ("C:C")->{ColumnWidth} = "15"; $App{sheet}-> Cells(1,3)->Font->{FontStyle}='bold'; $App{sheet}-> Cells(1,3)->{value}='Size'; $App{sheet}-> Columns ("D:D")->{ColumnWidth} = "15"; $App{sheet}-> Cells(1,4)->Font->{FontStyle}='bold'; $App{sheet}-> Cells(1,4)->{value}='Num of Files'; $App{sheet}-> Columns ("E:E")->{ColumnWidth} = "15"; $App{sheet}-> Cells(1,5)->Font->{FontStyle}='bold'; $App{sheet}-> Cells(1,5)->{value}='Num of Folders'; $App{sheet}-> Columns ("F:F")->{ColumnWidth} = "30"; $App{sheet}-> Cells(1,6)->Font->{FontStyle}='bold'; $App{sheet}-> Cells(1,6)->{value}='Remarks'; $App{xls}-> ActiveWorkbook -> SaveAs( $Data{server}); } sub worker { my $fso = Win32::OLE->new( 'Scripting.FileSystemObject' ); Win32::OLE->Option( Warn => 0 ); while( my $folder = $Qwork->dequeue ) { last if $folder eq 'DIE_NOW'; my @folders = $fso->GetFolder( $folder ); my $size = $folders[0]->size() || 'Permission denied'; my( $cFiles, $cFolders ) = (0) x 2; while( my $sub = pop @folders ) { $cFiles += $sub->Files->Count || 0; $cFolders += $sub->SubFolders->Count || 0; for my $subsub ( in $sub->SubFolders ) { $cFolders += $subsub->Files->Count || 0; push @folders, $_ for in $subsub->SubFolders; $cFolders += $subsub->SubFolders->Count || 0; } } $Qresults->enqueue( "$folder - size: $size Files: $cFiles Fold +ers: $cFolders" ); yield unless $Qwork->pending; } undef $fso; return 1; } sub size_info { print"\nRetrieving size information;\n\n"; my $Share; $App{row} =2; $Data{count2} =0; my $NumofShr = 1 + $#ShareLst ; while (@ShareLst) { #$Tk{MW}= MainWindow->new (-title => "$Data{server}"); #$Tk{UpperFrame} = $Tk{MW}->Frame; #$Tk{UpperFrame}->pack (qw/-side top -fill both -padx 5 -pady +5/); #$Tk{progress}->pack(-fill => 'x'); my @PathLst = (); for (1..$max) { last if ($Data{count2} == $NumofShr); $Share = shift @ShareLst; $App{col}=1; $App{sheet}-> Cells ($App{row}, $App{col}++)-> {value}=$Sh +are->{netname}; $App{sheet}-> Cells ($App{row}++, $App{col}++)-> {value}=$ +Share->{path}; $Share->{path} =~ s/:/\$/; my $Target = "\\\\" . $Data{server} . "\\" . $Share->{path +}; print ++$Data{count2} . ") $Target\n"; push (@PathLst, $Target); } my @threads = map{ threads->new( \&worker ) } 1 .. @PathLst; $Qwork->enqueue( @PathLst ); yield until $Qresults->pending(); for( 1 .. @threads ) { print $Qresults->dequeue; } $Qwork->enqueue( 'DIE_NOW' ) for 1 .. @threads; _exit 0; } } $Tk{MW}= MainWindow->new (-title => "$0: Processing "); $Tk{UpperFrame} = $Tk{MW}->Frame; $Tk{LowerFrame} = $Tk{MW}->Frame; $Tk{FieldName} = $Tk{UpperFrame}->Label(-text=>'Server Name'); $Tk{Entry} = $Tk{UpperFrame}->Entry( -textvariable=> \ $Data{server}); $Tk{GoButton} = $Tk{LowerFrame}->Button( -text=>'Go', -borderwidth=>5, -width=>10, -command=>sub { syntax( ) if ($Data{server} eq "") +; system("cls"); + print "\n\nObtaining $Data{server} + share information.\n\n"; $Tk{MW}->destroy; get_shares; prep_OLE; + size_info; $App{xls}-> ActiveWorkbook -> Save +As( $Data{server}); Win32::MsgBox("Operation completed + successfully",32,"Share Info"); exit( ); }); $Tk{progress} = $Tk{UpperFrame}->ProgressBar( -length => 200, -width => 20, -anchor => 'w', -borderwidth => 1, -highlightthickness = +> 1, -from => 0, -to => 100, -blocks => 100, -colors => [ 0, 'navy', + ], -variable => \$Tk{percen +t_done}); $Tk{UpperFrame}->pack (qw/-side top -fill both -padx 5 -pady 5/); $Tk{LowerFrame}->pack (qw/-side bottom -fill both -padx 5 -pady 5/); $Tk{FieldName}->pack (qw/-side left/); $Tk{Entry}->pack (qw/-side right/); $Tk{GoButton}->pack (qw/-side bottom -fill both -padx 5 -pady 5/); MainLoop; exit 1;
      Thanks in advance

      Humbled by your knowledge,