dbonneville has asked for the wisdom of the Perl Monks concerning the following question:

I built my first Perl application that parses XML and creates a website of over 1000 pages from that XML. The build process is really fast. As a last step, I wanted to take all the generated files and copy them to a new folder and then zip them up for another process to pick up and FTP to another location.

I had trouble getting a recursive copy of everything, and inquired here at PerlMonks, where someone suggested, since I was using ActiveState, that I use cp_tree from the ActiveState::Handy module.

I installed it and ran it locally. It copied all 1000 files to another directory (in the same root folder of the project, same hard drive, etc) in maybe a minute. Great so far!

I took the entire Perl application and put it on a network box so other people can build the site. I mapped a drive (Windows, of course) to the folder on the network, and executed the program. The entire build process was as fast as, if not a little faster than, on my laptop.

But...when it came time to execute the last section of the code, where I use cp_tree to copy the newly generated files to an "out" folder, the process completely bogs down.

I can watch the out folder slowly fill up with files, about 1 per second or something like that. Now, the simple copy command is taking about 10-12 minutes from start to finish, whereas on my laptop, it only takes about 2 minutes.

Any clue what could be going on? Does Perl behave differently if you invoke it from a mapped drive or network resource? Why would one section of code fly on one machine and be so slow, impossibly slow, on another?

Thanks for any help. I'm brand new to Perl and this is my first project with it.

Thanks...

Doug

Re: Very slow ActiveState::Handy::cp_tree problem
by GrandFather (Saint) on Oct 11, 2007 at 20:44 UTC

    It's more likely that you are hitting performance problems due to shunting files across the network than that Perl is behaving differently in the two contexts. It may be that ActiveState::Handy::cp_tree is doing more work across the network than it needs. It would be interesting to benchmark something like xcopy doing the same job.


    Perl is environmentally friendly - it saves trees
      So if I run the script as accessed through a mapped drive (Windows) it will run much slower than if I telnet into the box and run the script?

      I looked up xcopy - sorry I'm a newb still - and noticed on CPAN it was under something called geotiger. Is xcopy part of the ActiveState Perl I have running (latest version)?

      Thanks,

      Doug

        xcopy is a Windows command line file copy tool. I was suggesting that a sanity check for the performance of the code would be to manually do the same copy task using Windows system tools, to check that the time to do the raw copy is the issue and not some overhead in your script.
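For the Perl side of that comparison, something like the following could serve as a baseline. It is only a rough sketch using core modules (File::Find, File::Copy, File::Path, Time::HiRes), not what cp_tree does internally; the names copy_tree and timed are made up for this example, and the xcopy line in the comment is one plausible way to run the manual check.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Copy qw(copy);
use File::Path qw(make_path);
use File::Spec;
use Time::HiRes qw(time);

# Hypothetical helper: copy every file under $src into $dst, keeping the
# directory layout. Returns the number of files copied.
sub copy_tree {
    my ($src, $dst) = @_;
    my $count = 0;
    make_path($dst);
    find({
        no_chdir => 1,
        wanted   => sub {
            my $rel = File::Spec->abs2rel($File::Find::name, $src);
            return if $rel eq File::Spec->curdir;    # skip the root itself
            my $target = File::Spec->catfile($dst, $rel);
            if (-d $File::Find::name) {
                # find() visits a directory before its contents
                make_path($target);
            }
            else {
                copy($File::Find::name, $target)
                    or die "copy $File::Find::name -> $target failed: $!\n";
                $count++;
            }
        },
    }, $src);
    return $count;
}

# Hypothetical helper: run a code ref, return (elapsed seconds, its result).
sub timed {
    my ($code) = @_;
    my $start  = time();
    my $result = $code->();
    return (time() - $start, $result);
}

# Usage: perl timecopy.pl <source dir> <destination dir>
if (@ARGV == 2) {
    my ($src, $dst) = @ARGV;
    my ($secs, $n) = timed(sub { copy_tree($src, $dst) });
    printf "Perl copy: %d files in %.2f s\n", $n, $secs;
    # Then time the same job by hand with the Windows tool, e.g.:
    #   xcopy <source dir> <other destination dir> /E /I /Q
}
```

If xcopy and this script both crawl on the mapped drive, the network is the likely culprit; if only the Perl copy is slow, the script deserves a closer look.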

        Running the script using a mapped drive implies that the local machine is doing the work and that all the data it manipulates moves across your network connection. Running the script using telnet implies that the script runs on the remote box and all the data is manipulated locally to the remote box. It is the overhead of shunting large amounts of data across the network that I suspect is causing your issue.
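To put rough numbers on that overhead, the timings quoted in the question can be turned into a per-file cost. This is back-of-the-envelope arithmetic only: the file count of 1000 is an assumption (the post says "over 1000 pages"), and secs_per_file is a name invented for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: average seconds per file for a copy that took
# $minutes in total.
sub secs_per_file {
    my ($minutes, $files) = @_;
    die "file count must be positive\n" unless $files > 0;
    return $minutes * 60 / $files;
}

my $files = 1000;    # assumption: "over 1000 pages", rounded

# Timings as reported in the original question.
my @runs = (
    [ 'laptop, local disk'        => 2  ],
    [ 'mapped drive, low figure'  => 10 ],
    [ 'mapped drive, high figure' => 12 ],
);

for my $run (@runs) {
    my ($label, $minutes) = @$run;
    printf "%-26s %.2f s per file\n", $label, secs_per_file($minutes, $files);
}
```

That works out to roughly 0.12 s per file locally against 0.6 to 0.72 s per file over the mapped drive, a handy yardstick when comparing against an xcopy run or a run started on the remote box itself.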

