c has asked for the wisdom of the Perl Monks concerning the following question:

Sorry to be on the simple side, but I've only recently entered the monastery, and not long before that picked up my first book on Perl. I am in something of a healthy discussion with a colleague of mine who is also new to Perl. A need came up for a script that takes every file in a directory whose name is a 10.x.x.x IP address and copies it to a file named after the corresponding hostname. I know that Perl's mantra is 'there's more than one way to do it', but his method doesn't seem to take advantage of Perl's strengths and makes a lot of calls to system():
#!/usr/bin/perl -w
system("ls -1 10.* > hostfilelist");
open (LIST, "hostfilelist");
push(@entrys, <LIST>);
foreach my $entry (@entrys) {
    open (CURRENT, "$entry");
    my @contents = <CURRENT>;
    @results = grep {/hostname/} @contents;
    #print $results[0];
    @outputname = split (/ / , $results[0]);
    #print $outputname[1];
    system("touch $outputname[1]");
    open (OUTPUTFILE, ">$outputname[1]");
    print OUTPUTFILE "@contents";
    close (LIST);
    close (CURRENT);
    close (OUTPUTFILE);
    unlink ("hostfilelist");
}
my version of the same process is as follows:
#!/usr/bin/perl -w
use strict;
my @entrys;
my $entry;
my $host;
#to specify the directory, just uncomment the next line and put in the
#real path to the dir, then comment out, or erase the opendir line that
#references "."
#opendir DIR, "/path/to/dir/with/configs" or die "Cant open dir $!\n";
opendir DIR, "." or die "Cant open dir $!\n";
@entrys = readdir DIR;
#uncomment the next line if you are specifying a different directory
#other than the pwd.
#chdir "/path/to/dir/with/configs" or die "Cant change directory $!\n";
foreach my $entry (@entrys) {
    next if ( $entry !~ /^10/ );
    open CURRENT, $entry or die "Cant open file $entry $!\n";
    while (<CURRENT>) {
        $host = $_;
        if ( $host =~ /^hostname/ ) {
            $host =~ s/^hostname\s(.*)/$1/;
            chomp $host;
            open OUTPUTFILE, ">$host" or die "Cant create file $!\n";
            print OUTPUTFILE <CURRENT>;
            close OUTPUTFILE;
        }
    }
    close CURRENT;
}
he argues that my script is longer and takes more time and memory. i see his as making several system calls and multiple arrays. granted both work and neither would rock the processor, but i am really looking for some kind of justification that i am on the right path towards improving what skills i have and thinking about more than just "did it work?".

humbly, -c

Re: A question of efficiency
by MZSanford (Curate) on Jul 18, 2001 at 18:20 UTC
    Well, there is more than one way, but as for speed, every time system() is called a new process is created. This means the kernel needs to allocate memory, load the executable, open file descriptors, add the process to the internal scheduler and a slew of other things related to process creation. This all takes time. So, using system() when not needed, I have found, inflicts a hefty penalty. The best test is, as always, the ever-present use Benchmark;. I would put them in and give it a whirl ... I am guessing the pure-Perl version will be faster ... not to even get into the subject of portability, as that would turn into a huge rant :-)
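A rough sketch of what such a Benchmark run might look like, assuming a Unix-like system where ls is available. The labels backticks_ls and readdir and the iteration count of 200 are illustrative choices, not anything from the scripts above, and the numbers will vary from machine to machine:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Compare listing the current directory through an external process
# with listing it through Perl's own readdir.
my $results = timethese(200, {
    # Spawns a new "ls" process on every iteration (assumes ls exists).
    backticks_ls => sub {
        my @names = `ls -1`;
    },
    # Stays inside the Perl process.
    readdir => sub {
        opendir(my $dh, '.') or die "Can't open dir: $!";
        my @names = readdir $dh;
        closedir $dh;
    },
}, 'none');    # 'none' suppresses the per-test timing lines

# Print a comparison table of iterations per second.
cmpthese($results);
```

Benchmark may warn about too few iterations for the faster case; raising the count is the usual remedy.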
    OH, a sarcasm detector, that’s really useful
Re: A question of efficiency
by premchai21 (Curate) on Jul 18, 2001 at 18:25 UTC
    Good for you! Your colleague's version is unportable because it relies on Unix tools to be there, and depending on the system, creating processes (e.g. with system) can be expensive, CPU-wise / memory-wise. Your version is rather better, staying more within Perl, using the fact that Perl has already figured out how to perform certain operations to your advantage. Though it can still be cleaned up a bit (for instance you could use Getopt::Long to get the directory, and .* is generally ungood (you could use s/^hostname\s// and/or add a + after \s if you want to match one or more whitespace characters (but this is rambling and nested and should be stopped, and so (being the way it is) will be stopped now))), you are well on your way.

    And, c, welcome to Perlmonks.
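To make the regex suggestion concrete, here is a small sketch comparing the substitution from the original script with the shorter s/^hostname\s+// form. The sample line and the hostname core-router-01 are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical config line; the hostname is invented for this example.
my $line = "hostname core-router-01\n";

# Approach from the original script: capture the rest of the line with .*
(my $with_capture = $line) =~ s/^hostname\s(.*)/$1/;
chomp $with_capture;

# Suggested alternative: just strip the leading keyword and whitespace.
(my $with_strip = $line) =~ s/^hostname\s+//;
chomp $with_strip;

print "capture: $with_capture\n";
print "strip:   $with_strip\n";
```

Both forms leave the same hostname here; the second simply asks the regex engine to do less.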

      I just want to add my two cents and point out that the anti-.* attitude here threatens to instill voodoo programming ideas in newcomers...

      .* is often non-optimal, but relying on dogmatic rules is a very bad way of choosing how to do it.

      Isn't the point of having many ways to do it the fact that the best way depends on context? Rather than encouraging bondage, I think it is healthier to encourage understanding of the cases where it causes problems.

      In this case you're right that it would be more efficient to replace it, but not because .* is bad... simply because it's more work than is needed. On the bright side, since the regex is anchored the .* is a very small performance hit, not the order of magnitude that is caused by truly awful uses of it.
      --
      Snazzy tagline here

        Note: None of what follows applies to the original poster who started this thread. This is merely my following up on a thought that Aighearach sparked.

        Aighearach wrote:

        I just want to add my two cents and point out that the anti-.* attitude here threatens to instill voodoo programming ideas in newcomers...

        I have to confess that I'm rather ambivalent about this. I'm starting to get to the point where I don't even want to bother to point out good programming practices. Obviously, simply saying "don't use dot star" or "you must use strict" is not sufficient. premchai21 did provide a link to back up what was said, but I'm discovering more and more that people don't give a fig about how to program well.

        That raises an interesting question: do we just answer questions for people, or do we take the trouble to care about the quality of answers? Dominus, if I understood him correctly, seems to argue for the "just answer the question" camp (in a reply to a review that I wrote about Perl and CGI for the World Wide Web). Specifically, he wrote:

        I've heard plenty of arguments that you have to learn these style rules right from the beginning, apparently from people who think that if you once turn down the Path of Darkness your Soul is Lost Forevermore, and I think it's bullshit.

        I differ on this. Have you ever studied chess with an expert? Or the martial arts, tennis, or anything else that requires great skill? One of the most common complaints from experts in various fields is that they are sick and tired of trying to get people to "unlearn" bad habits. Quite often, these people are a lost cause. Teach them right from the beginning and you don't have to fight that fight.

        That being said, so what? I, for one, am getting discouraged at giving answers and being told that I'm over the top (which I may very well be). I constantly see newbies telling more experienced people that they don't want to use strict. They don't need to use warnings. <sarcasm>They know exactly what they are doing but won't we please, please point out the one little bug in their program and then they'll ignore the other advice because they're clearly good enough to do without it anyway.</sarcasm>

        If you check out my posts, you'll notice that I haven't been posting as much lately. That's due, in large part, to this issue. I'm still showing up and reading the threads because I want (need) to continue to improve my Perl. Plus, I have a lot of friends that I've met here that I want to keep in touch with, but I'm getting burnt out on helping people who don't want to be helped.

        I suppose I'm probably just in a funk right now and I'll snap out of it.

        Cheers,
        Ovid

        Vote for paco!

        Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Re: A question of efficiency
by bikeNomad (Priest) on Jul 18, 2001 at 18:21 UTC
    Does your script take more time and memory? Remember that you have to count the time and memory used by external processes, as well.

    Also, you should probably use glob() to get the same results, rather than reading all the filenames and filtering them:

    foreach my $entry (glob('10.*')) { }
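A self-contained sketch of the glob() approach, for anyone who wants to try it without touching real config files. It builds a scratch directory with File::Temp; the file names 10.0.0.1, 10.0.0.2 and notes.txt are invented for the demonstration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

# Scratch directory with a few hypothetical files; removed at exit.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(10.0.0.1 10.0.0.2 notes.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $name: $!";
    close $fh;
}

# glob() expands the pattern much as the shell would, so only names
# starting with "10." come back and no manual filtering is needed.
my @entries = sort map { basename($_) } glob("$dir/10.*");

print "$_\n" for @entries;
```

Compared with readdir plus a regex, the pattern itself does the filtering, and entries such as . and .. never show up.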
Re: A question of efficiency
by Hofmator (Curate) on Jul 18, 2001 at 21:12 UTC

    I would write it like this - making some assumptions as your two versions do (slightly) different things:

    #!/usr/bin/perl -w
    use strict;
    use File::Copy;
    while (<>) {
        if (/^hostname/) {
            chomp;
            # substr($_,9) extracts everything after 'hostname '
            copy ($ARGV, substr($_, 9));
            close(ARGV);
        }
    }
    If this script is saved as 'hostname.pl' you call it something like hostname.pl 10.* and let the shell handle the filename globbing.
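The script leans on three pieces of Perl's <> machinery: <> reads each file named in @ARGV in turn, $ARGV holds the name of the file currently being read, and close(ARGV) abandons the current file so the next read starts on the following one. The sketch below demonstrates that behaviour on two invented files; it sets @ARGV by hand instead of taking it from the command line, and records the hostnames in a hash rather than copying anything:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

# Two hypothetical config files in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my %config = (
    '10.0.0.1' => "version 1\nhostname alpha\nend\n",
    '10.0.0.2' => "hostname beta\nend\n",
);
for my $name (sort keys %config) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $name: $!";
    print {$fh} $config{$name};
    close $fh;
}

# Stand-in for "hostname.pl 10.*": fill @ARGV ourselves.
@ARGV = sort glob("$dir/10.*");

my %host_for;
while (my $line = <>) {
    next unless $line =~ /^hostname\s+(\S+)/;
    # $ARGV is the name of the file <> is currently reading.
    $host_for{ basename($ARGV) } = $1;
    # Skip the rest of this file; the next <> opens the next one.
    close(ARGV);
}

print "$_ => $host_for{$_}\n" for sort keys %host_for;
```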

    -- Hofmator