What follows doesn't involve a lot of code, but it demonstrates some nice things you can do with Perl's powerful, flexible, concise syntax. It is something I used in a project of mine where I am generating a lot of C++ code.

The code generates a large number of source and header files. In its first incarnation, each file was just opened and closed as needed, like this:

open SRC, "> file";
# print lots of stuff
close SRC;

This was fine for a while, but as time went on the thing got slower and slower, as the number of files grew. Also the sounds coming out of the hard disc suggested that some optimisation was in order. Hmm.

So here's a wish list for some improvements;

Here's how it was done:

{
    my %fhs;

    sub END {
        foreach (keys %fhs) {
            my $fh = $fhs{$_};
            if ($_ =~ /\.h\.new$/) {
                print $fh "#endif\n";
            }
            close $fh;
            if ($_ =~ /\.h\.new$/) {
                Clearcase::handleClearCaseElement($_);
            }
        }
    }

    sub openFile {
        my ($file, $header) = @_;
        unless (defined $fhs{$file}) {
            my $f;
            open ($f, "> $file") || die "could not open $file for output $!";
            $fhs{$file} = $f;
            print $f $header;
        }
        return $fhs{$file};
    }
}

It's pretty simple. The END block handles cleanup, adds the closing #endif to header files, and calls some Clearcase handling routines which take care of version control. The %fhs hash is private to the END block and the openFile sub. openFile takes a second argument holding the header text for the file in my system, which is written once when the file is first opened. You could change this around to fit your requirements.

The other code has an easy time now. Whenever a certain file is wanted for output, just call my $fh = openFile('myFilename.cc', $header) and start printing to $fh. Job done.
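As a rough illustration of that calling pattern, here is a minimal self-contained sketch. It is not the project code: the Clearcase step is left out, the temporary directory and the file contents are made up for the demo, and closeAll is a hypothetical helper standing in for the END block so the files can be flushed before the script ends.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

{
    my %fhs;    # filename => open filehandle, private to the subs below

    # Return the cached handle for $file, opening it (and writing
    # $header once) on the first request only.
    sub openFile {
        my ($file, $header) = @_;
        unless (defined $fhs{$file}) {
            open(my $f, '>', $file)
                or die "could not open $file for output: $!";
            print $f $header;
            $fhs{$file} = $f;
        }
        return $fhs{$file};
    }

    # Hypothetical stand-in for the END block: close everything now.
    sub closeAll {
        close $_ for values %fhs;
        %fhs = ();
    }
}

my $dir  = tempdir(CLEANUP => 1);       # demo-only scratch directory
my $file = "$dir/myFilename.cc";

# Two unrelated parts of the generator ask for the same file.
my $fh1 = openFile($file, "// generated file\n");
print $fh1 "int a;\n";
my $fh2 = openFile($file, "// generated file\n");
print $fh2 "int b;\n";

closeAll();
```

Both calls hand back the same handle, so the header is written once and the two print statements land in the same file in call order.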

The script is back down to a few seconds to run as a result of this (from over 30 seconds) and the disc is a lot happier. The code is much more readable too.

It would be wise, if the number of files gets very high, to extend this code to check that the number of open filehandles does not get too close to the system limit. It's possible to imagine some smart mechanism where the least used filehandles are closed and only reopened when requested again ... as I don't need this right now, I'll leave this as an exercise for some future monk ...
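One possible shape for that exercise, offered only as a sketch: keep a list of open files in least-recently-used order, close the oldest when a cap is reached, and reopen in append mode on the next request. The cap $MAX_OPEN, the %seen and @lru bookkeeping, and the openCount and closeAll helpers are all assumptions made for this example; none of them is in the original code.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

{
    my $MAX_OPEN = 2;   # assumed cap, tiny for the demo; keep it well below the real limit
    my %fhs;            # filename => currently open handle
    my %seen;           # filenames already created and given their header
    my @lru;            # open filenames, least recently used first

    sub openFile {
        my ($file, $header) = @_;

        if ($fhs{$file}) {
            # Already open: move it to the most-recently-used end.
            @lru = ((grep { $_ ne $file } @lru), $file);
            return $fhs{$file};
        }

        # At the cap: close the least recently used handle to make room.
        if (@lru >= $MAX_OPEN) {
            my $victim = shift @lru;
            my $old    = delete $fhs{$victim};
            close $old or die "could not close $victim: $!";
        }

        # First request truncates and writes the header; a reopen appends.
        my $mode = $seen{$file}++ ? '>>' : '>';
        open(my $f, $mode, $file)
            or die "could not open $file for output: $!";
        print $f $header if $mode eq '>';

        $fhs{$file} = $f;
        push @lru, $file;
        return $f;
    }

    sub openCount { return scalar keys %fhs }

    sub closeAll {
        close $_ for values %fhs;
        %fhs = ();
        @lru = ();
    }
}

my $dir   = tempdir(CLEANUP => 1);      # demo-only scratch directory
my @files = map { "$dir/$_" } qw(a.cc b.cc c.cc);

# Three files with a cap of two, so handles are evicted and reopened.
for my $pass (1 .. 2) {
    for my $file (@files) {
        my $fh = openFile($file, "// header\n");
        print $fh "// pass $pass\n";
    }
}
my $open_at_peak = openCount();
closeAll();
```

The catch with this approach is that a handle may be closed behind the caller's back, so callers have to ask openFile for the handle each time they print rather than hold on to an old one. An END block that appends #endif would likewise need to reopen any evicted header files first.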

Re: handling multiple file handles for code generation
by blazar (Canon) on Sep 05, 2005 at 12:05 UTC
    foreach (keys %fhs) {
        my $fh = $fhs{$_};
        if ($_ =~ /\.h\.new$/) {
            print $fh "#endif\n";
        }
    The whole point of $_ is that of being the topicalizer: you either want
    for my $file (keys %fhs) {
        # ...
        print $fh whatever if $file =~ /\.h\.new$/;
    }
    or
    for (keys %fhs) {
        # ...
        print $fh whatever if /\.h\.new$/;
    }
    but your mixed form doesn't add to code readability, although -of course- it is not illegal. (I also took the liberty of rewriting the if condition as a statement modifier, as IMHO it is clearer that way.)

      I don't understand your point. AFAIK, $_ is just a variable that happens to have a certain well-known value when used inside a loop like this. Of course, using an explicitly named variable will always aid readability (if it is well named) but there is a balance between brevity and "names that mean something".

      So please clarify - what does "topicalizer" mean, and why does it make what I wrote wrong?

      (But thanks for at least making a comment after downvoting ... ;-)

        So please clarify - what does "topicalizer" mean, and why does it make what I wrote wrong?
        It means it acts much like the pronoun "it", that is, it is the implicit argument of many operators and functions. Thus
        $_ =~ /$regex/;
        is always equivalent to just
        /$regex/;
        but the latter is more concise, and is typically idiomatic of Perl. So it is most often more clear. If (you think) it is not, then an explicit variable name may be in order. And in that case you have to do
        $var =~ /$regex/;
        But then, if you have a bunch of matches (or substitutions or ...) to do, people at times even use a for loop just for its aliasing effect, e.g.:
        s/$rx1/foo/, s/$rx2/bar/, s/$rx3/baz/ for $var;
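A small runnable sketch of both points, with a made-up file name and made-up patterns standing in for $var and $rx1 to $rx3: the explicit and implicit match forms agree when $_ holds the same string, and because for aliases $_ to $var, the chained substitutions modify $var itself.

```perl
use strict;
use warnings;

my $var = 'foo.h.new';      # illustrative value only

# Explicit binding and the implicit-$_ form give the same answer
# when $_ holds the same string.
my ($explicit, $implicit);
for ($var) {
    $explicit = ($_ =~ /\.h\.new$/) ? 1 : 0;
    $implicit = /\.h\.new$/ ? 1 : 0;
}

# "for" aliases $_ to $var, so each s/// edits $var in place:
# foo.h.new -> foo.h -> bar.h -> bar.hpp
s/\.new$//, s/^foo/bar/, s/\.h$/.hpp/ for $var;

print "$var\n";             # prints bar.hpp
```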