Re: Re: Closures and callbacks...

I'd like to point out that all the forms of read_row presented above are in fact closures. They are all closed over the lexical variable @indexes.

nothingmuch's explanation of closures is not correct. A closure does not contain a "copy" of the lexical variable, nor is the variable "kept independently for each instance". Quite the opposite: a closure holds a reference to a lexical variable declared outside its own body, and every closure created in that scope refers to the exact same variable that existed when the closures were created.
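
To make the "same variable, not a copy" point concrete, here is a minimal sketch. The names $increment, $peek and $count are made up for illustration and do not come from the original post. Two anonymous subs are created in the same block, and a change made through one is visible through the other:

```perl
use strict;
use warnings;

# Hypothetical names for illustration; not from the original thread.
my ( $increment, $peek );
{
    my $count = 0;                         # one lexical, visible only in this block
    $increment = sub { return ++$count };  # modifies $count
    $peek      = sub { return $count };    # only reads $count
}

$increment->();
$increment->();
print $peek->(), "\n";   # prints 2: $peek sees the changes made through $increment
```

If each closure held its own copy, $peek would still report 0.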

For variables and functions which are instantiated at compile time, as in the original post, the situation is very simple. Here are two functions that are both closed over the same variable:

my @indexes = qw (3 0 4 5 6 7);
open INDATA,"< my.dat" or die "my.dat: $!";
sub read_row {
   my $row = <INDATA>;
   return unless defined($row);
   return (split(/\s+/,$row))[@indexes];
}
sub get_indexes {
   return @indexes;
}

Closures get more useful, but more tricky, when the functions in question are generated at runtime. Here's how I would make the original poster's code more useful by creating the closure on the fly:

use IO::File;

sub process_each_file_line
{
  my( $filename, $preprocess_cb, $process_cb ) = @_;
  my $fh = IO::File->new( $filename, "r" )  # explicit mode: odd filenames can't alter it
    or die "open $filename for reading: $!";
  while ( <$fh> )
  {
    $process_cb->( $preprocess_cb->() );
  }
}

sub process
{
  my( $filename, @indexes ) = @_;
  my @result;
  process_each_file_line(
    $filename,
    sub { (split)[ @indexes ] },
    sub { local $" = ','; push @result, "test : @_\n"; }
  );
  return @result;
}

# call the function:
print process( "my.dat", qw( 3 0 4 5 6 7 ) );

Within the process function, the first anonymous sub (sometimes referred to as a "lambda" -- don't ask) is closed over the @indexes variable, and the second one is closed over @result.
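
One detail the code relies on without saying so: the bare split in the first callback takes no arguments, so it operates on $_, which the while ( <$fh> ) loop in process_each_file_line sets to the current line. $_ is a global, not something the callback is closed over; only @indexes is captured. A small sketch of the same mechanism follows, using an in-memory filehandle and made-up data (both are assumptions, chosen so it runs without my.dat):

```perl
use strict;
use warnings;

# In-memory "file" with made-up contents, standing in for my.dat.
my $data = "a b c d\ne f g h\n";
open my $fh, '<', \$data or die "open in-memory file: $!";

my @indexes = ( 2, 0 );
my $preprocess_cb = sub { (split)[ @indexes ] };   # bare split reads $_

my @rows;
while ( <$fh> ) {                  # assigns each line to the global $_
    push @rows, join ',', $preprocess_cb->();
}
print "$_\n" for @rows;            # prints "c,a" then "g,e"
```

Passing the line to the callback as an explicit argument would work just as well and would make the data flow more visible; the version in the post simply trades that for brevity.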

Since @indexes and @result are scoped to the process function, new instances of them are created each time process is called, and the two closures that use them have the same lifetimes. What happens if the closures live longer than the variables they're closed over? Perl does the right thing: it keeps the lexical variables alive for as long as any closure still refers to them. It's all just reference-counting, after all. A (rather contrived) example follows.

use IO::File;

sub process_each_file_line
{
  my( $filename, $preprocess_cb, $process_cb ) = @_;
  my $fh = IO::File->new( $filename, "r" )  # explicit mode: odd filenames can't alter it
    or die "open $filename for reading: $!";
  while ( <$fh> )
  {
    $process_cb->( $preprocess_cb->() );
  }
}

sub make_file_processor
{
  my( $filename, @indexes ) = @_;
  return sub {  # make a closure which takes no args.
    my @result;
    process_each_file_line(
      $filename,
      sub { (split)[ @indexes ] },
      sub { local $" = ','; push @result, "test : @_\n"; }
    );
    return @result;
  }
}

# create the processor for the given input params:
my $processor = make_file_processor( "my.dat", qw( 3 0 4 5 6 7 ) );

# later on, call the closure:
print $processor->();

In this example, we actually have two levels of closures. The third argument to process_each_file_line (the second lambda) is closed over @result, which exists one lexical level above. But the second argument (the first lambda) is closed over @indexes, which exists two lexical levels above. Not that it's any big deal; perl lets us do this, and takes care of all the nasty reference bookkeeping.
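
The same two points (a fresh variable per call, kept alive after the enclosing sub returns) can be seen in a much smaller sketch that needs no input file. make_counter is a hypothetical helper, not part of the code above:

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper, not part of the original code.
sub make_counter {
    my $count = shift;               # a fresh $count for every call
    return sub { return $count++ };  # the closure outlives this call
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# make_counter has returned, yet each closure still holds its own $count.
my @seen = ( $from_one->(), $from_one->(), $from_ten->(), $from_one->() );
print "@seen\n";   # prints "1 2 10 3"
```

Each call to make_counter creates a new $count, so the two counters advance independently, and each $count survives because the returned closure still refers to it.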

Disclaimer: The above code was not tested before posting, and may contain bugs.
