Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm having a requirement to write a function, that can take a reference, and return me all given object references in that value. The function should do a deep search of the given reference (which could be a hash reference or an object), and return an array of objects. Before I go ahead, I would like to meditate on the Monastry Gates and seek the wisdom of monks; if a solution already exists, or a better solution is available.

Background: I have a requirement, that goes like this:


Object Log, when dumped, looks like this:
$VAR1 = bless({'options' => {'filehandle' => bless( \*Symbol::GEN2, 'FileHandle' ), 'filename' => 'a.log'},}, 'Log' );
Log object provides the methods filehandle() and filename() to set the above values (and ofcourse, a $log->print()). I have a large codebase, that I'm modifying to enable parallel processing of various parts, with minimum changes to the codebase. So the call:
$self->do_task()
becomes:
Process::spawn(\&do_task, $self);
The problem is, $self has references to the log object at various places. I want all the output of the spawned process to go to a new file. So I do something like this:
$new_log = new Log(filename => 'new.log'); $cur_log = $self->{log}; $self->{log} = $new_log; Process::spawn(\&do_task, $self); # This runs in background. $self->{log} = $cur_log; # Restore.
But alas !, $self also has various other references to log, embedded deep inside, like:
$self->{system}->{log}, $self->{filecache}->{system}->{log}
...etc. Since these log objects are created in the parent, at various points in the code, they all refer to the parent filehandle. I cannot change the parent code; So I need to do some gimmick to find out all these references to Log and replace them with $new_log.
So I plan to write something like Devel::ObjSearch, that can be used as:
@ref = Devel::ObjSearch($self, sub { return UNIVERSAL::isa($_[0], 'Log'); }); foreach (@ref) { $_->filename('new.log'); $_->filehandle($new_log); } .. spawn() ..
Anybody can provide a more enlightened solution ? Any hints to how I can develop Devel::ObjSearch ? (I plan to steal code from Data::Dumper, or implement a hook in dumper, if possible).
thanks in advance.

Replies are listed 'Best First'.
Re: searching for an object reference
by Mostly Harmless (Initiate) on Jul 05, 2005 at 05:52 UTC
    Alas, I intended to post this with my ID, but somehow it went under Anonymous Monk.
Re: searching for an object reference
by simonm (Vicar) on Jul 05, 2005 at 18:10 UTC
    Updated: minor code changes and test harness added.
    sub deep_search { my $target = shift; my $test = shift; # Avoid infinite loops by maintaining map of objects encountered local %SearchCache = ( undef => undef ) unless ( exists $SearchCache +{undef} ); return if ( $SearchCache{ $target } ++ ); # Determine data type of unblessed reference or blessed object my $ref = ref( $target ); if ( "$target" =~ /^\Q$ref\E\=([A-Z]+)\(0x[0-9a-f]+\)$/ ) { $ref = $ +1 } ( $test->( $target ) ? $target : () ), ( map { deep_search($_, $test) } grep { ref($_) } ( ($ref eq 'HASH') ? values %$target : ($ref eq 'ARRAY') ? @$target : ($ref eq 'REF' or $ref eq 'SCALAR') ? $$target : () ) ) } my $struct = { foo => bless( [ 0, 1, 2 ], 'Foo' ), bar => bless( [ 3, 4, 5 ], 'Bar' ), foozles => [ { foolish => bless( [ 6, 7, 8 ], 'Foo' ), } ], }; print join '', map "$_\n", deep_search( $struct, sub { ref( $_[0] ) eq + 'Foo' } )
      Thanks. I ended up writing a Data::ObjSearch yes'day. However, your's look simpler. I didn't know about 'REF' and didn't handle it. Otherwise it looks similar:
      package Data::ObjSearch; use strict; use Carp; sub new { my ($class, $sub) = @_; my $self = bless({}, $class); if (ref($sub) eq 'CODE') { $self->{sub} = $sub; } elsif ($sub and !ref($sub)) { $self->{sub} = sub { return UNIVERSAL::isa($_[0], $sub); }; } else { croak "Pass a subroutine reference or package name.\n"; } $self->{records} = []; $self->{seen} = {}; return $self; } sub search { my ($class, $ref, $sub) = @_; my $self = ref($class) ? $class : $class->new($sub); $self->analyze($ref); return @{$self->{records}}; } sub analyze { my ($self, $data) = @_; my $ref = ref($data); return if (!$ref); my ($pack, $type, $id) = $data =~ /^(?:(.*)\=)?([^=]*)\(([^\(]*)\)$/ +; return if ($id && $self->{seen}->{$id}); $pack && ($self->check_and_push($data), $ref = $type); $id && ($self->{seen}->{$id} = 1); if ($ref eq 'HASH') { while (my ($key, $val) = each %{$data}) { $self->analyze($val); +} } elsif ($ref eq 'ARRAY') { map { $self->analyze($_) } @$data; } elsif ($ref eq 'SCALAR') { map { $self->analyze($_) } $$data; } } sub check_and_push { my ($self, $data) = @_; push(@{$self->{records}}, $data) if $self->{sub}->($data); } 1;
      I was thinking of uploading it to CPAN to avoid maintaining too generic code with our application codebase.
        Unless you have some expectaton of subclassing this, the object orientation seems like overkill.

        Also, for general purpose use, UNIVERSAL is probably better than the stringification/regex approach, as it plays better with string overloading.

        A further speed improvement can be gained by using list processing to avoid recursive calling:

        use strict; use UNIVERSAL 'isa'; sub deep_grep (&@) { my $test = shift; my ( @results, %seen ); while ( scalar @_ ) { my $target = shift @_; push @results, grep $test->( $_ ), $target; unshift @_, $seen{$target} ++ ? () : isa($target, 'HASH') ? values %$target : isa($target, 'ARRAY') ? @$target : isa($target, 'REF') ? $$target : isa($target, 'SCALAR') ? $$target : (); } return @results; }

        The prototype allows you to call it blockwise, like the builtin grep, but isn't needed:

        my @loggers = deep_grep( sub { isa( $_, 'Log' ) }, $target ); my @loggers = deep_grep { isa( $_, 'Log' ) } $target;

      Perhaps you might want to attack this from the other end. Instead of deep-searching to find references to the class, just dynamically update the log class to point where you want, overriding only the critical methods:
      package Log; sub new { my ($class) = shift; bless {}, $class; } sub write { print "write old\n"; } sub open { print "open old\n"; } sub close { print "close old\n"; } 1; package NewLog; sub import { for my $sub (qw(new write open)) { no strict 'refs'; *{'Log::'.$sub} = *{'NewLog::'.$sub}; } } sub new { my ($class) = shift; bless {}, $class; } sub write { print "write new\n"; } sub open { print "open new\n"; } 1; use Log; use NewLog; my $log = new Log; $log->open; $log->write; $log->close;
      If you run this, you'll see the following output:
      open new write new close old
      Note that open and write are overridden, and close is not. This incurs less overhead if you've got a lot of Log objects, or if your objects using them are very deeply nested.
        Thanks, that's a good idea. However I can't use it in my case, as there are too many methods in Log.pm to override - they all use couple of instance variables, which should have new values. Ideally this would work for me:
        package Redirector; my $new_obj; # Redirect all methods calls of object A to object B. sub redirect { my $new_obj = shift; my @methods = @_ || 'all'; <instrument the methods so that Redirector::foo is called instead of old_obj::foo> } sub print { my $real_obj = shift; # Conveniently switch the object. $new_obj->print(@_); }
        The issue with the above is keeping track of $new_obj. This needs to be a package variable, and the class itself has to be a singleton - unless a hash with package names as keys. Any ideas ? For now, deepcopying looks the simplest solution to me, as the data to be traversed is small in size. But I think your idea can be generalized to a Redirector CPAN module !