Are array based objects fixed length?

by pdcawley (Hermit)
on Jul 26, 2002 at 09:12 UTC ( [id://185472]=perlquestion )

pdcawley has asked for the wisdom of the Perl Monks concerning the following question:

I know the theoretical answer to that -- No, array based objects do not have to have a fixed length. However, some background.

I'm working on yet another object persistence tool, called Pixie (blame Leon Brocard for the name, not me). One of the goals of the pixie project is to make things as simple as possible for the user (but, of course, no simpler). Amongst other things this means we want to supply simple locking.

The idea is that, if you're using a locking pixie, then fetching an object from the store will automatically grab a lock on it. That lock needs to be released at DESTROY time. This level of locking is, of course, simplistic, but it ensures that there should never be two 'active' copies of an object in play, which is generally a good thing.

So, how to handle releasing stuff at DESTROY time?

With a hash based object, things are relatively simple:

    package Pixie::ObjectInfo;
    ...
    sub DESTROY {
        ...
        $self->store->unlock($self);
    }

    package Pixie;

    sub get {
        my $self      = shift;
        my $object_id = shift;
        my $obj       = $self->really_get($object_id);
        # The info object lives and dies with $obj; its DESTROY releases the lock.
        $obj->{__Pixie} = Pixie::ObjectInfo->new(oid => $object_id);
        return $obj;
    }

So, when the fetched object goes out of scope, so does the Pixie::ObjectInfo it contains, which handles unlocking the object in the database (along with any other cleanup that pixie needs to do). Which is rather cute, though I do say so myself. Note that the above code is rather simplified compared to what's actually in Pixie.
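
To make the mechanism concrete, here is a minimal runnable sketch of the same guard-object idea. It is not Pixie's code: Demo::Store, Demo::ObjectInfo, take_lock and fetch are hypothetical names, and the "store" is just an in-memory hash that records which ids are locked.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real store: it only records lock state.
package Demo::Store;
sub new       { return bless { locked => {} }, shift }
sub take_lock { my ($self, $oid) = @_; $self->{locked}{$oid} = 1; return }
sub unlock    { my ($self, $oid) = @_; delete $self->{locked}{$oid}; return }
sub is_locked { my ($self, $oid) = @_; return exists $self->{locked}{$oid} ? 1 : 0 }

# Guard object: releasing the lock is tied to its own destruction.
package Demo::ObjectInfo;
sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}
sub DESTROY {
    my $self = shift;
    $self->{store}->unlock($self->{oid});
}

package main;

my $store = Demo::Store->new;

sub fetch {
    my ($oid) = @_;
    my $obj = bless { name => 'example' }, 'Some::Client::Class';
    $store->take_lock($oid);
    # The guard is owned by $obj alone, so it is destroyed along with it.
    $obj->{__Pixie} = Demo::ObjectInfo->new(oid => $oid, store => $store);
    return $obj;
}

{
    my $obj = fetch(42);
    print "locked inside scope: ", $store->is_locked(42), "\n";   # prints 1
}
print "locked after scope: ", $store->is_locked(42), "\n";        # prints 0
```

The client class needs no DESTROY of its own here; the cleanup rides on Perl's reference counting of the extra hash value.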

So, what happens when the class is a blessed arrayref? If we could rely on all pixie client classes to use fixed length arrays then it'd be a doddle, we could just do

    $obj->[@$obj] = Pixie::ObjectInfo->new;

But that means we have to assume that the array length is fixed and that the class's implementation will never use negative indices into the array.
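
A tiny demonstration of why those two assumptions matter, using a made-up array-based object (Some::Array::Class and the 'bookkeeping' string are placeholders for a real client class and the info object):

```perl
use strict;
use warnings;

my $obj = bless [ 'first', 'last' ], 'Some::Array::Class';

my $before = $obj->[-1];
$obj->[@$obj] = 'bookkeeping';    # append one slot past the current end
my $after = $obj->[-1];

print "class sees last element before: $before\n";   # last
print "class sees last element after:  $after\n";    # bookkeeping

# A class that treats its array as a stack would also remove the slot.
my $popped = pop @$obj;
print "pop returned: $popped\n";                      # bookkeeping
```

Any class that reads $self->[-1] or pops from its own array would therefore either see the extra slot or silently discard it.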

This seems to be a little too sweeping as assumptions go... There's always the trick of blessing the thing into a new 'pseudo anonymous' class and hanging the behaviour on there, but it's not exactly pretty.
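
For readers unfamiliar with the reblessing trick, a rough sketch follows. It is only one way the idea could look, not what Pixie does: manage(), the Demo::Managed::_N class names and the @released log are all invented for the example, and the generated classes are never cleaned up.

```perl
use strict;
use warnings;

my $counter = 0;
my @released;    # records which object ids were cleaned up (demo only)

# Rebless $obj into a freshly generated subclass of its own class.
# The subclass carries a DESTROY that runs the cleanup, then defers to
# any destructor the original class provides.
sub manage {
    my ($obj, $oid) = @_;
    my $orig_class = ref $obj;
    my $new_class  = "Demo::Managed::_" . ++$counter;

    no strict 'refs';
    @{"${new_class}::ISA"} = ($orig_class);
    *{"${new_class}::DESTROY"} = sub {
        my $self = shift;
        push @released, $oid;    # stand-in for unlocking in the store
        if (my $super = $orig_class->can('DESTROY')) {
            $self->$super();
        }
        return;
    };
    return bless $obj, $new_class;
}

package Demo::Stack;    # an array-based class that grows and shrinks
sub new { return bless [], shift }
sub add { my $self = shift; push @$self, @_; return $self }
sub top { return $_[0][-1] }

package main;

{
    my $stack = manage(Demo::Stack->new, 'oid-1');
    $stack->add(1, 2, 3);
    print "top: ", $stack->top, "\n";                                      # 3
    print "isa Demo::Stack: ", ($stack->isa('Demo::Stack') ? 1 : 0), "\n"; # 1
}
print "released: @released\n";                                             # oid-1
```

The array contents are never touched, so neither length nor negative indices matter. The cost is that ref($obj) no longer returns the original class name, which is one source of the ugliness mentioned above.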

So, what do you think? I'm especially interested in hearing from people who've actually used array based classes in anger. Do you use fixed length arrays? Negative indices? Is there a major CPAN class that breaks either of these assumptions?

Replies are listed 'Best First'.
Re: Are array based objects fixed length?
by demerphq (Chancellor) on Jul 26, 2002 at 10:09 UTC
    I have had some experience using arrays as objects. My objects were floating length and I have used two techniques to deal with issues like yours (although not to do with destruction). My experience is that the solution is determined by the way you use the arrays. For instance, in one solution my arrays needed to support push/pop, so the "common" elements were stored at fixed addresses starting at 0. In another solution, for reasons I don't exactly remember, I used negative indexes to reach common data.

    My guess is that you should probably do

    $obj->[0] = Pixie::ObjectInfo->new();

    After all the first element is always in the same location (versus the last) and this still allows push/pop which is slightly more efficient than shift/unshift.

    But I still say that without knowing the operations you need to perform on your objects (why pick arrays?) there is no way to give a really good answer.

    update
    I think you probably should just use a different route. _Demand_ that any client class must support a given method, let's say $obj->for_pixie(). This method MUST provide the functionality that when called with a value it stores it. Whenever it is called, it returns its current value. Something like

    sub for_pixie {
        my $self = shift;
        $self->[0] = shift if @_;
        return $self->[0];
    }

    This way the storage you need is transparent to you, and if the client class doesn't happen to be implemented as an array then all they have to do is change the method. You could even provide a set of "standard" base classes that already have this functionality (and a new() as well) for different implementations, e.g. Pixie::Client::Array, Pixie::Client::Hash, Pixie::Client::Filehandle etc...

    Yves / DeMerphq
    ---
    Writing a good benchmark isnt as easy as it might look.

      Um. No. An important design goal of Pixie was that you should be able to throw any object (for only slightly limited values of 'any') you like at pixie and the pixie would do its magic and store it. We already handle hash based objects quite handily, and array based objects that implement an _oid can be dealt with as 'top level' objects. If they don't implement _oid they can only be used in the contents of objects, subject to certain limitations.

      In general, the objects that are stored should never need to know that they've been persisted (they may want to know, but that shouldn't be a requirement). For instance, I have objects stored in Pixie, working quite happily, that have timestamps implemented using Time::Piece. Getting that to work took precisely no extra work on my part (apart from the work of implementing Pixie in the first place of course). Of course, there are CPAN modules that will need work to handle -- Set::Object, for instance, is a very handy class which can't be directly persisted by Pixie (which uses Data::Dumper to do a lot of its heavy lifting). However, we have a way forward, using $Data::Dumper::Freezer and $Data::Dumper::Toaster, which should make even 'pathological' objects tractable to being stored (though maybe not to being 'top level' objects), and which we can use to provide hooks to users who need them.

      Hmm... maybe I should prepare a proper meditation. Until then, you can take a look at the current (horribly undocumented) state of the Pixie art on CPAN.

Re: Are array based objects fixed length?
by Abigail-II (Bishop) on Jul 26, 2002 at 11:25 UTC
    If I understand it correctly, Pixie is taking a random object and trying to stuff information in it?

    You are doomed, man, you are doomed. But that's your own fault, because you break a cardinal rule of OO programming: encapsulation. Don't poke around in someone else's implementation.

    If you think array based objects are a problem, how do you want to deal with scalar based objects? Or filehandle based objects, like some (all?) of the IO:: classes?

    Of course, you are not free of problems with hash based objects either. Such an object might have already a "__Pixie" attribute. Or it has an overloaded stringify that's going to do something with all the key/value pairs. Not to mention all the existing techniques out there that prevent accessing/setting keys that aren't in a predefined set.

    Abigail
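
One concrete instance of the "predefined set of keys" objection can be reproduced with the core Hash::Util module. This is only an illustration: Demo::Strict is an invented class, and real classes may restrict their keys by other means (tied hashes, accessors that validate, and so on).

```perl
use strict;
use warnings;
use Hash::Util ();

package Demo::Strict;
sub new {
    my $class = shift;
    my $self  = bless { name => 'example' }, $class;
    Hash::Util::lock_keys(%$self);    # only 'name' is a legal key from now on
    return $self;
}

package main;

my $obj = Demo::Strict->new;

# Try to tuck a bookkeeping value into someone else's object.
my $ok = eval { $obj->{__Pixie} = 'bookkeeping'; 1 };
print $ok ? "stored\n" : "refused: $@";
```

With a restricted hash the assignment throws, so a tool that injects a key would need to trap the failure or fall back to another strategy for such classes.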

      Oh, I know. The idea is to come up with a sufficiently useful set of assumptions about the stored objects that will offer a useful set of behaviours. And those assumptions will be documented so users will be aware of the issues. The next step after that is to provide hooks so objects can, if necessary, be complicit with Pixie.

      Encapsulation is overrated anyway. Hell, even Smalltalk, the granddaddy of OO systems, provides tools (inspectors, browsers...) to get round encapsulation.

Re: Are array based objects fixed length?
by Joost (Canon) on Jul 26, 2002 at 10:26 UTC
    This is just a completely untested idea but the first solution that comes to my mind is to store the target object's array inside a wrapping object using tie, that way, if the targeted object is deleted, the tied array's DESTROY method is called.

    The code will probably not be pretty, but that way you might be able to add 'invisible' data (or objects) to another object.

    Hope this helps.

    -- Joost downtime n. The period during which a system is error-free and immune from user input.
      The problem with that approach is that it'd be horribly, horribly slow. The anonymous classes that clean up after themselves trick is a pain to program, but it's a one time cost paid only when the object is first fetched. The tied object cost is paid every time someone accesses the array.
        The problem with that approach is that it'd be horribly, horribly slow.

        Yes it will be :-)

        Maybe you could insert a DESTROY method into the targeted object, possibly aliasing any existing ones, and do the cleanup there?

        Something like:

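        no warnings 'redefine';
        if (my $old = ClassName->can('DESTROY')) {
            # Keep the original destructor reachable under a new name, then wrap it.
            *ClassName::OLD_DESTROY = $old;
            *ClassName::DESTROY = sub {
                cleanup_object($_[0]);
                goto &$old;    # hand off to the original with @_ intact
            };
        }
        else {
            *ClassName::DESTROY = \&cleanup_object;
        }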

        Update: this will not get you the object id... hmmm...

        Joost frowns, going to get another cup of coffee

        -- Joost downtime n. The period during which a system is error-free and immune from user input.
An alternative approach using weak references (Re: Are array based objects fixed length?)
by robin (Chaplain) on Jul 26, 2002 at 13:33 UTC
    From your description, I think your actual problem is to find a way of preventing the same (database) object from having more than one "live" realisation at a time.

    I agree with those other posters who believe that your suggested approach is doomed, because it makes too many assumptions about the way that objects - especially array-based objects - might be used. I suggest taking a different tack, outlined below.

    Keep a hash in the Pixie package, let's call it %objects, which uses the object ID number as a key and has a weak reference to the allocated object as its value. Whenever anybody calls get, the first thing you do is to iterate through your hash looking for undef values; if you find one, unlock the object in the database.

    One disadvantage of this approach is that objects won't be released immediately they go out of scope. Instead it will happen the next time somebody calls get. I'm not sure how much of a problem that is for you. Another potential problem is that scanning the hash might become slow if large numbers of objects have been allocated. Obviously you can trade these two problems off against each other: at one extreme, if timely release is not required, you could not bother to release objects at all until exit time. The hash would then just be used to check that no instance of the object is already in existence. Somewhere in between the two extremes, you could do a "garbage collection" run whenever the number of allocated objects exceeds a certain number.

    (In a multithreaded setting, one can also imagine a JVM-like approach where a background thread periodically checks for expired objects.)
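
A small sketch of the registry described above, using Scalar::Util::weaken. The helper names (get, sweep), the @unlocked log and the freshly blessed array standing in for a database fetch are all assumptions made for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my %objects;     # oid => weak reference to the live object
my @unlocked;    # demo stand-in for "unlock in the database"

# Release every oid whose object has gone away.
sub sweep {
    for my $oid (keys %objects) {
        next if defined $objects{$oid};
        push @unlocked, $oid;
        delete $objects{$oid};
    }
    return;
}

sub get {
    my ($oid) = @_;
    sweep();
    return $objects{$oid} if defined $objects{$oid};    # already live: share it

    my $obj = bless [ "payload for $oid" ], 'Some::Array::Class';
    $objects{$oid} = $obj;
    weaken($objects{$oid});    # the registry must not keep the object alive
    return $obj;
}

{
    my $first = get('a');
    my $again = get('a');
    print "same object: ", ($first == $again ? 1 : 0), "\n";    # 1
}
get('b');    # triggers the sweep that notices 'a' is gone
print "unlocked: @unlocked\n";                                  # a
```

Nothing is written into the stored object, so its representation does not matter. The release is only as prompt as the next call to get (or an explicit sweep), which is the trade-off discussed above.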

    I hope I haven't misunderstood the problem.

    .robin.

      Nice idea. I think I want more immediacy though, otherwise there could be problems with multiple processes where one process takes forever to release the lock, and a pile of other processes hang on that. However, a combined approach using blessing into custom classes might work (modulo the problem with spooky inaction at a distance, where one has to worry about both the representation of an object (hash, array, whatever) and whether it has overloading or not.)

      Time to lobby Hugo about some things for perl 5.9...

Re: Are array based objects fixed length?
by BrentDax (Hermit) on Jul 26, 2002 at 13:18 UTC
    I'd suggest that in the general case you use a Pixie::Array wrapper with an overloaded @{} operator. The overloaded @{} would return the host object's data; Pixie would know how to get its stuff out of a Pixie::Array.

    Or something like that. It's a pretty hard problem. *shrugs*
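
The wrapper idea might look roughly like this. Demo::ArrayWrapper stands in for the proposed Pixie::Array, and the realobj and oid fields are guesses at what such a wrapper would hold; method redispatch to the host class is left out.

```perl
use strict;
use warnings;

package Demo::ArrayWrapper;
use overload
    '@{}'    => sub { $_[0]->{realobj} },    # array deref reaches the host's data
    fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless { realobj => $args{realobj}, oid => $args{oid} }, $class;
}
sub PIXIE_getoid { return $_[0]->{oid} }

package main;

my $host    = bless [ 'x', 'y', 'z' ], 'Some::Array::Class';
my $wrapped = Demo::ArrayWrapper->new(realobj => $host, oid => 17);

push @$wrapped, 'w';                                 # lands in the host array
print "last element: $wrapped->[-1]\n";              # w
print "length: ", scalar(@$host), "\n";              # 4
print "oid: ", $wrapped->PIXIE_getoid, "\n";         # 17
```

Because the wrapper itself is a hash, its own fields stay out of the way of any array index the host class uses.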

    <thought type="evil">Could you set up an overloaded @{} in UNIVERSAL that would implement this behavior? That would solve the problem of Pixie::Array's blessing. Alternatively, Pixie::Arrays could keep track of their "real" blessing and redispatch:

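    package Pixie::Array;
    our $AUTOLOAD;
    sub PIXIE_getoid { return shift->{oid} }
    # ...
    sub AUTOLOAD {
        my $self = shift;
        (my $method = $AUTOLOAD) =~ s/.*:://;    # strip the package qualifier
        return if $method eq 'DESTROY';          # never redispatch destruction
        return $self->{realobj}->$method(@_);
    }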
    </thought>

    =cut
    --Brent Dax
    There is no sig.

      The problem with overloading @{} in UNIVERSAL is what happens when one of the stored classes has overloading in place?

      I have already discovered to my cost that reblessing and overloading do not play well together; you get all sorts of painful spooky inaction at a distance (ie: The spooky action at a distance you were hoping would happen doesn't).

      However, my boss recently played a blinder when he had the idea of using XS to hang sv_magic off the object which, (after a certain amount of buggering about due to problems with sv_unmagic being a little, well, eager) combined with a weakref back to the object for DESTROY convenience gives us what we need. And, marvellously, it dramatically simplifies our object tracking and caching code and removes the need to rebless into a 'managed' class. Now the only reblessing we need to worry about is when a proxy object gets restored.

      This is probably not the place for a Pixie walkthrough though. I'll see about preparing a meditation on some of the tricks we get up to in order to 'just work' for the client.

Re: Are array based objects fixed length?
by hakkr (Chaplain) on Jul 26, 2002 at 12:47 UTC
    You can use pseudo-hashes to get fixed length arrays with no autovivification.
    my $p_hash = [{key => index, key1 => index1}, 'value key', 'value key1'];
    dereferenced by
    $p_hash->[index]; or $p_hash->{key}
    I have just started using them and 'use fields' to get faster object access where I had big hash objects.
      I think you may have missed my point. Pixie is trying to work with whatever object representation is thrown at it; we've already made the choice of which representation(s) to use within Pixie itself.
Re: Are array based objects fixed length?
by cephas (Pilgrim) on Jul 26, 2002 at 13:58 UTC
    Something like this untested code might work:

    my $class = ref($obj);
    my $save  = $class->can('DESTROY');    # may be undef, or inherited
    my $newsub = sub {
        warn("NEW DESTROY CALLED\n");
        goto &$save if $save;              # chain to the original destructor, if any
    };
    {
        no strict 'refs';
        no warnings 'redefine';
        *{ $class . "::DESTROY" } = $newsub;
    }

    Which stashes away the object package's DESTROY method, replaces it with a new one which in turn calls the old one. Hopefully I'm making sense. Let me know if I'm mumbling incoherently.

    cephas
Re: Are array based objects fixed length?
by theorbtwo (Prior) on Jul 26, 2002 at 15:55 UTC

    I smell Attributes in your future. You should be able to set an attribute on the object being persisted that is a Pixie::Object, if I understand attributes correctly (which I may not). (To be a little more particular, I think you want an attribute on the array, not the (blessed) ref to it.)


    Confession: It does an Immortal Body good.

Re: Are array based objects fixed length?
by sauoq (Abbot) on Jul 28, 2002 at 21:23 UTC
    Are you really mussing around in the internals of instances of other people's classes? Have you no respect?! :-)

    Unless you can get your class to read the documentation of those modules it is futzing with and modify its own behaviour based on what it learns, I suggest you reconsider your design.

    Maybe I'm missing something but I don't understand why you aren't just creating containers for the objects that you store rather than modifying the objects themselves. Isn't that how generalized persistence is usually implemented?

    Tacking locking onto that is relatively easy. I used FreezeThaw and BerkeleyDB to do exactly this for in-house use. I've since upgraded it to use Storable.
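
As a rough picture of the container approach, here is a sketch using Storable with an in-memory hash in place of a real database. Demo::Container, store_new, fetch, %db and %locked are all invented for the example and say nothing about how the in-house tool mentioned above was actually built.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my %db;        # oid => frozen bytes; a hypothetical in-memory "database"
my %locked;    # oid => 1 while a container for it is alive

package Demo::Container;
sub new {
    my ($class, $oid) = @_;
    die "object $oid is already locked\n" if $locked{$oid};
    $locked{$oid} = 1;
    return bless { oid => $oid, object => Storable::thaw($db{$oid}) }, $class;
}
sub object  { return $_[0]->{object} }
sub save    { my $self = shift; $db{ $self->{oid} } = Storable::freeze($self->{object}); return }
sub DESTROY { delete $locked{ $_[0]->{oid} } }    # the container, not the object, owns the lock

package main;

sub store_new { my ($oid, $object) = @_; $db{$oid} = freeze($object); return }
sub fetch     { return Demo::Container->new($_[0]) }

store_new(1, bless([ 10, 20 ], 'Some::Array::Class'));
{
    my $box = fetch(1);
    push @{ $box->object }, 30;    # the stored object gains no extra fields
    $box->save;
    print "locked while held: ", ($locked{1} ? 1 : 0), "\n";       # 1
}
print "locked after release: ", ($locked{1} ? 1 : 0), "\n";        # 0
print "elements now stored: ", scalar @{ thaw($db{1}) }, "\n";     # 3
```

The price is that callers hold the container rather than the object itself, which is exactly the transparency trade-off under discussion in this thread.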

    Good luck.

    -sauoq
    "My two cents aren't worth a dime.";
    
      Are you really mussing around in the internals of instances of other people's classes? Have you no respect?! :-)
      Yes, and of course I do. I respect other people enough that I want to provide a tool that just works, for a sufficiently large number of client classes.

      To do this, I have to futz with object internals. It's easy enough with hashes; one can just add a 'safely named' key to the hash and document what you've done, and the vast majority of code will continue to work happily. And if it doesn't, you provide hooks to allow the user to adapt his code so that it will work transparently. (Easy things easy, hard things possible).

      It's less easy with other object representations (that haven't taken advantage of the provided hooks that is). The aim is to make it as easy as possible. I can see ways forward, but I'll have to jump through more hoops to reach my destination. Though thinking about it, I'll probably have to jump through those hoops anyway if I want to support, say, regex based objects...

      Why am I doing this? Well, all the right reasons. Laziness, Impatience and Hubris. I don't like repeating myself, either in building schemas, reimplementing yet another simple minded container, or solving the 'Fetching the World' problem for the nth time. I'm not going to spend my time waiting for someone else to solve my problem. And I'm confident that the approach to this that James and I have cooked up is better than all the other approaches out there.

      Pixie is about reducing the hoopage that a Pixie user has to deal with to make her objects persistent. If that means that we (Pixie's implementors) have to deal with way more hoopage, that's okay; if (when) we do it right, it'll mean that nobody else has to jump through those particular hoops again.

        Yes, and of course I do. I respect other people enough that I want to provide a tool that just works, for a sufficiently large number of client classes.

        If you only care about "a sufficiently large number of client classes" anyway, maybe you should quit with what you have. Most classes are probably implemented as refs to hashes that don't include your 'safely named key', don't rely on the number of key/value pairs they contain, don't iterate through their own keys, don't use closures to privatize data, etc. Pretty much everything from there on out is a special case.

        I noticed you mentioned Smalltalk in a reply to Abigail-II. The tools you described are development tools. They are designed to help a developer look into instance internals. That's a necessity. There is obviously a significant difference between looking at internals during development and changing them during runtime.

        Good luck (and may the OO gods have mercy on you!)

        -sauoq
        "My two cents aren't worth a dime.";
        

Node Type: perlquestion [id://185472]
Approved by broquaint
Front-paged by wil