Re^3: Unicode::UCD=charprop and the speed of various keys

If you are going to go with memoization, I encourage you to create a more robust wrapper for Unicode::UCD instead of monkeypatching it globally. Global monkeypatching is fragile because it changes the behavior of a subroutine that other consumers of Unicode::UCD may share. For example, you might be writing module Foo, which uses MyUnicodeUCD, which monkeypatches Unicode::UCD. But you might also be using module Bar from CPAN (names are made up to protect the innocent), and Bar might use Unicode::UCD too. Your monkeypatch would alter Unicode::UCD for all callers, including Bar, which isn't expecting modified behavior.
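To make the leak concrete, here is a minimal sketch. Shared::Lib and its subs are made-up stand-ins for Unicode::UCD (I'm not using the real module here), and Bar is the imaginary CPAN consumer from above. The point is only that a glob assignment in one place changes what every other caller sees.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a shared library such as Unicode::UCD.
package Shared::Lib {
    sub lookup { return "original($_[0])" }
    sub report { return lookup($_[0]) }    # internal call, resolved via the glob
}

# Hypothetical third-party consumer (the "Bar" of the example above).
package Bar {
    sub describe { return Shared::Lib::report($_[0]) }
}

package main;

my $before = Bar::describe('x');

# Global monkeypatch: replace the sub in the shared package for everyone.
{
    no warnings 'redefine';
    *Shared::Lib::lookup = sub { return "patched($_[0])" };
}

# Bar never asked for this, but it now gets the patched behavior.
my $after = Bar::describe('x');

print "before: $before\n";
print "after:  $after\n";
```

Bar::describe returns a different value after the patch even though Bar's own code never changed, which is exactly the surprise you want to avoid.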

Memoization is probably pretty innocuous in this case: you're unlikely to exhaust available memory by memoizing those calls, even if some other consumer of the function exists elsewhere in your code base. But it's not generally a great practice. A better solution could be a package that exposes thin wrapper functions around Unicode::UCD, where each wrapper that ends up invoking the expensive subroutine localizes the typeglob assignment for the duration of its own call. You could do something like this, for example:

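package MyUnicodeUCD;

use strict;
use warnings;
use Memoize;
use Unicode::UCD ();    # load the module; the import happens explicitly below

BEGIN {
    # Import prop_invmap into this package, then memoize this package's copy.
    Unicode::UCD->import('prop_invmap');
    memoize 'prop_invmap';
}

sub charinfo {
    my ($self, $arg) = @_;    # called as a class method: MyUnicodeUCD->charinfo($cp)

    # Swap in the memoized sub only for the dynamic scope of this call.
    local *Unicode::UCD::prop_invmap = \&prop_invmap;
    return Unicode::UCD::charinfo($arg);
}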

With this strategy the memoized sub is injected into Unicode::UCD only for the duration of the call to charinfo, and Unicode::UCD reverts to its original behavior when charinfo's scope ends. You may want to do this for each sub in Unicode::UCD that uses prop_invmap; there are several. The remaining functions need no wrapper: MyUnicodeUCD can just import the originals from Unicode::UCD into its own namespace.
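If you end up wrapping several subs, you could generate the wrappers in a loop rather than writing each one by hand. The sketch below is untested against Unicode::UCD itself: Slow::Lib, My::Wrapper, and the sub names info, props, and expensive are all hypothetical stand-ins, with expensive playing the role of prop_invmap. It also shows that direct callers of the library keep the unmemoized behavior.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Unicode::UCD: two public subs share one
# expensive helper, and $calls counts how often the helper really runs.
package Slow::Lib {
    our $calls = 0;
    sub expensive { $calls++; return "data for $_[0]" }
    sub info  { return expensive($_[0]) }
    sub props { return expensive($_[0]) }
}

# Hypothetical wrapper package (the MyUnicodeUCD of this thread).
package My::Wrapper {
    use Memoize;

    BEGIN {
        # Alias the helper into this package and memoize only our alias.
        *expensive = \&Slow::Lib::expensive;
        memoize('expensive');
    }

    # Generate one wrapper per public sub that relies on the helper.
    for my $name (qw(info props)) {
        no strict 'refs';
        my $orig = \&{"Slow::Lib::$name"};
        *{"My::Wrapper::$name"} = sub {
            # The memoized helper is visible only while $orig is running.
            local *Slow::Lib::expensive = \&expensive;
            return $orig->(@_);
        };
    }
}

package main;

my $first         = My::Wrapper::info('A');     # runs the helper once
my $second        = My::Wrapper::props('A');    # served from the cache
my $wrapped_calls = $Slow::Lib::calls;

my $direct      = Slow::Lib::info('A');         # bypasses the wrapper
my $total_calls = $Slow::Lib::calls;

print "helper calls via wrapper: $wrapped_calls\n";
print "helper calls in total:    $total_calls\n";
```

The two wrapped calls share one cache entry, while the direct call to Slow::Lib::info runs the helper again because the glob has already been restored by then.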

All of this is still a little fragile, as it depends on nothing really changing in the interface for Unicode::UCD, nor in the implementation of functions that call prop_invmap. But for a specific use case, it could be just fine.

Dave
