comment on

Well, I spent some more time thinking about the problem, and I came up with a slightly better algorithm. Instead of a recursive, depth-first approach, I used an iterative, breadth-first algorithm that I arrived at while trying to optimize my previous attempt into a tail-recursive scheme. Code and benchmarks follow.

#!/usr/bin/perl
use strict; use warnings;
use YAML::XS;
use Benchmark qw(cmpthese);
my $data;
{local $/;$data = <DATA>};

cmpthese( -5, {
        recursive => sub {my $d = Load $data; mergekeys_recursive( $d 
+); },
        tail_call => sub {my $d = Load $data; mergekeys_tail( $d ); },
        loop => sub {my $d = Load $data; mergekeys_loop( $d ); },
    }
);

sub mergekeys_recursive
{
    my ($ref) = @_;
    my $type = ref $ref;
    if ($type eq 'HASH')
    {
        my $tmphref = $ref->{'<<'};
        if ($tmphref)
        {
            die "Merge key does not support merging non-hashmaps"
                unless (ref $tmphref eq 'HASH');
            my %tmphash = %$tmphref;
            delete $ref->{'<<'};
            %$ref = (%tmphash, %$ref);
        }
        mergekeys_recursive($_) for (values %$ref);
    }
    elsif ($type eq 'ARRAY')
    {
        mergekeys_recursive($_) for (@$ref);
    }
    return $ref;
}

sub mergekeys_tail
{
    my ($ref) = (@_);
    _mergekeys($ref);
    return $ref;
}
sub _mergekeys
{
    my $ref = shift or return;
    my $type = ref $ref;
    if ($type eq 'HASH')
    {
        my $tmphref = $ref->{'<<'};
        if ($tmphref)
        {
            die "Merge key does not support merging non-hashmaps"
                unless (ref $tmphref eq 'HASH');
            my %tmphash = %$tmphref;
            delete $ref->{'<<'};
            %$ref = (%tmphash, %$ref);
        }
        push @_, grep {ref eq 'HASH' or ref eq 'ARRAY'} values %$ref;
    }
    elsif ($type eq 'ARRAY')
    {
        push @_, grep {ref eq 'HASH' or ref eq 'ARRAY'} @$ref;
    }
    goto &_mergekeys;
}

sub mergekeys_loop
{
    my ($orig) = @_;
    while (my $ref = shift)
    {
        my $type = ref $ref;
        if ($type eq 'HASH')
        {
            my $tmphref = $ref->{'<<'};
            if ($tmphref)
            {
                die "Merge key does not support merging non-hashmaps"
                    unless (ref $tmphref eq 'HASH');
                my %tmphash = %$tmphref;
                delete $ref->{'<<'};
                %$ref = (%tmphash, %$ref);
            }
            push @_, grep {ref eq 'HASH' or ref eq 'ARRAY'} values %$r
+ef;
        }
        elsif ($type eq 'ARRAY')
        {
            push @_, grep {ref eq 'HASH' or ref eq 'ARRAY'} @$ref;
        }
    }
    return $orig;
}
__DATA__
[download]

Data I use for benchmark:

---
key1: &id05
    - &id02
        name: Curly
        lastname: Howard
        hair: no
        occupation: stooge
    - 
        <<: *id02
        name: Larry
        lastname: Fine
        hair: curly
    - 
        <<: *id02
        name: Moe
        hair: bowl
    -
        <<: *id02
        name: Shemp
        hair: yes
        on_again: off_again
key2:
    subkey_a: 
        - foo
        - phoo
        - ghoo
        -
            - [[[[[[[[[[[[[[asdf]]]]]]],1],3]],{whatever: works}]]]]
            - 
                sounds:
                    - &id03
                        voice: hollow
                        says: plugh
                        colors:
                            - red
                            - green
                            - blue
                    -
                        <<: *id03
                        voice: wind
                        says: do you hear what i hear
                    -
                        <<: *id03
                        voice: stooge
                        says:
                            - nyuk
                            - nyuk
                            - nyuk
                        characters: *id05
                            
            - zxcv
    subkey_b: 
        - &id01
            name: bar
            type: variable
            weather: sunny
        - 
            <<: *id01
            name: baz
            hometown: Perth
...
[download]

And the results of the benchmark:

            Rate recursive tail_call      loop
recursive  951/s        --       -6%      -14%
tail_call 1016/s        7%        --       -9%
loop      1111/s       17%        9%        --
[download]

I was surprised that the loop was that much faster than the tail-recursion, since it amounts to about the same thing (go to top of loop, check condition, execute or return). It is important to note, though, that I was only able to replace the goto with a while loop because my function was intended to be run for its side-effects on the data structure. If I needed to accumulate a return value, the goto would have been the better solution.

print map{(split//,'hark, suPerJacent other l')[$_]}(11,7,6,16,5,1,15,18..23,8..10,24,17,0,12,13,3,14,2,4);

In reply to Re: A fix for merge keys in YAML::XS, YAML::Syck by bv
in thread A fix for merge keys in YAML::XS, YAML::Syck by bv

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.