PerlMonks  

Ending up with duplicate keys in a hash?

by coffeemaster1 (Initiate)
on Jan 10, 2016 at 03:32 UTC ([id://1152414]=perlquestion)

coffeemaster1 has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys,

I recently made a user account, but I have been using Perl for a while and always come here for help. I am currently working on a novelty project, mostly for myself, where I am using the Asterisk AMI to read events and essentially create an ASCII/Perl-based contact center. Most of this is going well, but I noticed something very odd while debugging earlier today. I am using threads as well as threads::shared for the event parsing and main hash updating, and I can't tell exactly what happened here. The steps that produce the data I am going to paste are as follows:

1). Event is read from the Asterisk AMI socket as an array:
while (1) {
    my @event = $astman->read_response;
    chomp(@event);
2). It is then categorized/sanitized in a series of essentially inconsequential conditionals and is converted into a hash for easier parsing (at least by my current method) later:
lock(%queue_changes);
$event_id = int(rand(25000));
%type = &event_hasher(\@event);

event_hasher:

sub event_hasher {
    my @cur_event = @{$_[0]};
    my %event_hash;
    foreach (@cur_event) {
        $_ =~ s/://g;
        my @place_holder = split / /, $_;
        $event_hash{$place_holder[0]} = $place_holder[1];
    }
    return %event_hash;
}
3). It is then put in a shared hash which consists of changes and is then read by the main thread which will update the changes:
$queue_changes{$event_id} = %type;
It seems that somewhere between the event being "hashed" and the newly "hashed" event being put into the hash of changes, some of the event hash keys were duplicated (at least according to Data::Dumper) as you can see below (removing some personal details): Initial Event:
Sat Jan 9 21:10:50 2016 1452391850.044984517 Event:
$VAR1 = [
          'Event: QueueMemberStatus',
          'Privilege: agent,all',
          'Queue: test-q',
          'Location: SIP/EXTENSION-XXXX',
          'MemberName: SIP/EXTENSION-XXXX',
          'Membership: dynamic',
          'Penalty: 0',
          'CallsTaken: 1',
          'LastCall: 1452391850',
          'Status: 1',
          'Paused: 0'
        ];
"Hashed" Event:
Sat Jan 9 21:10:52 2016 1452391852.284037579 Hashed Version:
$VAR1 = {
          'Status' => '1',
          'Queue' => 'test-q',
          'LastCall' => '1452391850',
          'CallsTaken' => '1',
          'MemberName' => 'SIP/EXTENSION-XXXX',
          'Location' => 'SIP/EXTENSION-XXXX',
          'Event' => 'QueueMemberStatus',
          'Privilege' => 'agent,all',
          'Paused' => '0',
          'Membership' => 'dynamic'
        };
"Queue Changes" Hash right after this event is added:
Queue Changes after Event Hashed:
$VAR1 = {
          '9555' => {
                      'Event' => 'QueueMemberStatus',
                      'Privilege' => 'agent,all',
                      'Status' => '1',
                      'Queue' => 'test-q',
                      'Membership' => 'dynamic',
                      'Paused' => '0',
                      'MemberName' => 'SIP/EXTENSION-XXXX',
                      'CallsTaken' => '1',
                      'LastCall' => '1452391850',
                      'Location' => 'SIP/EXTENSION-XXXX'
                    }
        };
This is where things get a little weird. This is the same variable from the perspective of the main thread, right after it detected a change to the "Queue Changes" hash and went to update the main hash of agents:
Sat Jan 9 21:10:52 2016 1452391852.286708251 About to make Change:
$VAR1 = {
          '9555' => {
                      'Event' => 'QueueMemberStatus',
                      'Queue' => 'test-q',
                      'MemberName' => 'SIP/EXTENSION-XXXX',
                      'LastCall' => '1452391850',
                      'Event' => 'QueueMemberStatus',
                      'Privilege' => 'agent,all',
                      'Status' => '1',
                      'Queue' => 'test-q',
                      'Membership' => 'dynamic',
                      'Paused' => '0',
                      'MemberName' => 'SIP/EXTENSION-XXXX',
                      'CallsTaken' => '1',
                      'LastCall' => '1452391850',
                      'Location' => 'SIP/EXTENSION-XXXX'
                    }
        };
You can see there are multiple duplicate keys here and I wasn't sure if this was just Data::Dumper printing things out weird as this is a shared variable, or it was actually like this. Is this possible in Perl? I assumed that if it were to happen the "second" key/value pair would overwrite the first. Could this be caused by the use of a shared variable that is being updated in one thread and read in another? (I am using locks when reading and writing to the variable). Just wanted to see if anyone had any insight into what could be going on here. Also, if needed:
perl -v

This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

Copyright 1987-2009, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
Thanks!

Replies are listed 'Best First'.
Re: Ending up with duplicate keys in a hash?
by choroba (Cardinal) on Jan 10, 2016 at 10:08 UTC
    Can you verify there are no hidden characters in the hash keys?
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Data::Dumper;

    my %h = (
        "Event"   => 1,
        "Event\0" => 2,
    );
    print Dumper \%h; # Oops!
    $Data::Dumper::Useqq = 1;
    print Dumper \%h;
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
        Yep Data::Dumper is a good library (thanks) :)

        But you can also use Data::Printer, which can colorize its output, limit the depth level, set the indent size, produce multiline output...
        There are a lot of options.

        https://metacpan.org/pod/Data::Printer

        Use p instead of print/say.
        p $href;
Re: Ending up with duplicate keys in a hash?
by Athanasius (Archbishop) on Jan 10, 2016 at 06:27 UTC

    Hello coffeemaster1, and welcome to the Monastery!

    You can see there are multiple duplicate keys here and I wasn't sure if this was just Data::Dumper printing things out weird as this is a shared variable, or it was actually like this. Is this possible in Perl?

    No, absolutely not! It’s a sign that something in your code is seriously broken. :-(

    I assumed that if it were to happen the "second" key/value pair would overwrite the first.

    Yes, that’s exactly what does happen:

    15:57 >perl -MData::Dumper -wE "my %h = (a => 7, b => 13); $h{b} = 42; print Dumper(\%h);"
    $VAR1 = {
              'b' => 42,
              'a' => 7
            };

    15:57 >

    Could this be caused by the use of a shared variable that is being updated in one thread and read in another? (I am using locks when reading and writing to the variable).

    You are aware that the Perl lock command is advisory only? In your code you lock(%queue_changes);, but you don’t show any corresponding calls to cond_wait, cond_timedwait, cond_signal, or cond_broadcast to make use of the lock. (See threads::shared.)
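A minimal sketch of what "advisory" means in practice, assuming a hypothetical producer thread that stands in for the AMI reader (the event names are made up). lock() only excludes other threads that also call lock() on the same variable, so the reading side has to take the lock too. cond_wait and cond_signal add signaling on top of that, so the reader can sleep until there is something to process instead of polling:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %queue_changes :shared;

# Hypothetical producer: stands in for the thread reading AMI events.
my $producer = threads->create(sub {
    for my $id (1 .. 3) {
        lock(%queue_changes);                  # released at the end of this block
        $queue_changes{$id} = shared_clone({ Event => "Demo$id" });
        cond_signal(%queue_changes);           # wake the consumer if it is waiting
    }
});

# Consumer (the "main thread" in the question): it must lock as well.
my @seen;
while (@seen < 3) {
    lock(%queue_changes);
    # cond_wait releases the lock while sleeping and re-acquires it on wake-up.
    cond_wait(%queue_changes) until keys %queue_changes;
    for my $id (sort { $a <=> $b } keys %queue_changes) {
        push @seen, $queue_changes{$id}{Event};
        delete $queue_changes{$id};
    }
}
$producer->join;

print "@seen\n";
```

If either side skips its lock() call, nothing stops it from touching %queue_changes while the other side is in the middle of an update.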

    You should provide a short, self-contained script which exhibits the anomalous behaviour you’re observing. This will help the monks to help you. But it’s likely that the exercise of producing this script will itself enlighten you as to what’s going wrong in your code.

    BTW, this line is wrong:

    $queue_changes{$event_id} = %type;

    as it puts %type into scalar context and thereby assigns to $queue_changes{$event_id} a fraction such as 6/8 [1]. I assume you meant:

    $queue_changes{$event_id} = \%type;
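A small sketch of the difference, using a plain (unshared) hash and made-up keys "wrong" and "right" purely for illustration. The exact scalar-context value depends on the Perl version, so the example only checks whether a reference was stored:

```perl
use strict;
use warnings;

my %type = (Event => 'QueueMemberStatus', Queue => 'test-q');
my %queue_changes;

# Scalar assignment evaluates %type in scalar context: the result is a
# plain string or number describing the hash, not its contents.
$queue_changes{wrong} = %type;

# A backslash takes a reference, so the keys and values stay reachable.
$queue_changes{right} = \%type;

print ref($queue_changes{wrong}) ? "reference\n" : "not a reference\n";
print "$queue_changes{right}{Queue}\n";
```

Note that \%type points at the same hash rather than a copy, so reusing %type for the next event would change what the stored reference shows; { %type } would store a shallow copy instead.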

    Another point: this code:

    $event_id = int(rand(25000));

    is not guaranteed to produce unique event IDs. For that, you would need to keep track of all the IDs already allocated and re-calculate whenever a new ID duplicates an existing one.
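The bookkeeping described above could look roughly like this. It is a single-threaded sketch: the names allocate_event_id, release_event_id and %allocated are invented for the example, and in the threaded program from the question %allocated would have to be shared and locked.

```perl
use strict;
use warnings;

my $ID_SPACE = 25_000;   # same range as int(rand(25000)) in the question
my %allocated;           # IDs currently in use

# Draw random IDs until one is found that is not already allocated.
sub allocate_event_id {
    die "no free event IDs left\n" if keys(%allocated) >= $ID_SPACE;
    my $id;
    do { $id = int(rand($ID_SPACE)) } while exists $allocated{$id};
    $allocated{$id} = 1;
    return $id;
}

# Give an ID back once its event has been processed.
sub release_event_id {
    my ($id) = @_;
    delete $allocated{$id};
    return;
}

my @ids = map { allocate_event_id() } 1 .. 1000;
print scalar(@ids), " IDs allocated\n";
```

The retry loop gets slower as the ID space fills up, which is one reason a simple counter is often preferred.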

    [1] The fraction denotes the number of buckets used in the hash divided by the number of buckets allocated.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Ending up with duplicate keys in a hash?
by betterworld (Curate) on Jan 10, 2016 at 22:56 UTC

    Just a little detail I noticed... I am not sure if this is relevant or maybe even intended, but your event_hasher function has non-obvious side effects:

    $_ =~ s/://g;

    This will modify the event data structure because $_ is a writable alias into @event.

      I noticed that too, and I am not sure if it is relevant, but I have been stung by something similar in the past. I would rewrite the subroutine as:

      my $type = event_hasher(\@event);

      sub event_hasher {
          my $cur_event = shift;
          my %event_hash;
          for my $item (@$cur_event) {
              $item =~ s/://g;
              my ($key,$value) = split / /, $item;
              $event_hash{$key} = $value;
          }
          return \%event_hash;
      }

      $queue_changes{$event_id} = $type;

      Update:
      As choroba points out below, this doesn't solve the problem; in fact, it creates it! You need to make a copy of the input, but your original code already does that, so the aliasing is not the problem there. Using choroba's test with your subroutine:

      sub event_hasher {
          my @cur_event = @{$_[0]};
          my %event_hash;
          foreach (@cur_event) {
              $_ =~ s/://g;
              my @place_holder = split / /, $_;
              $event_hash{$place_holder[0]} = $place_holder[1];
          }
          return %event_hash;
      }

      my @event = ('a:b c:d');
      print "@event\n";
      my %type = event_hasher(\@event);
      print "@event\n";

      # prints:
      # a:b c:d
      # a:b c:d
        The problem remains: $item is still an alias to the members of the @event.
        #!/usr/bin/perl
        use warnings;
        use strict;

        sub event_hasher {
            my $cur_event = shift;
            my %event_hash;
            for my $item (@$cur_event) {
                $item =~ s/://g;
                my ($key, $value) = split / /, $item;
                $event_hash{$key} = $value;
            }
            return \%event_hash
        }

        my @event = ('a:b c:d');
        print "@event\n";
        my $type = event_hasher(\@event);
        print "@event\n";
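One way to sidestep the aliasing question entirely is to never modify the loop variable. The sketch below is not the code from the question: it splits each line on the first colon instead of stripping colons and splitting on a space, so values containing spaces or colons are kept whole, which is a change in behavior.

```perl
use strict;
use warnings;

sub event_hasher {
    my ($cur_event) = @_;
    my %event_hash;
    for my $line (@$cur_event) {
        # split returns new strings; $line (an alias into the caller's
        # array) is only read, never written.
        my ($key, $value) = split /:\s*/, $line, 2;
        next unless defined $key && length $key;
        $event_hash{$key} = $value;
    }
    return \%event_hash;
}

my @event = ('Event: QueueMemberStatus', 'Queue: test-q', 'Location: SIP/EXTENSION-XXXX');
my $type  = event_hasher(\@event);

print "$event[0]\n";      # Event: QueueMemberStatus (input left untouched)
print "$type->{Queue}\n"; # test-q
```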
Re: Ending up with duplicate keys in a hash?
by Anonymous Monk on Jan 10, 2016 at 17:48 UTC

    How did you manage to insert the event into %queue_changes? Perl ought to die with "Invalid value for shared scalar" when you try to assign an unshared reference. Please try the following:

    $queue_changes{$event_id} = shared_clone(\%type);
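A short sketch of both cases, assuming a threads-enabled perl and using made-up event IDs 1 and 2. The first assignment is wrapped in eval so the script can report the error instead of dying:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %queue_changes :shared;
my %type = (Event => 'QueueMemberStatus', Queue => 'test-q');

# Storing a reference to an unshared hash in a shared hash is a fatal error.
my $ok    = eval { $queue_changes{1} = \%type; 1 };
my $error = $ok ? '' : $@;
print "plain reference failed: $error" unless $ok;

# shared_clone() makes a deep, shared copy that can be stored safely.
{
    lock(%queue_changes);
    $queue_changes{2} = shared_clone(\%type);
}
print "stored Queue: $queue_changes{2}{Queue}\n";
```

Because shared_clone copies the data, later changes to %type do not show up in the stored event.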

Re: Ending up with duplicate keys in a hash?
by hotchiwawa (Scribe) on Jan 10, 2016 at 14:11 UTC
    As said above...
    $event_id = int(rand(25000));
    is not guaranteed to produce unique event IDs.

    I agree :)

    You could use a function that takes a lock, increments a shared variable, and returns its new value.
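Such a function might look like the sketch below. The names next_event_id, $next_event_id and worker are invented for the example, and the four worker threads exist only to show that no ID is handed out twice:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $next_event_id :shared = 0;
my @ids :shared;

# Returns a new ID on every call; the lock makes increment-and-read atomic
# with respect to other threads calling this function.
sub next_event_id {
    lock($next_event_id);
    return ++$next_event_id;
}

sub worker {
    for (1 .. 100) {
        my $id = next_event_id();
        lock(@ids);               # released at the end of each iteration
        push @ids, $id;
    }
    return;
}

my @workers;
push @workers, threads->create(\&worker) for 1 .. 4;
$_->join for @workers;

my %unique = map { $_ => 1 } @ids;
print scalar(keys %unique), " unique IDs out of ", scalar(@ids), "\n";
```

Unlike int(rand(25000)), the counter never repeats, and it needs no record of which IDs are already in use.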
