can I change hash keys or values directly

by misterperl (Pilgrim)
on Jan 28, 2021 at 21:06 UTC ( [id://11127604]=perlquestion )

misterperl has asked for the wisdom of the Perl Monks concerning the following question:

Hello kind monks. I have a case where some of my hash keys need adjusting. I thought perhaps:
map s/thesekeys/thosekeys/,keys %h;
... would be a nice approach, but no joy. I guess that "keys %h" is an anonymous array copy of the keys, not the actual keys, because they were unchanged. I never realized that.

Of course I can do this in a loop, and I even managed to do it with a slice followed by a delete. So I'm not asking HOW to accomplish the task - as Larry told me (over and over as I recall) at the OSCONs- there are MANY ways... You know the rest :)

I guess what I'm really asking is, is there a perlVar or function that is the TRUE array of hash key and one of values that,when changed, changes the keys or values?

I guess I see a dilemma with this, since adjusting keys may make the array of keys shorter, and it may not be deterministic how to adjust the values. So this might not even be practical. Although being able to map VALUES does seem appealing and practical. I'm interested in what you expert Monks think?

TY & Blessed Be

Replies are listed 'Best First'.
Re: can I change hash keys or values directly
by choroba (Cardinal) on Jan 28, 2021 at 21:34 UTC
    There's no array of keys or values. There's a list of keys and list of values, and the list of values in fact contains the values themselves, so you can modify them, e.g.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    my %hash = (key1 => 'val1', key2 => 'val2');
    s/val/V/ for values %hash;
    say "$_ => $hash{$_}" for keys %hash;

    You can't modify the keys in this way, though (and both facts are documented in keys and values). That's because you can't change a key in a hash: you need to remove the old one and create a new one, because after a key change, the value is most probably going to be stored in a different place.

    Using a slice is probably the best you can do.

    my @old = keys %hash;
    @hash{map uc, @old} = delete @hash{@old};
    Be sure to keep the keys unique after the modification!
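The uniqueness warning can be turned into a check. What follows is only a sketch of the slice approach: the helper name rename_keys and the decision to die on a collision are assumptions made for illustration, not something from the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: rename every key of %$hash through $transform,
# refusing to proceed if two old keys would map to the same new key.
sub rename_keys {
    my ($hash, $transform) = @_;
    my @old = keys %$hash;
    my @new = map { $transform->($_) } @old;

    # Check for collisions before touching the hash.
    my %seen;
    for my $key (@new) {
        die "duplicate key after rename: $key\n" if $seen{$key}++;
    }

    # delete on a slice returns the removed values in the same order as @old.
    @{$hash}{@new} = delete @{$hash}{@old};
    return $hash;
}

my %hash = (key1 => 'val1', key2 => 'val2');
rename_keys(\%hash, sub { uc $_[0] });
print join(', ', map { "$_ => $hash{$_}" } sort keys %hash), "\n";
# prints: KEY1 => val1, KEY2 => val2
```

Because the check runs before the delete, a hash with colliding new keys is left untouched when the helper dies.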

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: can I change hash keys or values directly
by GrandFather (Saint) on Jan 28, 2021 at 21:33 UTC

    Without digging into the perl code I can't be sure (although it's likely someone will know for sure), but that isn't really how a hash works. In any implementation I'm aware of there is no explicit list of key values and any equivalent of an in place edit of the keys would require about as much work under the hood as your slice/delete approach. See Hash_table to get an idea of what is going on under the hood.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: can I change hash keys or values directly
by tobyink (Canon) on Jan 30, 2021 at 21:48 UTC

    Can't do it with built-ins, but it's easy to create a function that can do it.

    #!perl

    use strict;
    use warnings;

    BEGIN {
        package Hash::Adjust;
        use Exporter::Shiny 'ha';
        sub _generate_ha {
            my ( $me, $name, undef, $opt ) = ( shift, @_ );
            my ( $aref, $bref ) = do {
                no strict 'refs';
                my $caller = $opt->{into};
                ( \${"$caller\::a"}, \${"$caller\::b"} );
            };
            return sub (&\%) {
                my ( $code, $hashref ) = ( @_ );
                @_ = ();
                my %tmp;
                while ( ( $$aref, $$bref ) = each %$hashref ) {
                    &$code;
                    $tmp{$$aref} = $$bref if defined $$aref;
                }
                %$hashref = %tmp;
            };
        }
    };

    use Hash::Adjust 'ha';
    use Data::Dumper;

    my %hash = ( FOO => 42, BAR => 666 );

    ha { $a =~ s/O/o/g; ++$b; } %hash;

    print Dumper \%hash;

    The ha function is called like ha { CODE } %hash and modifies the keys and values of the hash. The code gets the key in $a and the value in $b; the same special global variables used by sort. If the code block sets $a = undef then that key-value pair will be removed from the hash.
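For comparison, here is a dependency-free sketch of the same rebuild-into-a-temporary-hash idea. It is not tobyink's implementation: the name adjust_hash is hypothetical, and the key and value are passed as ordinary arguments instead of through $a and $b.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild %$hash from the (key, value) pairs returned
# by $code; returning an empty list (or an undefined key) drops the pair.
sub adjust_hash {
    my ($hash, $code) = @_;
    my %tmp;
    for my $key (keys %$hash) {
        my ($new_key, $new_value) = $code->($key, $hash->{$key});
        $tmp{$new_key} = $new_value if defined $new_key;
    }
    %$hash = %tmp;
    return $hash;
}

my %hash = (FOO => 42, BAR => 666);
adjust_hash(\%hash, sub {
    my ($key, $value) = @_;
    (my $new_key = $key) =~ s/O/o/g;
    return ($new_key, $value + 1);
});
print "$_ => $hash{$_}\n" for sort keys %hash;
# prints:
# BAR => 667
# Foo => 43
```

As with the slice approach, two old keys that map to the same new key silently overwrite each other here.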

Re: can I change hash keys or values directly
by AnomalousMonk (Archbishop) on Jan 29, 2021 at 00:53 UTC
    ... is there a perlVar or function that is the TRUE array of hash key ... that,when changed, changes the keys ...

    Others have answered this in the negative and explained why it can't be done. In support, consider that a hash (or associative array) key is an index of the array. Would you expect that in a positional array @ra there would be some way to directly operate on the index of element $ra[1] (i.e., 1) to change it to, say, 42 and expect the element to change its position in the array correspondingly? I suppose this kind of operation is conceivable, but the fact that it doesn't exist in any language (AFAIK) suggests its inutility. (Update: I'd be interested to know if there's a language with an operator like this for either type of array.)


    Give a man a fish:  <%-{-{-{-<

      Don't say that too loud or you'll be seeing it in PHP!

      Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      > I'd be interested to know if there's a language with an operator like this for either type of array.)

      Short: alists in Lisp allow this

      Long: Dunno if there is an "operator", but Lisp has various approaches to associative arrays and alists are the most basic. They are basically just lists of "pairs" with linear search, hence slow.

      Disclaimer: my Lisp is eLisp
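To make the alist idea concrete in Perl terms, here is a rough sketch of an association list as an array of [key, value] pairs. This is an analogy, not Lisp semantics, and the name alist_get is made up for the example. Because the keys are ordinary array elements, they can be edited in place, at the cost of a linear search on lookup.

```perl
use strict;
use warnings;

# An association list: an array of [key, value] pairs, searched linearly.
my @alist = ( [ FOO => 42 ], [ BAR => 666 ] );

# Keys are ordinary array elements here, so they can be edited in place.
$_->[0] =~ s/O/o/g for @alist;

# Lookup is a linear scan, which is why this is slow for large tables.
sub alist_get {
    my ($alist, $key) = @_;
    for my $pair (@$alist) {
        return $pair->[1] if $pair->[0] eq $key;
    }
    return;
}

print alist_get(\@alist, 'Foo'), "\n";   # prints 42
```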

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

      PS: obligatory Xkcd ;)

Re: can I change hash keys or values directly
by jwkrahn (Abbot) on Jan 28, 2021 at 21:35 UTC
    $ perl -le'my %x = "A" .. "F"; print "@{[ %x ]}"; $_ = lc for %x; print "@{[ %x ]}";'
    C D A B E F
    C d A b E f

    So, no, you can't modify hash keys directly.

    However, you can do this:

    $ perl -le'my %x = "A" .. "F"; print "@{[ %x ]}"; $x{ lc() } = delete $x{ $_ } for keys %x; print "@{[ %x ]}";'
    E F A B C D
    c D a B e F
Re: can I change hash keys or values directly
by siberia-man (Friar) on Jan 29, 2021 at 11:39 UTC
    I came to this thread to give my answer to your question. After reading the responses given by other monks, I found that exactly the same answer has already been given by choroba. The code below is just an example of the suggestion by choroba and me.
    #!/usr/bin/env perl

    use strict;
    use warnings;

    use Data::Dumper;

    my $x = { a => [ 1, 2 ], c => [ qw( a b c ) ], d => 100 };
    print 'original: ', Dumper $x;

    # one by one replacement: c -> b, d -> c
    $x->{b} = delete $x->{c};
    $x->{c} = delete $x->{d};
    print 'modified: ', Dumper $x;

    # in one run replacement: b, c -> c, d
    @$x{qw( c d )} = delete @$x{qw( b c )};
    print 'reverted: ', Dumper $x;
      I appreciate the replies, guys. I guess I was already on the best approach using the slice. The link to "values" was helpful, and it's good to know that they can be operated on directly. I tried it
      map s/(\d+)/$1+1/e, values %nums;
      which worked great. Love it. Helpful & useful. TYVM

      If I'm reading that right, it does appear from that link that as an experiment, perhaps keys also worked that way for some brief Perl versions:

      use 5.012; # so keys/values/each work on arrays
        > perhaps keys also worked that way

        No, it means you can write

        #!/usr/bin/perl
        use warnings;
        use strict;
        use feature qw{ say };

        use Syntax::Construct qw{ keys-array };  # More explicit than 5.012.

        my @array = qw( abc def );
        my @keys = keys @array;      # 0, 1
        my @values = values @array;  # abc, def

        while (my ($index, $value) = each @array) {
            say "$index $value";  # 0 abc, 1 def
        }

        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: can I change hash keys or values directly
by Marshall (Canon) on Jan 30, 2021 at 03:20 UTC
    What you want to do is actually fairly straight-forward.
    For each existing key, see if there is a new name for that key.
    If there is, then delete the current hash entry and create a new entry with the new name and with the current value.

    Here is some simple example code.
    There can be complications with this, like what happens if the new key name already exists?
    Also be aware that hash keys are presented in random order.

    Anyway, modify this code to meet your requirements and to take into account what I mentioned.

    use strict;
    use warnings;
    use Data::Dump qw(pp);
    $|=1;

    my %hash = ( 'a' => 2, 'b' => 3, 'c'=> 4, 'd' => 5);
    my %key_map = ('a'=> 'abc', 'd'=> 'def');

    pp \%hash;
    pp \%key_map;

    ############# added code #########
    print "Perl size of 'hash' ", scalar(%hash),"\n";

    foreach my $key (keys %hash)
    {
        if (defined (my $new_key = $key_map{$key}))
        {
            print "New key for $key is $new_key\n";
            my $current_value = $hash{$key};
            delete $hash{$key};
            $hash{$new_key} = $current_value;
        }
    }
    print "result of hash mods:\n";
    pp \%hash;

    __END__
    Prints:
    { a => 2, b => 3, c => 4, d => 5 }
    { a => "abc", d => "def" }
    Perl size of 'hash' 3/8
    New key for d is def
    New key for a is abc
    result of hash mods:
    { abc => 2, b => 3, c => 4, def => 5 }
    Update: It may not be obvious to you, but changing the name of a single hash key (whether longer or even shorter) could potentially cause Perl's internal representation of the hash table to be completely recalculated and reorganized. In general as a Perl'er, don't worry about it. In the internal "guts", the Perl code that deals with hashes is written in C and I must add that it is quite efficient at what it does and how it does it.

    Update2: Every map statement IS a loop! It may not look that way to you, but that is what it is. Shorter source code does not necessarily equate with shorter or even more efficient executed code.

    Update3: Added scalar value of a %hash to the code.

Re: can I change hash keys or values directly
by Marshall (Canon) on Feb 01, 2021 at 00:33 UTC
    Hi misterperl,
    I don't think that my previous "how to do it" post in this thread adequately answered your question.
    You wrote: "I guess what I'm really asking is, is there a perlVar or function that is the TRUE array of hash key and one of values that,when changed, changes the keys or values?"
    The answer is "no". There is no TRUE array of hash keys.

    I will attempt to give a "short" answer to your "why not?" question. There may be some slight technical inaccuracies in pursuit of brevity. Also I will need some "C lingo" to explain what a Perl hash table actually "is", "under the covers".

    The first step in the hashing process is to calculate what is called the "binary hash value". This is a single unsigned binary number. Each character in the "ASCII hash key" is fed into essentially an equation and as I said, the end result of this calculation is a single unsigned binary number. This calculation is designed to be very fast (and it is). This is not a simple "checksum" - it is more analogous to a CRC calculation.

    A Perl hash starts out with 8 of what are called "buckets". Thus, the bucket array starts out at length 8. Each element of this bucket array is either not used (NULL pointer) or is a C pointer to a linked list of C structures. The C structure will contain things like the actual full ASCII key name that you see as the user of the hash and the user accessible value of that key. I don't remember if it contains the originally calculated "binary hash value" for each particular ASCII key string or not.

    Ok, to use this newly created hash with 8 "buckets", Perl uses only the least significant 3 bits of the "binary hash value" (000-111 binary). This is used as an array index into the bucket array. So when you write something like $hash{'user'}='XXZZY';, Perl calculates the "binary hash value" for "user", and that value is "masked" so that only the least significant 3 bits are used. Perl looks at the "C linked list" associated with that "bucket". If there is no list there already, a new linked list is started with this new value. What happens if there is already a linked list at that array index? Perl has to look at each element of this C linked list to make sure that it is not modifying something that is already there. If an element with hash key "user" is not there already, it is added to the end of the linked list.

    If the hash structure stayed at only 8 "hash buckets" that wouldn't be very useful for a hash table with say 1,000 entries! So Perl has an algorithm to decide when to double the size of the hash. When the hash size doubles, Perl creates a new "bucket array" of say 16 values and then uses the least significant 4 bits of the "binary hash value" as an index into that new C array.

    Every element of the original 8 bucket hash has to be re-visited to see what new bucket it falls into now! Once an additional bit of the "binary hash value" is exposed and used, that changes things!
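The masking step can be illustrated with a toy model. The hash function below is NOT Perl's real one (the real function is more elaborate and randomized per run). The names toy_hash and bucket_index are invented for this sketch, which only shows how the low bits select a bucket and how exposing one more bit can move a key.

```perl
use strict;
use warnings;

# Toy string hash (NOT Perl's real hash function): multiply-and-add,
# kept within 32 bits.
sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h = ($h * 33 + ord $_) & 0xFFFFFFFF for split //, $key;
    return $h;
}

# The bucket index is the low bits of the hash value; $buckets must be
# a power of two so that ($buckets - 1) works as a bit mask.
sub bucket_index {
    my ($key, $buckets) = @_;
    return toy_hash($key) & ($buckets - 1);
}

# Compare where two similar keys land with 8 and with 16 buckets.
for my $key (qw( user users )) {
    printf "%-6s 8 buckets: %d   16 buckets: %d\n",
        $key, bucket_index($key, 8), bucket_index($key, 16);
}
```

In this model, a key's index with 16 buckets is either its old 8-bucket index or that index plus 8, depending on the newly exposed bit. That is why every entry has to be revisited when the table doubles.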

    If you are "with me" so far, making a hash key "a shorter ASCII string" will not necessarily decrease the size of the internal hash. It may even cause the size of the hash to double! This new "shorter key" has to be moved to the "right bucket". If that potentially different bucket is "over used", Perl may double the size of the hash to take care of that.

    In modern Perl versions, the calculation of the "binary hash value" has a "per run" random fudge factor. This was done to prevent some security issues with having a repeatable "binary hash value". I don't understand all of the security issues, but this is what it is.

    The "usage of a Perl hash" is accessible by the scalar value of the hash. This is a string, like "3/8" or "2/8". This means that either 3 or 2 buckets are used out of 8 possible buckets. I have updated my "how to do it" post with code that shows that. I get randomly either 3 or 2 on multiple successive runs of the same code. This is as expected.
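One caveat for readers on newer Perls: from Perl 5.26 onwards, a hash in scalar context returns the number of keys instead of the "used/total" string, and the bucket statistic is available through Hash::Util::bucket_ratio. A minimal sketch, assuming Perl 5.26 or later:

```perl
use strict;
use warnings;
use Hash::Util qw( bucket_ratio );   # available from Perl 5.26 onwards

my %hash = ( a => 2, b => 3, c => 4, d => 5 );

# On Perl 5.26+, a hash in scalar context yields the number of keys.
print "scalar(%hash): ", scalar(%hash), "\n";

# bucket_ratio() returns the "used/total" string, e.g. "3/8".
print "bucket_ratio:  ", bucket_ratio(%hash), "\n";
```

The exact ratio varies between runs because of the per-run hash randomization.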

    It is possible to "pre-size" a Perl hash with keys(%hash) = 1000; That will round the number of "buckets" up to 1024 (the next power of 2). This prevents overhead associated with the hash doubling, 8,16,32...1024. However, in my testing, this doesn't matter at all provided that you are doing something significant with the hash once it is created. I was working with hash sizes of 100K entries and above. Perl is very efficient in how it manages its internal hash tables. Just use the features of the language!
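For reference, pre-sizing is done by assigning to keys as an lvalue, as documented under keys in perlfunc. A small sketch follows; the figure 1000 and the ten sample entries are arbitrary values chosen for the example.

```perl
use strict;
use warnings;

my %hash;
keys(%hash) = 1000;   # request at least 1000 buckets; Perl rounds up to a power of two

# The hash is still empty; only the bucket array has been allocated.
$hash{$_} = 1 for 1 .. 10;
print scalar(keys %hash), " keys stored\n";   # prints: 10 keys stored
```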

    Note/Update 1: Doubling the hash size is actually not a significant memory use. This only doubles the array of pointers to C's linked list of structures. But in order to do this, Perl has to revisit each element in the hash and decide where it needs to go next. However as I mentioned before, Perl is very, very efficient about how it does a "doubling of the internal hash size".

    Again, there is no "TRUE list" of hash keys. To generate keys %hash, Perl has to traverse the entire internal hash table and return a list of the results to you. But again, Perl does this very, very quickly.

    I post this as a separate response instead of an update to my original response because the content and subject matter are completely different.

      > Again, there is no "TRUE list" of hash keys. To generate keys %hash, Perl has to traverse the entire internal hash table and return a list of the results to you. But again, Perl does this very, very quickly.

      Sorry, but from my understanding this is incorrect.

      There is a Linked List with keys and values in a fixed randomized order which is used by iterators like each , as well as keys and values

      There is also a C-Array of "Buckets" for quick look up via hash-function, and these "Buckets" also point to a Linked List with all entries having the same hash-value aka "Collisions".

      But this array is not traversed to generate the output for keys

      UPDATE

      Sorry ... looks like I had an incorrect or outdated source

      according to https://www.cpan.org/authors/id/G/GA/GAAS/illguts-0.09.pdf

      * RITER, EITER: The first two fields are used to implement a single iterator over the elements in the hash. RITER which is an integer index into the array referenced by ARRAY and EITER which is a pointer to an HE. In order find the next hash element one would first look at EITER->next and if it turns out to be NULL, RITER is incremented until ARRAY[RITER] is non-NULL. The iterator starts out with RITER = -1 and EITER = NULL.

      and I'm not sure if this is still up to date, since new security requirements led to more randomization

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        Hi Rolf!

        I wrote: I will attempt to give a "short" answer to your "why not?" question. There may be some slight technical inaccuracies in pursuit of brevity. Also I will need some "C lingo" to explain what a Perl hash table actually "is", "under the covers". Underlining added for emphasis.

        I am actually surprised that there weren't more: "hey, you got detail X wrong" posts!

        My goal was to explain the basic idea and then show things like the scalar value of %hash which the user can access (added to my post using actual Perl code). Perl is very much a "live language" and updates are on-going. It's been maybe 5+ years since I looked seriously "at the guts".

        Update:
        I see the posts with various links to "Perl Guts". Whether the OP needs that level of detail is up to him. I tried to answer this basic question: is there a perlVar or function that is the TRUE array of hash key and one of values that,when changed, changes the keys or values?

        Perl is like an onion. To use an onion, first you have to take the thin skin off. Then you start slicing the onion. There are many layers. That is fundamental to what an onion is! So, how come some onion slices have a green thing in the middle? That is a level of detail that I didn't address and probably the OP doesn't need to know about in order to use an onion effectively. That is the best analogy that I can come up with at the moment. I hope that my post can be understood in time measured in minutes. To understand exactly everything about how Perl makes hashes, requires time measured in days, not minutes.

      TYVM!

      I'm pretty familiar with hash internals from a user viewpoint, but your details were most instructive. I read it twice but may need to read it a few more times to let it sink in.

      Curiously, I recall my FIRST programming assignment, way back in 1977, in APL, was to write a hash table manager, which stored and retrieved hashed elements. Thinking back to how that worked, I get what you're saying. It's not stored as an array (even though keys %h returns one). I also recall writing a non-hashed storage manager and for reasonably-sized tables, the speeds were of course much better. I get that :)

      Much appreciated. You are the guru of Perl Hashes!

        I'm glad that you liked my post!

        I actually used the Perl hash algorithm in a C project once. So I got pretty far into how Perl did it.

        I wrote a .h file, memtracker.h, for a local college. Students just included this .h file in their C or C++ program and magic happened. Without any source mods or link dependencies, this thing tracked C or C++ memory allocations and deallocations, and when the user program exited, it showed whether there were any memory leaks and a table showing how all of this played out. It would say things like "on line X, that memory allocation has no corresponding deallocation", etc. Anyway, I used a hash as the main data structure to keep track of things. It turned out to be a more complicated project than I had first thought, mainly due to making it work with 2 compilers each of C and C++ using the same single .h file. The final file was somewhat north of 1,500 lines of C with a lot of C pre-processor voo-doo. Anyway, it turned out to be a fun project. One C++ prof had his own version of this, but it was so slow, it was adding 20 minutes execution time to a large C++ lab! My version was so fast that it wasn't even user perceptible, and it had more features. Using a very efficient data structure was a big part of the speed improvement.

Re: can I change hash keys or values directly
by glycine (Sexton) on Feb 04, 2021 at 06:42 UTC

    Your question sounds like the original hash table in C or C++: when you assign a value, you use a hash function to get a new index for that value, and finally put everything in a large enough array? I read this in my old data structures book.
