rmcgowan has asked for the wisdom of the Perl Monks concerning the following question:

I'm accessing a REST API interface to an Oracle database. The API is written in Java. The retrieved data is JSON encoded. I cannot make any changes on the server side, and have to live with whatever is sent.

There are a number of field and table names that have been changed on the server side for various reasons. For example 'abstract' is 'incidentAbstract', and 'cross_reference' is 'crossReference', in the JSON encoding.

I have a number of scripts that have been developed over a decade or more, that use DBI and DBD::Oracle to access data, using the database names. I now need to port these to using the REST API and I'm trying to do this with minimal changes to the scripts themselves. To do this, I need a way to restore the original field or table names for the modified ones in the JSON decoded data. For example:

Before:

    {
        incidentAbstract => 'string'
    }

After:

    {
        abstract => 'string'
    }
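Since the decoded data is nested, what's wanted is something that walks the whole structure and renames keys wherever they appear. A minimal sketch of that idea (the `%MAP` contents and the `rename_keys` name are just illustrations, not from any CPAN module):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Server-name => database-name translation table (illustrative).
my %MAP = (
    incidentAbstract => 'abstract',
    crossReference   => 'cross_reference',
);

# Recursively copy a decoded-JSON structure, renaming hash keys
# found in %MAP and leaving everything else untouched.
sub rename_keys {
    my ($data) = @_;
    if (ref $data eq 'HASH') {
        return { map { ($MAP{$_} // $_) => rename_keys($data->{$_}) }
                 keys %$data };
    }
    elsif (ref $data eq 'ARRAY') {
        return [ map { rename_keys($_) } @$data ];
    }
    return $data;    # plain scalars pass through unchanged
}

my $decoded = {
    incidentAbstract => 'string',
    rows             => [ { crossReference => 42 } ],
};
my $renamed = rename_keys($decoded);
```

This builds a fresh structure rather than modifying in place, which sidesteps the question of renaming keys inside a hash while iterating over it.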

I found a module on CPAN, Hash::KeyMorpher, that almost does the job, but not quite. It only changes the style of the name (as in KeyFile to key_file, or vice versa) but keeps the name itself.

I can write my own, but would like to be sure someone hasn't already done this, or that there isn't a "better way" to handle it.

Thanks.

update

Many thanks for the pointers to Hash::Map and Data::Visitor::Callback, etc. I had actually looked at Data::Dumper as a potential candidate but decided there might be a "simpler", or at least different, module I could use.

It also turns out the JSON modules have a method, filter_json_object, that could also work:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    # Testing JSON filter_json_object method, to convert 'field names'.
    use JSON::XS;

    my $sub = sub {
        my $jsRef = shift;
        my %map = (one => 'uno', two => 'dos', three => 'tres');
        my $hRef;
        foreach my $k (keys %$jsRef) {
            $$hRef{$map{$k}} = $$jsRef{$k};
        }
        return $hRef;
    };

    my $hRef = {one => 1, two => 2, three => 3};
    print Dumper ($hRef), "\n\n";

    my $json = JSON::XS->new->allow_nonref->filter_json_object ($sub);
    my $jsonStr = $json->encode ($hRef);
    #print "$jsonStr\n";
    my $newHRef = $json->decode ($jsonStr);
    print Dumper ($newHRef);

    ---
    $VAR1 = {
              'three' => 3,
              'one' => 1,
              'two' => 2
            };

    $VAR1 = {
              'uno' => 1,
              'dos' => 2,
              'tres' => 3
            };

I'll be able to do what's needed with one or another of the suggestions, I've no doubt ;)

Replies are listed 'Best First'.
Re: Translate or morph hash keys to different names
by LanX (Saint) on Nov 06, 2013 at 20:32 UTC
    You didn't tell us whether your data is nested or just flat.

    Perl doesn't allow you to change hash keys in place; you either need to copy to a new hash, or delete and recreate every single key to be transformed. If the hashes are nested, you need to walk through the tree.
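    The delete-and-recreate variant for a flat hash is a one-liner per key, since delete returns the removed value (key names taken from the question):

```perl
use strict;
use warnings;

# Rename keys "in place" on a flat hash: delete removes the old
# entry and returns its value, which we store under the new key.
my %h = ( incidentAbstract => 'string', crossReference => 42 );

$h{abstract}        = delete $h{incidentAbstract} if exists $h{incidentAbstract};
$h{cross_reference} = delete $h{crossReference}   if exists $h{crossReference};
```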

    If I were you, I would just check whether the JSON string comes in a pretty-printed format that allows a simple regex to translate only the keys.

    update

    proof of concept:

    DB<100> use JSON

    DB<101> $hoH = { nested => { crossReference => 42, incidentAbstract => "string" }, }

    DB<102> $str = (new JSON)->pretty(1)->encode($hoH)
    {
       "nested" : {
          "incidentAbstract" : "string",
          "crossReference" : "42"
       }
    }

    DB<103> %translate = ( 'crossReference' => 'cross_reference', 'incidentAbstract' => 'abstract' )

    DB<104> $or_keys = join "|", keys %translate
    => "incidentAbstract|crossReference"

    DB<105> $str =~ s/^(\s+")($or_keys)("\s+:)/$1$translate{$2}$3/gm
    => 2

    DB<106> $str
    {
       "nested" : {
          "abstract" : "string",
          "cross_reference" : "42"
       }
    }

    of course this can also be done with Data::Dumper ...

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      The data is nested. And I understand that keys can't be changed directly.

      The suggestion of changing things in the JSON string, before decoding, is good and makes a lot of sense.

      My primary concerns are that I had missed something, perhaps in the JSON modules, that could do this for me, or an existing other module to do the morphing, so I didn't need to write my own.

      Thanks for the suggestion.

Re: Translate or morph hash keys to different names
by kschwab (Vicar) on Nov 06, 2013 at 20:56 UTC

    Hash::Map has several options to do what you are asking for.

    If this is a one-time problem, though, I would just use Data::Dumper to dump the hash to a text file, then make the changes with a text editor.

Re: Translate or morph hash keys to different names (Data::Walk, Data::Rmap, Data::Visitor::Callback)
by Anonymous Monk on Nov 06, 2013 at 21:03 UTC

    For arbitrary depth structure you could use (untested)

    use Data::Visitor::Callback;

    my $v = Data::Visitor::Callback->new(
        ignore_return_values => 1,
        hash => sub {
            my %map = ( qw/ abeLincoln lincoln / );
            my $ref = $_;   ##!!!
            my %new;
            for my $key ( keys %{$ref} ) {
                my $fkey = $map{$key};
                $new{ defined $fkey ? $fkey : $key } = $ref->{$key};
            }
            %{$ref} = %new;
        },
    );
    $v->visit($data);
    Data::Walk, Data::Rmap, Data::Visitor::Callback
Re: Translate or morph hash keys to different names
by sundialsvc4 (Abbot) on Nov 07, 2013 at 14:43 UTC

    I would simply define a package that “represents” the RESTful interface to the rest of the system(s), and, within that package, define a hash that maps old-names to new ones:

    my $FIELD_MAP = {
        # LIST ALL SERVER-SUPPLIED FIELDS EVEN IF NAME DOESN'T CHANGE
        # SERVER'S NAME ...  => OUR NAME
        'incidentAbstract'   => 'abstract',
        'crossReference'     => 'cross_reference',
        ...
    };

    Then, simply iterate through the keys in this map to populate a new hash that has the preferred field names in it. (Since both hashes contain references to the same data in most cases, there probably won't be too much data actually moving around in memory as you do this.)

    (Notice the comment that I made in the code block. $FIELD_MAP is supposed to define every field name that this program expects to deal with in the data, and I think that your code should die if it ever encounters a field name that is not listed. Code should never be "trustful," especially of remote systems. Data providers and data consumers have bugs in their code all the time, just like we all do, so don't let 'em grind you down...)
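    A minimal sketch of that mapping loop, including the die-on-unknown-field behaviour just described ($FIELD_MAP's contents and the sub name are illustrative, not from the original post):

```perl
use strict;
use warnings;

# Server's name => our name; every expected field must be listed.
my $FIELD_MAP = {
    incidentAbstract => 'abstract',
    crossReference   => 'cross_reference',
};

# Copy a server-supplied hashref into a new hashref keyed by our
# names, dying on any field not declared in $FIELD_MAP.
sub remap_fields {
    my ($in) = @_;
    my %out;
    for my $server_name (keys %$in) {
        my $our_name = $FIELD_MAP->{$server_name}
            // die "Unexpected field '$server_name' from server\n";
        $out{$our_name} = $in->{$server_name};
    }
    return \%out;
}

my $rec = remap_fields({ incidentAbstract => 'x', crossReference => 7 });
```

    An unrecognized key such as `bogusField` would make the program die immediately, which is the "never trustful" behaviour argued for above.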

    This approach will have a number of advantages: it will encapsulate the interface on behalf of the rest of the system or systems, and it will provide a consistent (and well-documented in the code...) naming convention that remains the same as before. (If you have a number of inconsistent systems on your side, subclassing could be used to good effect.) The package is also suspicious of its host, and therefore trustworthy: if it runs to completion, it must be doing the right thing.