"I admit I am quite bad at parsing data."

Probably the first thing to do would be to understand complex data structures. Take a look at "perldsc - Perl Data Structures Cookbook". This explains arrays of arrays and hashes; hashes of arrays and hashes; and finally builds up to complex data structures such as you're dealing with here.

"Unfortunately it fails, as I am evidently interpreting badly the data structure (and a dumper did not help me a lot). Any suggestions?"

Telling us "it fails", without any additional information, is pretty much useless. Did you use the strict and warnings pragmata? If not, you should do so: let Perl tell you about possible problems — it's a lot quicker than posting a question. Did you test "$result"? If so, you probably would have got a message about an uninitialised value (assuming you'd used warnings). However you tested "$result", and whatever messages you received, should be reported.
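As a minimal sketch of what "let Perl tell you" looks like: the script below uses a cut-down copy of the posted data (reconstructed from the dumps in this reply) and the core module JSON::PP as an assumed stand-in for JSON. The exact message you see depends on the code you actually ran, which isn't shown; here the faulty path is wrapped in eval so the error can be captured and reported.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; assumed stand-in for JSON (same decode_json call)

# Cut-down copy of the posted data, reconstructed from the dumps in this reply.
my $json_text    = '{"def":[{"text":"time","tr":[{"text":"tempo"}]}],"head":{}}';
my $decoded_json = decode_json($json_text);

# The path from the question: 'def' holds an arrayref, so ->{'tr'} cannot work.
my $result = eval { $decoded_json->{'def'}{'tr'} };
my $error  = $@;

print "result: ", (defined $result ? $result : '(undef)'), "\n";
print "error:  $error";
```

The text held in `$error` (the message plus file and line number) is the sort of detail worth including in a question.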

Telling us "a dumper did not help me a lot" is, again, not much use if we are to help you. Which "dumper" did you use? In what way was its output unhelpful?

I copied the JSON code you posted to 'pm_1201279_input.json'. My usual preference is the CPAN module Data::Dump, mainly for the compact output.

$ perl -e 'use JSON; use Data::Dump; my $x = do { local $/; <> }; dd(decode_json $x)' pm_1201279_input.json
{
  def  => [
            {
              pos  => "noun",
              text => "time",
              tr   => [
                        {
                          gen  => "m",
                          mean => [{ text => "day" }, { text => "moment" }],
                          pos  => "noun",
                          syn  => [
                                    { gen => "f", pos => "noun", text => "volta" },
                                    { gen => "m", pos => "noun", text => "momento" },
                                    { gen => "m", pos => "noun", text => "Time" },
                                  ],
                          text => "tempo",
                        },
                      ],
              ts   => "ta&#618;m",
            },
          ],
  head => {},
}

The "def => [" makes it pretty clear that the value of the key "def" is an arrayref. This should tell you that "$decoded_json->{'def'}{'tr'}" is clearly a problem: you need an array index where you have the hash key "{'tr'}".

Deeper into the structure, the Data::Dump output may be a bit too compact for you (at least until you're somewhat more comfortable with this degree of complexity). The core module Data::Dumper might be a better choice here.

$ perl -e 'use JSON; use Data::Dumper; my $x = do { local $/; <> }; print Dumper decode_json($x)' pm_1201279_input.json
$VAR1 = {
          'def' => [
                     {
                       'text' => 'time',
                       'tr' => [
                                 {
                                   'syn' => [
                                              {
                                                'pos' => 'noun',
                                                'text' => 'volta',
                                                'gen' => 'f'
                                              },
                                              {
                                                'pos' => 'noun',
                                                'text' => 'momento',
                                                'gen' => 'm'
                                              },
                                              {
                                                'pos' => 'noun',
                                                'text' => 'Time',
                                                'gen' => 'm'
                                              }
                                            ],
                                   'text' => 'tempo',
                                   'gen' => 'm',
                                   'pos' => 'noun',
                                   'mean' => [
                                               {
                                                 'text' => 'day'
                                               },
                                               {
                                                 'text' => 'moment'
                                               }
                                             ]
                                 }
                               ],
                       'pos' => 'noun',
                       'ts' => 'ta&#618;m'
                     }
                   ],
          'head' => {}
        };

Take a look at the documentation for both of those modules. There are various ways to use them such that the output is more to your liking.
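As one illustration of adjusting the output, Data::Dumper has package variables that control its layout. The sketch below sets three of them; the JSON string is a cut-down copy of the posted data, and JSON::PP (core) is used as an assumed stand-in for JSON. Which settings suit you best is a matter of taste.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; assumed stand-in for JSON
use Data::Dumper;

local $Data::Dumper::Indent   = 1;    # fixed, shallow indentation per level
local $Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order
local $Data::Dumper::Terse    = 1;    # leave out the leading '$VAR1 = '

# Cut-down copy of the posted data, reconstructed from the dumps in this reply.
my $json_text = '{"head":{},"def":[{"text":"time","pos":"noun",'
              . '"tr":[{"text":"tempo","gen":"m"}]}]}';

my $dump = Dumper(decode_json($json_text));
print $dump;
```

Sorted keys make two dumps of similar data much easier to compare by eye.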

Even without using a dumper, you can troubleshoot problems of this nature by walking the list of keys and indices. Start at the beginning and see what each refers to. In this case, look at the "def" key first:

$ perl -E 'use JSON; my $j = do { local $/; <> }; my $p = decode_json $j; say $p->{def}' pm_1201279_input.json
ARRAY(0x7f8e1c881968)

As before, that's indicating that you need an array index where you have the hash key "{'tr'}". So, look at the first index:

$ perl -E 'use JSON; my $j = do { local $/; <> }; my $p = decode_json $j; say $p->{def}[0]' pm_1201279_input.json
HASH(0x7f89ee002e30)

Now you know you'll need a hash key, then an array index, then another hash key. What keys are there?

$ perl -E 'use JSON; my $j = do { local $/; <> }; my $p = decode_json $j; say for keys $p->{def}[0]->%*' pm_1201279_input.json
ts
pos
tr
text

[See "perlref: Postfix Dereference Syntax" if you're unfamiliar with the '$p->{def}[0]->%*' syntax. If you're writing for older versions of Perl, you can use '%{ $p->{def}[0] }' instead.]
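To show that the two spellings are interchangeable, here is a small sketch listing the keys both ways. It assumes Perl 5.24 or later, where postfix dereference is no longer experimental, and uses core JSON::PP as a stand-in for JSON; the JSON string is a cut-down placeholder with the same four keys.

```perl
use v5.24;       # postfix dereference is a stable feature from 5.24
use warnings;
use JSON::PP;    # core module; assumed stand-in for JSON

# Placeholder data with the same keys as the first element of 'def'.
my $json_text = '{"def":[{"text":"time","pos":"noun","ts":"x","tr":[]}],"head":{}}';
my $p = decode_json($json_text);

my @postfix   = sort keys $p->{def}[0]->%*;      # postfix dereference
my @circumfix = sort keys %{ $p->{def}[0] };     # works on older Perls too

say "postfix:   @postfix";
say "circumfix: @circumfix";
```

The keys are sorted here only so that both lists print in the same, predictable order.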

Now you've found your "tr" key. So "$decoded_json->{'def'}{'tr'}" needs to be "$decoded_json->{def}[0]{tr}" (a hash key that is a simple identifier, made up of letters, digits and underscores and not starting with a digit, is quoted automatically, which saves typing and reduces code clutter).
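The manual walk above can also be scripted. The following is only a sketch: describe_path() is a hypothetical helper written for this illustration (it is not part of any module), the data is a cut-down copy of the posted JSON, and core JSON::PP stands in for JSON. It uses ref() to report what each step of a path refers to, and stops when a hash key is used where an array index is needed.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; assumed stand-in for JSON

# Hypothetical helper: follow @steps from $data, reporting the type at each step.
sub describe_path {
    my ($data, @steps) = @_;
    my ($node, $path, @report) = ($data, '$p->');
    for my $step (@steps) {
        my $type = ref $node;
        if ($type eq 'HASH') {
            $node = $node->{$step};
            $path .= "{$step}";
        }
        elsif ($type eq 'ARRAY') {
            if ($step !~ /^\d+$/) {
                push @report, "$path is an ARRAY: needs an array index, not the key '$step'";
                last;
            }
            $node = $node->[$step];
            $path .= "[$step]";
        }
        else {
            push @report, "$path is not a reference: cannot descend into '$step'";
            last;
        }
        push @report, sprintf '%-28s %s', $path, ref($node) || 'plain value';
    }
    return @report;
}

# Cut-down copy of the posted data, reconstructed from the dumps in this reply.
my $p = decode_json('{"def":[{"text":"time","tr":[{"text":"tempo"}]}],"head":{}}');

print "$_\n" for describe_path($p, qw(def tr));             # the path from the question
print "\n";
print "$_\n" for describe_path($p, qw(def 0 tr 0 text));    # the corrected path
```

The first call stops at "def" and reports the missing index; the second reaches a plain value at the end of the path.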

Keep working through the structure: eventually you'll get to the "$decoded_json->{def}[0]{tr}[0]{text}" that ++roboticus showed; and subsequently to the remainder of your requirements.
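Your full requirements aren't restated here, so purely as an illustration, this sketch loops over every "def" and "tr" entry and collects the "text" values from the "syn" and "mean" arrays. The JSON is reconstructed from the dumps in this reply, and core JSON::PP is an assumed stand-in for JSON.

```perl
use strict;
use warnings;
use JSON::PP;    # core module; assumed stand-in for JSON

# Data reconstructed from the dumps shown in this reply.
my $json_text = <<'JSON';
{
  "head": {},
  "def": [
    {
      "text": "time", "pos": "noun", "ts": "ta&#618;m",
      "tr": [
        {
          "text": "tempo", "pos": "noun", "gen": "m",
          "syn": [
            { "text": "volta",   "pos": "noun", "gen": "f" },
            { "text": "momento", "pos": "noun", "gen": "m" },
            { "text": "Time",    "pos": "noun", "gen": "m" }
          ],
          "mean": [ { "text": "day" }, { "text": "moment" } ]
        }
      ]
    }
  ]
}
JSON

my $decoded_json = decode_json($json_text);

my @lines;
for my $def (@{ $decoded_json->{def} }) {
    for my $tr (@{ $def->{tr} }) {
        # '// []' guards against entries that lack a 'syn' or 'mean' key.
        my @synonyms = map { $_->{text} } @{ $tr->{syn}  // [] };
        my @meanings = map { $_->{text} } @{ $tr->{mean} // [] };
        push @lines, "$def->{text} -> $tr->{text}",
                     "  synonyms: @synonyms",
                     "  meanings: @meanings";
    }
}
print "$_\n" for @lines;
```

Looping rather than hard-coding "[0]" means the same code still works when a word has several definitions or translations.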

Complex data structures can appear daunting when first encountered; however, they're fairly easy to master. Work through other problems using the techniques I've shown and you'll soon get the hang of it. Ultimately, you'll be able to just look at JSON data, such as you've shown, and know intuitively what Perl code you'll need to access whatever parts you're interested in.

— Ken


In reply to Re: json decoding by kcott
in thread json decoding by Anonymous Monk
