While I was sort of rolling my eyes at the request to write a new JSON module just to avoid 'croak', you got me thinking: it actually would be rather useful to have a parser function that starts from pos($scalar) and converts valid JSON into Perl SV* until the first parse error, then returns what it built so far along with flags describing how it ended. It would be especially useful if those partial results could be resumed on additional input, so you could feed the parser buffer segments as they arrive. Or, like you suggested, it could ignore certain types of decoder errors.

The C function might look like

    bool json_parse_more(pTHX_
        struct json_parse_state *state, // configuration and error messages
        SV *input,                      // any scalar
        int input_pos,                  // byte offset within the scalar
        SV *output                      // empty SV, destination for data
    );
and you could call that recursively, so that each call fills its output SV with the progress-so-far of whatever it found on input. As long as each thread had its own state struct, it would be thread-safe. It's probably easiest to store all the error info in the struct.
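To make the "stop at the first error, report how it ended, resume later" idea concrete, here is a minimal sketch in plain C. It is not XS: a `const char *` buffer stands in for the input SV, a hypothetical `struct int_list` stands in for the output SV, and the grammar is cut down to a flat array of non-negative integers. The enum names and struct fields are my own inventions for illustration, not taken from any existing module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How the last call ended (hypothetical flag names). */
enum json_end { JSON_END_OK, JSON_END_NEED_MORE, JSON_END_ERROR };
/* Where we are inside "[n, n, ...]" so a later call can resume. */
enum json_phase { P_START, P_FIRST, P_VALUE, P_AFTER, P_DONE };

struct json_parse_state {
    enum json_end end;      /* why parsing stopped */
    enum json_phase phase;  /* resumable progress */
    size_t pos;             /* byte offset to resume from */
    const char *error;      /* static message, or NULL */
};

/* Stand-in for the output SV: a caller-owned list of integers. */
struct int_list { long *items; size_t count, cap; };

static bool stop(struct json_parse_state *st, enum json_end end,
                 size_t pos, const char *error)
{
    st->end = end; st->pos = pos; st->error = error;
    return end == JSON_END_OK;
}

/* Returns true once the array is complete. Otherwise st->end says whether
 * more input is needed or an error was hit; out keeps everything parsed. */
bool json_parse_more(struct json_parse_state *st, const char *input,
                     size_t input_len, size_t pos, struct int_list *out)
{
    assert(st != NULL && input != NULL && out != NULL);
    if (st->phase == P_DONE) return stop(st, JSON_END_OK, pos, NULL);
    for (;;) {
        while (pos < input_len && (input[pos] == ' ' || input[pos] == '\t' ||
                                   input[pos] == '\r' || input[pos] == '\n'))
            pos++;
        if (pos == input_len) return stop(st, JSON_END_NEED_MORE, pos, NULL);
        char c = input[pos];
        if (st->phase == P_START) {
            if (c != '[') return stop(st, JSON_END_ERROR, pos, "expected '['");
            st->phase = P_FIRST; pos++;
        } else if (c == ']' && st->phase != P_VALUE) {
            st->phase = P_DONE;
            return stop(st, JSON_END_OK, pos + 1, NULL);
        } else if (st->phase == P_AFTER) {
            if (c != ',')
                return stop(st, JSON_END_ERROR, pos, "expected ',' or ']'");
            st->phase = P_VALUE; pos++;
        } else {                        /* P_FIRST or P_VALUE: need a number */
            size_t stop_at = pos; long v = 0;   /* no overflow check here */
            while (stop_at < input_len &&
                   input[stop_at] >= '0' && input[stop_at] <= '9')
                v = v * 10 + (input[stop_at++] - '0');
            if (stop_at == pos)
                return stop(st, JSON_END_ERROR, pos, "expected a number");
            if (stop_at == input_len)   /* digits may continue in next chunk */
                return stop(st, JSON_END_NEED_MORE, pos, NULL);
            if (out->count == out->cap)
                return stop(st, JSON_END_ERROR, pos, "output list is full");
            out->items[out->count++] = v;
            pos = stop_at; st->phase = P_AFTER;
        }
    }
}
```

The caller appends the next chunk to the same buffer and calls again with `state->pos` as the starting offset. A number that runs up to the end of the buffer is deliberately left unconsumed, since the sketch can't tell whether more digits are coming. A real implementation would also need to handle nesting, strings, and the rest of the JSON grammar.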

You could probably read the implementations of the other JSON modules to flesh out that one function. Then you could wrap it in XS, along with some XS methods to construct/read/write the state struct, and you'd be on your way.

When you get to the part about decoding Unicode, you'll see the solutions in all the other JSON modules, but you need to fully understand what they're solving. A Perl SV can hold either raw bytes or characters, and Perl's is_utf8 flag is *not* a proper indication of which. The flag only tells the back-end whether it needs to use the utf8 functions to read the characters or whether there is one character per byte. There can be cases where a byte > 127 is stored as a utf8 sequence even though the application didn't intend it to be a character yet. So, you need to let the user specify whether they think their string contains bytes or characters when calling your API, then do the decoding in your module if they say the input needs to be decoded. Again, the solutions for these problems can all be found in the existing JSON modules. As it happens, the UTF8 rant by MLEHMANN in the JSON::XS manual is the explanation that finally showed me the right way to think about Perl's utf8 flag.
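Here is a toy model of that distinction in plain C. The `struct toy_str` and its `utf8_flag` field are hypothetical stand-ins, not Perl's real SV layout or API. The sketch only shows that the flag describes how the buffer is stored, so two different buffers can hold the same logical string, while a flag-off buffer of UTF-8 bytes is a different, longer string that still needs decoding. It assumes well-formed UTF-8 whenever the flag is on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a scalar's string buffer (hypothetical, not a real SV). */
struct toy_str {
    const unsigned char *buf;
    size_t len;      /* length in bytes */
    int utf8_flag;   /* storage detail: buffer holds UTF-8 sequences */
};

/* Read the logical element at *i and advance *i past it. */
static unsigned long toy_next(const struct toy_str *s, size_t *i)
{
    unsigned char b = s->buf[(*i)++];
    if (!s->utf8_flag || b < 0x80) return b;     /* one element per byte */
    int extra = b >= 0xF0 ? 3 : b >= 0xE0 ? 2 : 1;
    unsigned long cp = b & (0x3F >> extra);      /* payload bits of lead byte */
    while (extra-- > 0 && *i < s->len)
        cp = (cp << 6) | (s->buf[(*i)++] & 0x3F);
    return cp;
}

/* Number of logical elements, whatever the storage looks like. */
size_t toy_length(const struct toy_str *s)
{
    assert(s != NULL);
    size_t i = 0, n = 0;
    while (i < s->len) { toy_next(s, &i); n++; }
    return n;
}

/* Two strings are equal when their logical elements match; the flag
 * itself is never compared. */
bool toy_equal(const struct toy_str *a, const struct toy_str *b)
{
    assert(a != NULL && b != NULL);
    size_t i = 0, j = 0;
    while (i < a->len && j < b->len)
        if (toy_next(a, &i) != toy_next(b, &j)) return false;
    return i == a->len && j == b->len;
}
```

In this model, the single byte 0xE9 with the flag off and the two bytes 0xC3 0xA9 with the flag on are the same one-element string. The bytes 0xC3 0xA9 with the flag off are a two-element string of undecoded data. Nothing in the struct says whether 0xE9 was meant as a character or as a byte, which is why the caller has to tell the API.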


In reply to Re^3: Can someone please write a *working* JSON module (Send money) by NERDVANA
in thread Can someone please write a *working* JSON module by cnd
