I do not believe the use of JSON as a serialization format should influence your choice of internal data structures. Nor should you assume that the data structures you use to sort this mess out are the ones you want to serialize.

If you could fully parse all the forms this would be easy. The problem is the 13-digit one. It can probably be disambiguated if you can compute the check digit from the core serial. The catch is that a single decimal check digit will match by accident about one time in ten, so there is roughly a 1/10 chance that a serial can be read both ways.
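To make the ambiguity concrete, here is a minimal sketch in Python. Everything about the layout is an assumption on my part, since the actual formats are not restated in this post: I assume a 12-digit core, a 13-digit form that carries the check digit either at the end or at the front, and a toy check digit (digit sum modulo 10). The names `check_digit` and `candidate_cores` are hypothetical; swap in the real rule and layouts.

```python
def check_digit(core: str) -> str:
    """Hypothetical check digit: digit sum modulo 10. Replace with the real rule."""
    return str(sum(int(ch) for ch in core) % 10)


def candidate_cores(serial13: str) -> list[str]:
    """Return every core that a 13-digit serial is consistent with.

    Assumed layouts (not taken from the original thread):
      trailing: 12-digit core followed by its check digit
      leading:  check digit followed by the 12-digit core
    One result means the serial is unambiguous, two means it reads both
    ways, and none means it is not valid under either layout.
    """
    if len(serial13) != 13 or not serial13.isdigit():
        raise ValueError(f"not a 13-digit serial: {serial13!r}")

    candidates = []
    head, last = serial13[:12], serial13[12]
    if check_digit(head) == last:
        candidates.append(head)
    first, tail = serial13[0], serial13[1:]
    if check_digit(tail) == first and tail not in candidates:
        candidates.append(tail)
    return candidates
```

With the toy rule, `candidate_cores("1234567890128")` yields only `"123456789012"`, while `candidate_cores("5000000000005")` yields both `"500000000000"` and `"000000000005"`. The second kind is the one that needs outside help.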

What I think I would try first is building two hashes. One would be keyed by core serial and contain all the variants of it that were found (including the fact that the core variant itself was found). The other would simply record 13-digit serials that cannot be disambiguated on their own. Once you have all the core serials, you can make a pass through the 13-digit serials and try to match them up. Important: your code should check for the case where one of these 13-digit serials still cannot be disambiguated even after all core serials are known, and complain mightily about every one it finds.
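The two-hash idea could be sketched roughly as follows, in Python, where dicts and sets stand in for the hashes. This is only an outline under stated assumptions: `merge_serials`, the variant labels, and the `candidate_cores` callback (which returns the possible cores for a 13-digit string) are all hypothetical names, and the demo callback simply tries "first 12 digits" and "last 12 digits".

```python
import sys
from collections import defaultdict


def merge_serials(parsed, ambiguous13, candidate_cores):
    """Merge serial variants by core serial.

    parsed:          iterable of (core, variant_label) pairs that parsed cleanly
    ambiguous13:     iterable of 13-digit strings that could not be parsed alone
    candidate_cores: callable mapping a 13-digit string to its possible cores
    Returns (by_core, unresolved).
    """
    # Hash 1: core serial -> set of variants seen for it.
    by_core = defaultdict(set)
    for core, variant in parsed:
        by_core[core].add(variant)

    # Hash 2: 13-digit serials still waiting to be matched (a set drops repeats).
    pending = set(ambiguous13)

    unresolved = []
    for serial in sorted(pending):
        # Keep only the readings whose core was actually seen elsewhere.
        matches = {c for c in candidate_cores(serial) if c in by_core}
        if len(matches) == 1:
            by_core[matches.pop()].add("13-digit")
        else:
            # Zero known cores, or more than one: still ambiguous.
            unresolved.append(serial)
    return dict(by_core), unresolved


if __name__ == "__main__":
    # Hypothetical data; the callback is an assumed layout, not the real one.
    merged, unresolved = merge_serials(
        parsed=[("500000000000", "core"), ("123456789012", "dashed")],
        ambiguous13=["5000000000005", "9999999999999"],
        candidate_cores=lambda s: [s[:12], s[1:]],
    )
    for serial in unresolved:
        print(f"CANNOT DISAMBIGUATE: {serial}", file=sys.stderr)
```

Returning the unresolved list rather than silently dropping it keeps the loud complaint in the caller's hands. Whatever you later serialize to JSON can be derived from `by_core` afterwards; it does not have to be this structure.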


In reply to Re: Merging multiple variations of a serial number by Anonymous Monk
in thread Merging multiple variations of a serial number by Doozer
