in reply to Re: Merging multiple variations of a serial number
in thread Merging multiple variations of a serial number

> The serial numbers are used on a piece of equipment that we source from an external company and deploy to engineers. We don't have any control over serial number generation

It really is important to know whether the serial numbers stay stable, like IDs, and how many there are.

You can't seriously have 1e10 pieces of equipment°, so a lookup hash of the known-correct numbers will help you filter out impossible matches.
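A minimal sketch of what I mean by a lookup hash. The serial numbers in `@known` and the candidate readings are invented placeholders, and `plausible` is just a hypothetical helper name; where your authoritative list comes from is something only you know.

```perl
use strict;
use warnings;

# Hypothetical canonical serial numbers (placeholders, not real data).
my @known = qw(0012345678 0099887766 1234567890);

# Lookup hash: checking a candidate is a single hash probe.
my %is_known = map { $_ => 1 } @known;

# Keep only those candidate readings that exist in the canonical list.
sub plausible {
    my @candidates = @_;
    return grep { exists $is_known{$_} } @candidates;
}

# One messy value may be readable in several ways; only readings that
# match a real serial number survive the filter.
my @hits = plausible(qw(0012345678 0012345679 12345678));
print "@hits\n";    # prints: 0012345678
```

With only a few thousand real numbers in a space of 1e10, a wrong reading is very unlikely to hit an existing key by accident.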

> different people/departments have received, built, and added to, the various files using different formats of the serial numbers.

As I already said, it is very probable that the effects of those people can be localized to certain files and time periods.

Creating a histogram for each file will help you determine which 13number format was used and for which timestamps.
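A rough sketch of such a histogram, under stated assumptions: the records (file name, date, serial number) are invented sample rows, the dates are assumed to be in YYYY-MM-DD form, and `shape` and `histogram` are hypothetical helper names. Each serial is reduced to a "shape" (digits become 9, letters become A) and counted per file and month.

```perl
use strict;
use warnings;

# Reduce a serial number to its shape: every digit becomes 9, every
# letter becomes A, anything else (dashes, spaces) is kept unchanged.
sub shape {
    my ($serial) = @_;
    (my $s = $serial) =~ tr/0-9/9/;
    $s =~ tr/A-Za-z/A/;
    return $s;
}

# Count shapes per file and per month.
# Returns a hash ref: $hist->{file}{'YYYY-MM'}{shape} = count
sub histogram {
    my @records = @_;
    my %hist;
    for my $rec (@records) {
        my ($file, $date, $serial) = @$rec;
        my $month = substr $date, 0, 7;    # assumes YYYY-MM-DD dates
        $hist{$file}{$month}{ shape($serial) }++;
    }
    return \%hist;
}

# Invented sample rows: [ file, date, serial ]
my @records = (
    [ 'a.csv', '2021-03-02', '0012345678'  ],
    [ 'a.csv', '2021-03-09', '0099887766'  ],
    [ 'a.csv', '2022-01-15', '00123456785' ],
    [ 'b.csv', '2021-06-01', 'SN-12345678' ],
);

my $hist = histogram(@records);
for my $file (sort keys %$hist) {
    for my $month (sort keys %{ $hist->{$file} }) {
        for my $shape (sort keys %{ $hist->{$file}{$month} }) {
            printf "%-6s %s %-12s %d\n",
                $file, $month, $shape, $hist->{$file}{$month}{$shape};
        }
    }
}
```

If one shape dominates a file or a time window, that tells you which normalization to apply to that slice before merging.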

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

°) 1e11 even with the check digit.

Re^3: Merging multiple variations of a serial number
by hippo (Archbishop) on Jul 29, 2022 at 12:42 UTC
    > Creating a histogram for each file will help you determine which 13number format was used and for which timestamps.

    Frequency analysis is definitely a good plan in this scenario. (++)


    🦛

Re^3: Merging multiple variations of a serial number
by Doozer (Scribe) on Jul 29, 2022 at 12:29 UTC

    > It really is important to know whether the serial numbers stay stable, like IDs, and how many there are.

    They should now stay stable in the current format. We currently have around 3,500 units/serial numbers.