
I would normally store the information by part number; however, the file that produces this information is extremely large and contains duplicate part numbers. This script is actually trying to combine and reduce the data to single part numbers, as well as group the tags and add up the quantities.

Re^3: Removing Duplicates in a HoH
by state-o-dis-array (Hermit) on Dec 20, 2010 at 15:19 UTC
    Storing the information by part number does what you are looking to accomplish. See the response from scorpio17, which provides an example of what I'm talking about.

      I really like this reduction. I previously used a hash to remove duplicate lines like the ones below:

      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100
      559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200
      559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200
      559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200
      559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200

      Once I loaded this into a hash it reduced to the following:

      1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200
      559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200
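
      The original code is not shown, so the following is only a minimal sketch of one way to get that reduction. It assumes the hash is keyed on the leading number of each line, so later lines overwrite earlier ones; the @lines sample and the %record name are illustrative.

```perl
use strict;
use warnings;

# Illustrative sample: a shortened version of the listing above.
my @lines = (
    '1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200',
    '1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200',
    '559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100',
    '559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200',
);

# %record maps the leading number to the most recent record text.
my %record;
for my $line (@lines) {
    # Split "N: rest" into its number and the remaining text; skip other lines.
    my ($id, $rest) = $line =~ /^(\d+):\s*(.*)$/ or next;
    $record{$id} = $rest;    # a later line with the same number replaces the earlier one
}

# Print one line per leading number, in numeric order.
for my $id (sort { $a <=> $b } keys %record) {
    print "$id: $record{$id}\n";
}
```

      Because the last line for each number wins, 559 ends up with the "AE-100 AE-200" tags rather than the shorter "AE-100" from its first occurrence.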

      Unfortunately, the current modification by scorpio17 adds the duplicates together instead of discarding them.
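
      One possible way around that, sketched under the assumption that repeats of the same leading number should be collapsed first and only the surviving records merged by part number. This is not scorpio17's code; the sample data (including the hypothetical record 2 and its PU-300 tag), %by_id and %parts are made up for illustration, and the field parsing assumes no commas inside a description.

```perl
use strict;
use warnings;

# Hypothetical sample: number 1 is repeated verbatim, and numbers 1 and 2
# are distinct records that share a part number.
my @lines = (
    '1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200',
    '1: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-200',
    '2: partnum=1003382553-M25,description=CNTFGL PUMP,quantity=1.0000,tags=PU-300',
    '559: partnum=2203505000,description=CONDUCTIVITY CELL,quantity=2.0000,tags=AE-100 AE-200',
);

# Step 1: collapse repeats of the same leading number (last one wins).
my %by_id;
for my $line (@lines) {
    my ($id, $rest) = $line =~ /^(\d+):\s*(.*)$/ or next;
    $by_id{$id} = $rest;
}

# Step 2: merge the surviving records by part number into a hash of hashes.
my %parts;
for my $id (sort { $a <=> $b } keys %by_id) {
    # Parse "key=value,key=value,..." into a flat hash of fields.
    my %f;
    for my $pair (split /,/, $by_id{$id}) {
        my ($k, $v) = split /=/, $pair, 2;
        $f{$k} = $v;
    }
    my $p = $parts{ $f{partnum} } ||= {
        description => $f{description},
        quantity    => 0,
        tags        => {},
    };
    $p->{quantity} += $f{quantity};                # add quantities of distinct records
    $p->{tags}{$_} = 1 for split ' ', $f{tags};    # group tags, no repeats
}

# Print one combined line per part number.
for my $pn (sort keys %parts) {
    printf "partnum=%s,description=%s,quantity=%.4f,tags=%s\n",
        $pn, $parts{$pn}{description}, $parts{$pn}{quantity},
        join ' ', sort keys %{ $parts{$pn}{tags} };
}
```

      With this sample, the repeated record 1 is counted once, so part 1003382553-M25 ends up with a quantity of 2 (records 1 and 2) rather than 3.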

      Thanks for the help and the insight. I really appreciate it.

      Shawn Way