...I need to remove duplicate product entries...
Perhaps your mentioning the address comparison was only a solution proposal. If I'm understanding you correctly--that you only want to "remove duplicate product entries"--then consider the following:
use strict; use warnings; local $/ = ''; my ( %products, %records ); while (<>) { if (/(Product.+)/) { $products{$1}++; $records{$1} = $_; } } print $records{$_} for grep $products{$_} == 1, keys %records;
Usage: perl script.pl dataFile [>outFile]
Output on your data set:
Product 3 ------------------------------------------------------------------ storeId = 2123 phoneNumber = (111) 111-1111 availbilityCode = 1 stockStatus = Limited stock distance = 8.83 city = some city fullStreet = some address Product 2 ------------------------------------------------------------------ storeId = 2117 phoneNumber = (111) 111-1111 availbilityCode = 2 stockStatus = In stock distance = 7.49 city = some city fullStreet = some address
The script builds two hashes: one to track the number of times a product number occurs (%products) and one for the records (%records) keyed on the product number. A record is printed only if the product number was seen only once.
Hope this helps!
In reply to Re: Removing Duplicates from a multiline entry
by Kenosis
in thread Removing Duplicates from a multiline entry
by diamondsandperls
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |