Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am new to Perl and need some help and some references. Can anyone help me with a sample or script for deduplicating a database? Thanks in advance.

Replies are listed 'Best First'.
Re: DeDuplication
by larsen (Parson) on Jun 16, 2002 at 10:27 UTC
    If your goal is to remove duplicates from a table in an RDBMS, I don't think you need Perl. Look at the SELECT syntax in the MySQL documentation, where you'll find the useful DISTINCT option.

    So you can populate a new table selecting distinct rows from an older one.
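    Should you want to drive this from Perl anyway, here is a minimal sketch. It assumes a connected DBI-style handle; the helper copy_distinct and the table names customers and customers_dedup are hypothetical, and the CREATE TABLE ... AS SELECT DISTINCT form should be checked against your own database's documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the distinct rows of $old into a new table $new.
# $dbh is assumed to be a connected DBI database handle (or any object
# with a compatible do() method).
sub copy_distinct {
    my ($dbh, $old, $new) = @_;

    # Table names cannot be bound as placeholders, so validate them first.
    for my $name ($old, $new) {
        die "Unsafe table name: $name\n" unless $name =~ /\A\w+\z/;
    }

    my $sql = "CREATE TABLE $new AS SELECT DISTINCT * FROM $old";
    $dbh->do($sql);
    return $sql;
}

# Usage (assumes DBI and a suitable driver are installed):
#   use DBI;
#   my $dbh = DBI->connect($dsn, $user, $password, { RaiseError => 1 });
#   copy_distinct($dbh, 'customers', 'customers_dedup');
```

    Note that DISTINCT only collapses rows that are identical in every column; rows that differ in any field are all kept.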

    You can find an example of this technique in the documentation of DBSchema::Normalizer, from our gmax (I suggest visiting his homenode for further interesting DB-related articles).

Re: DeDuplication
by Corion (Patriarch) on Jun 16, 2002 at 08:59 UTC

    While Perl has many cases of DWIM (Do What I Mean), there is no deduplicate function in either the core (see perldoc -f deduplicate) or the DBI module (see the documentation there). So it would help if you were more specific about what the exact problem is, what methods are available to you, and what the intended action and result are.

    For example, your situation might be that you have a column in your database that should be unique; let's call this column the customer number. Because the database was never told to enforce this, there are now several customers with the same customer number. There are also several rows for the same customer with the same customer number.

    What approach would be correct in your business logic to resolve these problems?
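    Before deciding, it may help to see which customer numbers are affected. A minimal sketch, assuming the rows have already been fetched into an array of hash references; the helper find_duplicates, the field names customer_number and name, and the sample data are all hypothetical.

```perl
use strict;
use warnings;

# Group rows by customer number and keep only the numbers that occur
# more than once. Rows are assumed to be hash references.
sub find_duplicates {
    my @rows = @_;
    my %by_number;
    push @{ $by_number{ $_->{customer_number} } }, $_ for @rows;
    return map  { $_ => $by_number{$_} }
           grep { @{ $by_number{$_} } > 1 }
           keys %by_number;
}

# Made-up sample data covering both situations described above.
my @rows = (
    { customer_number => 1001, name => 'Alice' },
    { customer_number => 1002, name => 'Bob'   },
    { customer_number => 1001, name => 'Carol' },   # different customer, same number
    { customer_number => 1002, name => 'Bob'   },   # same customer listed twice
);

my %dupes = find_duplicates(@rows);
for my $number (sort keys %dupes) {
    my @names = map { $_->{name} } @{ $dupes{$number} };
    print "$number: @names\n";
}
```

    The two cases need different treatment: a row that is simply repeated can be dropped, while two different customers sharing one number require a decision that only the business rules can make.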

    There is also a very good meta-hint I can give you: How (Not) To Ask A Question.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
Re: DeDuplication
by Beatnik (Parson) on Jun 16, 2002 at 08:58 UTC
    What type of database? Which records do you want removed: those with a duplicate record key, those that match on any field, etc.? Anyway, here's an example that deduplicates array elements:
    my %unique;
    foreach my $item (@array) { $unique{$item}++ }   # count each distinct element
    my @unique = keys %unique;                       # distinct elements, in no particular order
    A simple database can be done using an array, of course :)
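    A variation on the same idea, sketched under the assumption that the "database" is an array of "id:name" strings: it keeps the first record for each key and preserves the original order. The helper dedupe_by and the sample records are hypothetical.

```perl
use strict;
use warnings;

# Keep the first record seen for each key, preserving the original order.
# $key_of is a code reference that extracts the key from a record.
sub dedupe_by {
    my ($key_of, @records) = @_;
    my %seen;
    return grep { !$seen{ $key_of->($_) }++ } @records;
}

# A "simple database" held in an array: each record is "id:name".
my @db = ('1:alice', '2:bob', '1:alice', '3:carol', '2:robert');

my @by_record = dedupe_by(sub { $_[0] }, @db);                  # whole record must match
my @by_id     = dedupe_by(sub { (split /:/, $_[0])[0] }, @db);  # id alone must match

print "by record: @by_record\n";
print "by id:     @by_id\n";
```

    Matching on the whole record keeps '2:robert', while matching on the id alone drops it, which is why the question of key-based versus field-based matching matters.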

    Greetz
    Beatnik
    ... Quidquid perl dictum sit, altum viditur.