in reply to Re^2: redundancy Checker
in thread redundancy Checker

When the second (redundant) record is entered, you should notice that there is already an entry in the database for it.

At that point you either throw away the redundant data or replace/edit the existing database entry.
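
One way to do that is to ask the database, before each insert, whether the row is already there. A minimal sketch using DBI follows; the connection string, the "items" table, and the "data1" column are only placeholders for whatever your schema actually uses.

    use strict;
    use warnings;
    use DBI;

    # Placeholder connection and schema, for illustration only
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=example.db', '', '',
        { RaiseError => 1, AutoCommit => 1 } );

    sub insert_unless_present {
        my ($value) = @_;

        # Is this value already in the table?
        my ($count) = $dbh->selectrow_array(
            'SELECT COUNT(*) FROM items WHERE data1 = ?', undef, $value );

        if ($count) {
            warn "Skipping redundant entry: $value\n";
            return;
        }

        $dbh->do( 'INSERT INTO items (data1) VALUES (?)', undef, $value );
    }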

Perhaps you need to show us the sort of code you have currently and explain where the problem is?


Perl is Huffman encoded by design.

Re^4: redundancy Checker
by Ovid (Cardinal) on Jul 27, 2005 at 03:01 UTC

    You still have to define "redundant". If you properly normalize addresses (something almost no one does), then each street should have one and only one entry in the database. However, five guys with the first name of "John" should probably not have that abstracted away into a single entry. Just because the data looks the same does not mean that it's the same thing.
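
    If the rule really is "one and only one entry", you can also let the database enforce it for you with a UNIQUE constraint instead of checking in application code. A minimal sketch, assuming SQLite and a hypothetical addresses table:

        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect( 'dbi:SQLite:dbname=example.db', '', '',
            { RaiseError => 1 } );

        # The UNIQUE constraint makes the database itself reject redundant rows
        $dbh->do(q{
            CREATE TABLE IF NOT EXISTS addresses (
                street TEXT NOT NULL UNIQUE
            )
        });

        # The second insert of the same street fails; trap it with eval
        for my $street ( '12 Elm St', '12 Elm St' ) {
            eval {
                $dbh->do( 'INSERT INTO addresses (street) VALUES (?)',
                    undef, $street );
            };
            warn "Duplicate rejected: $street\n" if $@;
        }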

    Further, and this is a heresy that many database purists would be horrified by, there are times that DBAs will deliberately leave data denormalized for performance reasons (though this should not be done until you've gone down other avenues of correcting the problem).

    We may be able to be more specific if you can describe at a higher level the problem you're trying to solve.

    Cheers,
    Ovid


Re^4: redundancy Checker
by Anonymous Monk on Jul 27, 2005 at 03:04 UTC

    Well, I haven't coded anything yet for the redundancy checker part. I am still planning how best to do it. An array?

    I've done the data input part, but that's just a simple SQL insert, and all the data are placed in the database

    i.e.

    data1 | data2 | data3 | data4  | data5
    big   | small | large | medium | good
    extra | size  | bad   | small  | nice

      Ok, so give us a sample of the data, an indication of how much data there is, and the sort of redundancy checks you anticipate making.

      By the time you have done that you should have almost answered your own question, unless you enter the realms of normalising nasty data (see Ovid's comment) or you end up with a huge amount of data.

      A hash would be the natural data structure in which to store data that is supposed to have unique keys.
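
      A minimal sketch, assuming the joined-together row is what must be unique (the sample rows are taken from your post):

          use strict;
          use warnings;

          my %seen;
          my @rows = (
              [ 'big',   'small', 'large', 'medium', 'good' ],
              [ 'extra', 'size',  'bad',   'small',  'nice' ],
              [ 'big',   'small', 'large', 'medium', 'good' ],    # redundant
          );

          for my $row (@rows) {
              my $key = join '|', @$row;    # whole row as the uniqueness key
              next if $seen{$key}++;        # skip rows already seen
              print "Keeping: $key\n";
          }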


      Perl is Huffman encoded by design.