Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I have some data in the form:

dog|pup|walk
dog|adult|feed
dog|mid|rabies
cat|kit|climb
cat|old|declaw
cat|mid|play
etc.
I am trying to load this into a database, but the first field must be unique - the records come in groups of three. How do I parse through and keep only one record per first field...?

So i would get

dog|pup|walk
cat|kit|climb
I want the first field to be unique for each record...

I hope that makes sense..

Thanks,

VVP

Edit by tye

Replies are listed 'Best First'.
Re: VVP asks Unique field data?
by particle (Vicar) on Feb 07, 2002 at 21:30 UTC
    what have you tried so far?

    and which of the records with duplicate first fields do you want to insert into the database? i don't quite understand why you're doing this, perhaps you need to rethink your database structure. as it stands, there are a few options.

    i can help you with the pseudocode:

    == as always, use strict and -w
    == of course, use DBI for the database interface
    == create an empty hash to store unique keys
    == create an empty array for data to process
    == for each data line:
    == | use split with '|' on line, store values in temp array
    == | is first value from the array (temp_array[0]) in unique hash?
    == | | yes : dump temp array and move on
    == | | no : add first value to unique hash, add array to data_to_process array
    == for each array in data_to_process array
    == | add to database

    this assumes you want to keep the first line with a repeated first field. if you want to keep the last line, the logic will have to change appropriately.
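    fleshed out, that pseudocode might look like this in perl -- a sketch only: the database insert is left as a comment since your DBI connection details will differ, and the sample records are taken from the question:

```perl
#!/usr/bin/perl -w
use strict;

my %seen;          # unique first fields seen so far
my @to_process;    # records that survive the filter

while ( my $line = <DATA> ) {
    chomp $line;
    my @fields = split /\|/, $line;    # '|' is a regex metacharacter, so escape it
    next if $seen{ $fields[0] }++;     # first field already seen? dump it and move on
    push @to_process, \@fields;
}

for my $rec (@to_process) {
    # with DBI this would be something like $sth->execute(@$rec);
    print join( '|', @$rec ), "\n";
}

__DATA__
dog|pup|walk
dog|adult|feed
dog|mid|rabies
cat|kit|climb
cat|old|declaw
cat|mid|play
```

    run on the sample data, this prints only dog|pup|walk and cat|kit|climb.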

    ~Particle

Re: VVP asks Unique field data?
by thor (Priest) on Feb 08, 2002 at 02:24 UTC
    There are a couple of ways to do this. If you want to disallow duplicates of the first field in general, then you should create a unique index on that column. This will prevent the following situation:

    script invocation 1: insert dog|brown|fuzzy and cat|white|evil
    script invocation 2: insert dog|grey|lucky and cat|grey|still evil

    If you try that with the unique index in place, the second invocation will fail with an error similar to "attempt to insert duplicate key row".
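    For instance, with DBI against a scratch in-memory SQLite database (the table and column names here are invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# throwaway in-memory database; any DBD that supports unique indexes behaves the same way
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1, PrintError => 0 } );

$dbh->do("CREATE TABLE animals (species TEXT, age TEXT, action TEXT)");
$dbh->do("CREATE UNIQUE INDEX animals_species ON animals (species)");

$dbh->do("INSERT INTO animals VALUES ('dog', 'pup', 'walk')");    # fine
my $ok = eval {
    $dbh->do("INSERT INTO animals VALUES ('dog', 'adult', 'feed')");    # duplicate key
};
print "second insert rejected: $@" unless $ok;
```

    The second insert dies inside the eval, so only the first dog row makes it into the table.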
      That is what I am trying to do - load a file with duplicates in the first field... I only want to load the first of the 6 records that are duplicated, then move on to the next group. The data looks like this:

      CH CO N 303
      CH CO Y 303
      CH LA N 303
      CH LA Y 303
      CH OT N 303
      CH OT Y 303
      CHAA CO N 303
      CHAA CO Y 303
      CHAA LA N 303
      CHAA LA Y 303
      CHAA OT N 303
      CHAA OT Y 303
      CHAB CO N 303
      CHAB CO Y 303
      CHAB LA N 303
      CHAB LA Y 303
      CHAB OT N 303
      CHAB OT Y 303

      And it goes on for 6000 records... any help? Thanks,

      He who laughs last, doesn't get the joke.
        Hmmm... that is a bit trickier, as there is no easy way to do that with SQL alone (you could write a trigger, but that is left as an exercise for the reader). What I might do in this situation is create a hash and keep track of the number of times the first field has been seen. Some code (assuming |-delimited fields):
        #!/usr/bin/perl -w
        use strict;

        open(FILE, "myfile") or die "Couldn't open myfile: $!";
        my %first_record_hash;
        while (<FILE>) {
            chomp;
            my @array = split /\|/;    # '|' is a regex metacharacter, so it must be escaped
            unless ($first_record_hash{$array[0]}++) {
                # first time we've seen this key: insert record into database here
            }
        }
        close FILE;
        I think that'll do the trick for you.