hesco has asked for the wisdom of the Perl Monks concerning the following question:

This code:

$sth_mlg_addr_duplicate_check->execute($fields[17],$fields[18],$fields +[19],$fields[20],$fields[21],$fields[22],$fields[23]) or warn "While checking for duplicate mailing address $fields[17 +]; $fields[19] $fields[20]: " . $dbh->errstr; ($mlg_addr_id) = $sth_mlg_addr_duplicate_check->fetchrow_array(); my $mlg_id; if(!defined($mlg_addr_id) && length($fields[20]) == 2){ $sth_insert_mlg_addr->execute($fields[17],$fields[18],$fields[19 +],$fields[20],$fields[21],$fields[22],$fields[23]) or warn "While inserting new mailing address |$fields[17]|; |$ +fields[19]| |$fields[20]|, with id |$mlg_addr_id|: " . $dbh->errstr; $sth_mlg_addr_duplicate_check->execute($fields[17],$fields[18],$ +fields[19],$fields[20],$fields[21],$fields[22],$fields[23]) or warn "While checking for duplicate mailing address |$fields +[17]|; |$fields[19]| |$fields[20]|: " . $dbh->errstr; ($mlg_addr_id) = $sth_mlg_addr_duplicate_check->fetchrow_array() +; $mlg_id = $mlg_addr_id; } else { $mlg_id = $mlg_addr_id; # print STDERR "\$mlg_addr_id is already defined as $mlg_addr_id +.\n"; }
is throwing this error:

Use of uninitialized value in concatenation (.) or string at parse_vot +er_list.pl line 251, <$fh> line 3483. While inserting new mailing address |xxxx RUE LOUIS PHILLIPPE|; |MARRE +RO| |LA|, with id ||: ERROR: duplicate key violates unique constrain +t "mlg_addr_uniq_key" While checking for duplicate mailing address |xxxx RUE LOUIS PHILLIPPE +|; |MARRERO| |LA|: ERROR: current transaction is aborted, commands i +gnored until end of transaction block While checking for duplicate residential address: ERROR: current tran +saction is aborted, commands ignored until end of transaction block Issuing rollback() for database handle being DESTROY'd without explici +t disconnect().
even though, after having done so, and from a psql prompt I'm getting:

=# SELECT * FROM mlg_addr WHERE address1 LIKE '%RUE LOUIS PHILLIPPE' A +ND city = 'MARRERO' AND state = 'LA'; mlg_addr_id | address1 | address2 | city | state | zipcode5 | zipcode +4 | country -------------+----------+----------+------+-------+----------+-------- +--+--------- (0 rows)
For the gruesome details:

my $mlg_addr_duplicate_check =<<EOQ; SELECT mlg_addr_id FROM mlg_addr WHERE address1 = ? AND address2 = ? AND city = ? AND state = ? AND zipcode5 = ? AND zipcode4 = ? AND country = ? EOQ my $sth_mlg_addr_duplicate_check = $dbh->prepare($mlg_addr_duplicate_c +heck) or die $dbh->errstr; my $insert_mlg_addr =<<EOQ; INSERT INTO mlg_addr (address1,address2,city,state,zipcode5,zipcode4,country) VALUES(?,?,?,?,?,?,?); EOQ my $sth_insert_mlg_addr = $dbh->prepare($insert_mlg_addr) or die $dbh- +>errstr;
and:

=# \d mlg_addr Table "public.mlg_addr" Column | Type | Modifie +rs -------------+-----------------------+-------------------------------- +--------------------------- mlg_addr_id | integer | not null default nextval(('mlg_ +addr_id'::text)::regclass) address1 | character varying(65) | address2 | character varying(65) | city | character varying(40) | state | character varying(2) | zipcode5 | character varying(5) | zipcode4 | character varying(4) | country | character varying(45) | Indexes: "mlg_addr_id_key" UNIQUE, btree (mlg_addr_id) "mlg_addr_uniq_key" UNIQUE, btree (country, zipcode5, zipcode4, ad +dress1, address2)
The @fields array includes the following fields:

address1 = $fields[17] address2 = $fields[18] city = $fields[19] state = $fields[20] zipcode5 = $fields[21] zipcode4 = $fields[22] country = $fields[13]
What am I missing? This is after this script has already run for twenty minutes, importing nearly 27,000 records, including 17,000 some records into this mlg_addr table.

Why would this query fail when there obviously is no duplicate in that table?

-- Hugh

UPDATE:

In response to moritz's questions about the details on the schema and the queries and variables, I've added some of the gruesome details behind the readmore tags.

if( $lal && $lol ) { $life++; }

Replies are listed 'Best First'.
Re: duplicate key check isn't, why not?
by moritz (Cardinal) on Nov 23, 2008 at 15:52 UTC
    Why would this query fail when there obviously is no duplicate in that table?

    It doesn't fail on querying from the table, but on inserting. And because it fails, there are no duplicates in the table.

    I don't understand your code, it's full of variables that I haven't seen the definition of, neither do we know anything about the table, the queries or the constraints on the table, but if the database complains that you're trying to add a key that's already there, you can be pretty sure that that's what you're doing wrong.

Re: duplicate key check isn't, why not?
by oeuftete (Monk) on Nov 23, 2008 at 16:01 UTC

    What is the constraint defined by 'mlg_addr_uniq_key'? Is the key in your table 'mlg_addr_id'?

    It looks like you're trying to insert a row with an empty id (possibly that's what the uninitialized value warning was about), and probably you've run this same script before and successfully inserted a row with an empty id. (That an empty ID is allowed might be a separate problem.)

    Try something like this (or something along those lines):

    SELECT * FROM mlg_addr WHERE mlg_addr_id = ''

    and see if you get any hits. If you do, that's the problem.

    UPDATE: After seeing the "gruesome details", I'm way off base.

      Thanks for turning this one over for a moment.

      =# SELECT * FROM mlg_addr WHERE mlg_addr_id = ''; ERROR: invalid input syntax for integer: "" =# SELECT * FROM mlg_addr WHERE mlg_addr_id IS NULL; mlg_addr_id | address1 | address2 | city | state | zipcode5 | zipcode +4 | country -------------+----------+----------+------+-------+----------+-------- +--+--------- (0 rows)
      if( $lal && $lol ) { $life++; }
Re: duplicate key check isn't, why not?
by oeuftete (Monk) on Nov 23, 2008 at 16:22 UTC

    So here's the offending key:

    "mlg_addr_uniq_key" UNIQUE, btree (country, zipcode5, zipcode4, addres +s1, address2)

    The query you're using after the fact isn't checking against the key. You're looking at address1, city, and state... when address1 is the only field there that's part of the key. Try looking at just the fields that make up the key. I'd start by just looking for things that match or are like your address1, then paring down from that if necessary.

Re: duplicate key check isn't, why not?
by graff (Chancellor) on Nov 23, 2008 at 20:37 UTC
    I don't know whether you've fully solved your problem by now, but just in terms of understanding the error message, this part:
    Use of uninitialized value in concatenation (.) or string at parse_vot +er_list.pl line 251
    happened because of this logic in your code ("execute" call simplified for legibility):
    if(!defined($mlg_addr_id) && length($fields[20]) == 2){ $sth_insert_mlg_addr->execute(@fields[17..23]) or warn "While inserting new mailing address |$fields[17]|; |$ +fields[19]| |$fields[20]|, with id |$mlg_addr_id|: " . $dbh->errstr;
    Note how you are including $mlg_addr_id in your error message, specifically on condition that it happens to be undefined. What's the point of that, and how does the undefined variable relate to the "insert_mlg_addr" query being executed in this condition?

    I'm not sure I can comment coherently on the rest of your details, but if you haven't heard about "ON DUPLICATE KEY UPDATE ...", you might want to look that up, just in case it's relevant.

Re: duplicate key check isn't, why not?
by eric256 (Parson) on Nov 25, 2008 at 17:33 UTC

    First think I do with code I don't understand is reformat it ;) So just in case its usefull to you i've included the reformatted code. All I really did was use slices so the arrays are readable and then edited the error messages to look similar. Once done it was fairly obvious that the error was on line #14 and that its using $mlg_addr_id which you distinctly specified shouldn't be defined. BTW i don't you sequnce of duplicate checks at all, it might make since but its hard to tell from this small sample of code.

    $sth_mlg_addr_duplicate_check->execute(@fields[17,18,19,20,21,22,23]) or warn "While checking for duplicate mailing address " . "$fields[17]; $fields[19] $fields[20]: " . $dbh->errstr; ($mlg_addr_id) = $sth_mlg_addr_duplicate_check->fetchrow_array(); my $mlg_id; if(!defined($mlg_addr_id) && length($fields[20]) == 2) { #insert $sth_insert_mlg_addr->execute(@fields[17,18,19,20,21,22,23]) or warn "While inserting new mailing address " . "|$fields[17]|; |$fields[19]| |$fields[20]|, with id |$m +lg_addr_id|: " . $dbh->errstr; #check duplicate $sth_mlg_addr_duplicate_check->execute(@fields[17,18,19,20,21,22,23 +]) or warn "While checking for duplicate mailing address " . "|$fields[17]|; |$fields[19]| |$fields[20]|: " . $dbh->errstr; ($mlg_addr_id) = $sth_mlg_addr_duplicate_check->fetchrow_array(); $mlg_id = $mlg_addr_id; } else { $mlg_id = $mlg_addr_id; }


    ___________
    Eric Hodges