anonymized user 468275 has asked for the wisdom of the Perl Monks concerning the following question:
Then in Perl, I check for the duplicate before insertion thus:
    $incname =~ s/\'/\'\'/g;    # change syntax for embedded quotes in Postgres
    $sql = sprintf(
        'select count(le_id) '
      . 'from legal_entity '
      . 'where le_corid = 586 '    # this routine is for data from a particular registrar
      . 'and le_lesid = %s '
      . 'and le_inc_name = %s',
        $lesid, "'" . $incname . "'"
    );
    $sth = $dbh->prepare($sql);
    $sth->execute;
    my ($count_le) = $sth->fetchrow_array;
    if ($count_le) {
        print "$. - existing incname\n";
        next LINE;
    }

This works for most cases, including where there are apostrophes in the business name. But the first time it goes wrong is when the business name is: "1-6 CHAPMANΓÇÖS END MANAGEMENT COMPANY LIMITED" (note this data is from Companies House so is the official business name, so I can't modify it to make my life easier. Basically they will type in any old rubbish when registering a business and it's cast in tablets of stone).
The business name was inserted in an earlier run of the Perl script, but the above select does not match the string already in the database. Control therefore passes beyond the above code to the point where the database rejects the duplicate business name, and I have to crash and roll back.
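For comparison, here is a minimal sketch of the same duplicate check written with DBI bind parameters instead of `sprintf` and hand-doubled apostrophes. The helper name `incname_exists` is hypothetical, `$dbh` is assumed to be an already connected DBI handle, and the table and column names are simply copied from the query above. It is not claimed to cure the mismatch, only to take the manual quoting out of the picture.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if a matching row already exists, else 0.
# $dbh is assumed to be a connected DBI database handle.
sub incname_exists {
    my ($dbh, $lesid, $incname) = @_;

    # The ? placeholders hand the values to the driver, which does the
    # quoting itself, so apostrophes in $incname need no special treatment.
    my ($count) = $dbh->selectrow_array(
        'select count(le_id) from legal_entity '
      . 'where le_corid = 586 and le_lesid = ? and le_inc_name = ?',
        undef,               # no statement attributes
        $lesid, $incname,    # bind values, in placeholder order
    );

    return $count ? 1 : 0;
}
```

Inside the read loop this would be called as `next LINE if incname_exists($dbh, $lesid, $incname);`. If the driver is DBD::Pg, its `pg_enable_utf8` attribute and the encoding layer used to read the input file are probably worth checking as well, since they decide how non-ASCII characters travel to and from the database.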
I can't simply handle the database error, because I will need to match such business names when I write more code. So now is the time to figure out how to ensure I can match strings containing alphabetic diacritics. Clearly, when there are alphabetic diacritics in the string, Postgres has stored it differently from how it matches it in a select (the column is actually character varying of length 256 in Postgres).
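One thing that can be checked without touching the database is what the sequence ΓÇÖ actually is. The sketch below, using only the core Encode module, shows that the three UTF-8 bytes of U+2019 (the curly apostrophe) come out as exactly those three characters when they are read as code page 437. That the name was displayed through CP437 is an assumption about my setup, not something verified here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# U+2019 RIGHT SINGLE QUOTATION MARK, the "curly" apostrophe.
my $curly = "\x{2019}";

# Its UTF-8 encoding is the three bytes E2 80 99.
my $utf8_bytes = encode('UTF-8', $curly);

# Assumption: the tool that displayed the name read those bytes as CP437.
# Each byte then becomes a separate character.
my $misread = decode('cp437', $utf8_bytes);

binmode STDOUT, ':encoding(UTF-8)';
printf "bytes:   %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8_bytes;
print  "misread: $misread\n";
```

If that assumption holds, the name contains a typographic apostrophe rather than accented letters, so the `s/\'/\'\'/g` substitution never touches it, and the mismatch would lie in how the bytes are encoded on the way into and out of Postgres.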
Any suggestions of how to match such strings after insertion (when they are exactly what I inserted!) will be most welcome!!!
One world, one people