in reply to DBI returns NULL characters in string

Any time you encounter mystery characters in your data, you should do a hex dump so that you know exactly what those characters are (guessing can work, but can bite you later if you guess wrong). If you haven't done so, try replacing this:
@test = split (//,$something); foreach (@test) {print "$_\n";}
with:
for (split //, $something) { printf "%02X ", ord; }
Given the input "J\x00O\x00H\x00N\x00" this will print:
4A 00 4F 00 48 00 4E 00
With that said, as far as I can tell, if your string contains embedded NULLs, your browser should interpret them as whitespace, so that it appears that there is a single space where each NULL character would be expected to appear. This would render your example as: J O H N

If you're not seeing this behavior when interfacing with the CGI, but you are seeing it in the shell, you should probably do some testing to find out if the mystery characters show up in both environments. If not, then there's a problem with the environment in which your script is running (when running from the shell).

Replies are listed 'Best First'.
Re: Re: DBI returns NULL characters in string
by jcbyrne (Acolyte) on Apr 24, 2001 at 22:52 UTC
    I've verified that the mystery characters are indeed NULLS,
    and they only appear when running the script from the
    command line. Running it as a CGI script returns clean data.
    
    I've started looking into the differences in the
    environments, but so far have come up empty.
    
    Does anyone have any thoughts on what type of environment
    diffs could cause something like this?
    
    Thanks again.
    
    
      I just spotted this little tidbit in Programming the Perl DBI on page 299:

      The NVARCHAR2 and NCHAR variants hold string values of a defined national character set (Oracle 8 only). For those types the maximum number of characters stored may be lower when using multibyte character sets.

      ...

      Oracle automatically converts character data between the character set of the database defined when the database was created and the character set of the client, defined by the NLS_LANG parameter for the CHAR and VARCHAR2 types or the NLS_NCHAR parameter for the NCHAR and NVARCHAR2 types.

      CONVERT(string, dest_char_set, source_char_set) can be used to convert strings between character sets... See the "National Language Support" section of the Oracle Reference Manual for more details on character set issues.

      I have a hunch that the difference between the two environments has something to do with the NLS_LANG and/or NLS_NCHAR parameters. (Disclaimer: I am not a dba, so I could be barking up the wrong tree.)

        You are correct, it is the NLS_LANG parameter that caused my grief. I am working on a development/test server and my DBA recently deleted the directory that NLS_LANG pointed at, in order to prepare to load Oracle 9i. The CGI scripts apparently work OK since that web server was last started when the directory still existed.
        A thousand thank yous.