Ovid has asked for the wisdom of the Perl Monks concerning the following question:

Also posted to my use.perl journal and the Perl-QA list.

There are many times when we test string data but whitespace issues give us false negatives. For example, consider the following two SQL snippets:

CREATE TABLE foo ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(32) NOT NULL, age INTEGER );

And:

create table foo ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(32) NOT NULL, age integer );

Those are functionally equivalent but there are a few potential problems with testing it. I can get a reasonably easy to debug test if I do the following:

I can use Data::Record to easily split the string on spaces without affecting quoted characters:

use Regexp::Common; use Data::Record; my $record = Data::Record->new({ split => qr/\s\t/, unless => $RE{quoted}, }); my $data = join ' ', map { lc $_ } $record->records($data);

That doesn't seem flexible enough, though. It would be useful to wrap this in a test module whereby one can control whether or not one wants to alter the case, ignore quoted data, preseve newlines, and so on. I could make a plethora of test functions, but the number of possible combinations would make them unweildy. I could have the user set the parameters at the top of the test and change the parameters as needed. This would make this far more flexible than simply an SQL tester. Thoughts?

I'm thinking a name like Test::ControlWhitespace.

Cheers,
Ovid

New address of my CGI Course.

Replies are listed 'Best First'.
Re: Whitespace issues in testing
by dragonchild (Archbishop) on Feb 28, 2006 at 01:42 UTC
    You really need a domain parser. For example, your SQL looks like:
    create table foo ( id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(32) NOT NULL, age integer );

    My version would look like:

    create table foo ( id INTEGER NOT NULL PRIMARY KEY ,name VARCHAR(32) NOT NULL ,age integer );
    What you really want is Test::SQL, not Test::ControlWhitespace.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

      Some people would argue that defining a primary key to be 'NOT NULL' is redundant as well, so a statement without it should be considered the same. (I'm not sure if that's a function of the database, though, which might introduce extra complexity).

      PS. back when I took Oracle's SQL Tuning class (back in 2001 or so), we were told that the caching only worked for exact string matches -- any whitespace or capitalization changes would be considered a different query to the optimizer.

Re: Whitespace issues in testing
by xdg (Monsignor) on Feb 28, 2006 at 02:43 UTC

    I agree with dragonchild that this is very domain specific.

    Moreover, I wonder whether there's really a sufficiently general test-module here given all the combinations. In most cases, I suspect that a short user function at the top of the test file or in a fixture file would be clear and avoid adding an extra dependency.

    Alternatively, instead of writing a new module, how about adding these as options to Test::Differences? Or perhaps adding a generalized callback to filter input before making the comparison?

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      Ooh, the callback filter is a great idea. I think I should write up a patch and submit it. That's much better. Thanks!

      Cheers,
      Ovid

      New address of my CGI Course.

      I too vote for the callback filters. I put these into Test::Files. If you use the feature, your filter gets every line right before Algorithm::Diff sees it. It works great. I originally needed it to strip dates out of the machine generated files, since the files were made at different times and I only cared about the content not the time difference.

      Phil