in reply to Re: Equality checking for strings AND numbers
in thread Equality checking for strings AND numbers

Thanks for the warning - all numerical values will be base10, sometimes in scientific format, so the looks_like_number call should work in this case.

So, looks_like_number only works for base10 (and below) numbers, i.e. hexadecimal values with or without a leading 0x will return false?

Although, of course not knowing the number base for numerical values will cause all kinds of other problems! ;)
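For what it's worth, a quick check along these lines suggests the answer is yes: looks_like_number from Scalar::Util accepts decimal and scientific-notation strings but rejects hex strings, with or without the 0x prefix. The sample values below are only illustrative, and %is_num is a name made up for this sketch.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Illustrative strings: decimal, scientific notation, hex with and
# without a 0x prefix, and plain text.
my @samples = ('42', '-0.003', '1.1e3', '0x1A', '1A', 'PASS');

# Record whether each string is accepted as a number.
my %is_num = map { $_ => (looks_like_number($_) ? 1 : 0) } @samples;

printf "%-8s %s\n", $_, ($is_num{$_} ? 'number' : 'not a number')
    for @samples;
```

If hex input ever did turn up, it would presumably have to be converted (for example with the built-in hex function) before any numeric comparison.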

Re^3: Equality checking for strings AND numbers
by rpanman (Scribe) on Jul 13, 2007 at 07:58 UTC
    On another note you could use Algorithm::Diff which would allow you to provide your own matching (or "key generation") function as they call it. This gets over the deficiencies of Text::Diff in only comparing text strings.
Re^3: Equality checking for strings AND numbers
by rpanman (Scribe) on Jul 13, 2007 at 06:32 UTC
    Looking at the Text::Diff module, I noticed the following:
    my $diff = diff \&reader1,\&reader2;
    I assume that this means you can use a subroutine to return the column you need from the input files and then just use Text::Diff to compare.

    Do you have some sample input files? What sort of output are you expecting to be generated (a list of the differences, printed to screen, etc.) and what should the format of this output be?
    Updated: Questions added
      Here is some sample input.
      File1
      -----
      X  Y   Category1  Category2  Value1  Value2  Value3  Result
      2  -9  1.0        2.0        1.1e3   1.234   -0.003  PASS

      File2
      -----
      X  Y   Category2  Category1  Value3  Value1  Value2  Result
      2  -9  1          1          -0.003  1.1e3   1.2345  FAIL

      The main points to note are:
      1. The columns are not necessarily in the same order (the main reason I started this in the first place).
      2. Data is a mixture of strings and numbers of differing precisions (but all base10).

      To take account of the differences in column order, I rebuild the data using a hash keyed by the XY coordinate (first 2 columns) and then by the actual column name, e.g.
      ...other code omitted...
      $data1{"$x,$y"}{$colnames[$colnum]} = $linedata[$colnum];
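As a rough sketch of how that hash could be filled, assuming whitespace-separated files with the column names on the first line and X and Y always in the first two columns: load_table is a hypothetical helper name, not something from the omitted code, and the input is the File1 sample above.

```perl
use strict;
use warnings;

# Hypothetical helper: parse whitespace-separated text (header line first)
# into a hash keyed by "X,Y" and then by column name.
sub load_table {
    my ($text) = @_;
    my @lines    = grep { /\S/ } split /\n/, $text;
    my @colnames = split ' ', shift @lines;
    my %data;
    for my $line (@lines) {
        my @linedata = split ' ', $line;
        my ($x, $y) = @linedata[0, 1];
        # Columns 0 and 1 (X and Y) form the key, so start at column 2.
        for my $colnum (2 .. $#colnames) {
            $data{"$x,$y"}{ $colnames[$colnum] } = $linedata[$colnum];
        }
    }
    return \%data;
}

my $data1 = load_table(<<'END');
X Y Category1 Category2 Value1 Value2 Value3 Result
2 -9 1.0 2.0 1.1e3 1.234 -0.003 PASS
END

print "$_ => $data1->{'2,-9'}{$_}\n" for sort keys %{ $data1->{'2,-9'} };
```

Because each value is stored under its column name, the order of the columns in either file no longer matters once both files are loaded this way.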


      Output would be something like:
      X=2 Y=-9 Category2
      X=2 Y=-9 Value2
      X=2 Y=-9 Result


      i.e. a list of the column names for which the data did not match between the two files, allowing for both strings and numerical values, with numbers compared to a certain precision (using the $eps approach detailed elsewhere in this thread).
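A minimal sketch of that comparison, under a few assumptions: the $eps test is taken to be abs($a - $b) <= $eps, the tolerance value is made up, values_match is a hypothetical helper name, and the trailing FAIL in File2 is assumed to belong to a Result column. The two hashes are filled in by hand from the sample row rather than read from files.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Assumed tolerance; the value used elsewhere in the thread is not shown here.
my $eps = 1e-6;

# Hypothetical helper: numeric comparison within $eps when both values
# look like numbers, plain string comparison otherwise.
sub values_match {
    my ($v1, $v2) = @_;
    return 0 unless defined $v1 && defined $v2;
    if (looks_like_number($v1) && looks_like_number($v2)) {
        return abs($v1 - $v2) <= $eps ? 1 : 0;
    }
    return $v1 eq $v2 ? 1 : 0;
}

# Hand-built records for the sample row, keyed by "X,Y" then column name.
my %data1 = ('2,-9' => { Category1 => '1.0',   Category2 => '2.0',
                         Value1    => '1.1e3', Value2    => '1.234',
                         Value3    => '-0.003', Result   => 'PASS' });
my %data2 = ('2,-9' => { Category1 => '1',     Category2 => '1',
                         Value1    => '1.1e3', Value2    => '1.2345',
                         Value3    => '-0.003', Result   => 'FAIL' });

# Collect one line per column whose values differ between the two files.
my @mismatches;
for my $key (sort keys %data1) {
    my ($x, $y) = split /,/, $key;
    for my $col (sort keys %{ $data1{$key} }) {
        push @mismatches, "X=$x Y=$y $col"
            unless values_match($data1{$key}{$col}, $data2{$key}{$col});
    }
}
print "$_\n" for @mismatches;
```

With this tolerance the sample row reports Category2, Result and Value2 (in sorted column order), while Category1 (1.0 against 1) counts as a match because both sides are compared numerically.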