in reply to classifying data

If you're just trying to come up with a numeric/non-numeric classification, one regex *should* be okay for most cases.
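my $value;
if ($thing =~ m/ ^
        (\-|\+)?          # optional sign: $1
        (\$)?             # optional dollar sign: $2
        (   \d+           # at least one digit
            (,\d\d\d)*    # zero or more comma groups
            (\.\d*)?      # optional fractional part
        |   (\.\d+)       # only a fractional part
        )                 # the whole mantissa: $3
    $ /x) {
    my $sign = defined $1 ? $1 : '';   # avoid an undef warning when no sign is present
    (my $mantissa = $3) =~ tr/,//d;    # strip commas, or numification stops at the first one
    $value = 0 + "$sign$mantissa";
}
print "numeric! value = $value\n" if defined $value;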
I haven't tested this, but it should cover all the basic cases except scientific notation. It assumes commas are thousands separators and the decimal point is the fraction separator. You might want to be lenient about leading and trailing spaces, or dollar-before-sign ($-34.00) cases.
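A minimal sketch of that more lenient variant, for illustration only. The name parse_amount is hypothetical (not from the post above), and the choice to accept both "-$34.00" and "$-34.00" is an assumption about what "lenient" should mean:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the numeric value of $thing,
# or undef if it does not look like a number.
sub parse_amount {
    my ($thing) = @_;
    return undef unless defined $thing;
    return undef unless $thing =~ m/ ^ \s*       # tolerate leading spaces
        (?: ([-+])? \$?                   # sign (group 1), then optional dollar
        |   \$ ([-+])                     # or dollar, then sign (group 2)
        )
        (   \d+ (?:,\d\d\d)* (?:\.\d*)?   # digits, comma groups, optional fraction
        |   \.\d+                         # only a fractional part
        )                                 # the whole mantissa (group 3)
        \s* $ /x;                         # tolerate trailing spaces
    my $sign = defined $1 ? $1 : defined $2 ? $2 : '';
    (my $mantissa = $3) =~ tr/,//d;       # drop thousands separators before numifying
    return 0 + "$sign$mantissa";
}

for my $thing ('  $-34.00 ', '-$1,234.50', '.5', '12abc') {
    my $value = parse_amount($thing);
    print defined $value ? "numeric! value = $value\n" : "not numeric: '$thing'\n";
}
```

The non-capturing (?: ... ) groups keep the numbering short, so only the sign and the mantissa need to be pulled out afterwards.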

--
[ e d @ h a l l e y . c c ]

Re: Re: classifying data
by rir (Vicar) on Jan 19, 2004 at 20:21 UTC
    This seems to accept 123456,123.00.

    Be well.

      Easily fixed, just give the first digit group an explicit count:
      \d{1,3} # one to three leading digits


      --
      Spring: Forces, Coiled Again!
      Yes, \d+(,\d{3})* will accept "123456,123.00". Perl itself also accepts 123456_123.00 as a single numeric literal. If you don't want to be that accepting, you may have to deal with more than two choices in the key alternation: the suggested expression \d{1,3}(,\d{3})* would reject "123456123.00", since it lacks commas.
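One way the three-way alternation might look, as an untested-in-production sketch. The name is_strict_number is hypothetical, and the rule it encodes (commas are optional, but if present they must group correctly) is an assumption about what "not so accepting" should mean:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $thing is a number whose commas,
# if any, split the integer part into proper groups of three; else 0.
sub is_strict_number {
    my ($thing) = @_;
    return $thing =~ m/ ^
        [-+]? \$?                                # optional sign, optional dollar
        (?: \d{1,3} (?:,\d{3})+ (?:\.\d*)?       # comma-grouped digits, optional fraction
        |   \d+ (?:\.\d*)?                       # plain digits, optional fraction
        |   \.\d+                                # only a fractional part
        )
        $ /x ? 1 : 0;
}

for my $candidate ('123456,123.00', '123,456,123.00', '123456123.00', '.5') {
    printf "%-16s %s\n", $candidate,
        is_strict_number($candidate) ? 'accepted' : 'rejected';
}
```

With this split, "123456,123.00" is rejected because neither the comma-grouped branch nor the plain-digits branch can consume it, while "123456123.00" still passes through the plain-digits branch.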

      --
      [ e d @ h a l l e y . c c ]