dwhite20899 has asked for the wisdom of the Perl Monks concerning the following question:

I need to convert an abbreviated number to an integer.

A quick look through CPAN and the Cookbook turned up nothing, but before I dive into this, can anyone recommend a module or a piece of code?

I have to convert "30k" to 30000 and "30kb" to (30 * 1024). "k", "g", "kb" and "gb" are all I need for now, case insensitive.

There's got to be a terrifying, elegant RE to do this...

Replies are listed 'Best First'.
Re: converting "30k" string to integer
by jdporter (Paladin) on May 15, 2003 at 14:21 UTC
    Perhaps not the most elegant, nor the most terrifying, but...
    s((\d+)([kmg]b?)){
        $1 * ({
            k => 1000,           kb => 1024,
            m => 1000*1000,      mb => 1024*1024,
            g => 1000*1000*1000, gb => 1024*1024*1024,
        })->{lc $2}
    }ie;
    Updated - For case insensitivity, I simply added an i modifier on the regex, and of course the lc on the $2.

    jdporter
    The 6th Rule of Perl Club is -- There is no Rule #6.

      99.44% pure!

      I modded it to have ([kKmMgG]b?B?) and {lc($2)} for case insensitivity and it's just what I needed. Thanks!

      If speed becomes an issue, I'll check out the other suggestions here - thanks to all!

        You might want to use ([kKmMgG][bB]?) instead of ([kKmMgG]b?B?). Your version will recognize "3gbB" as well as "3gB", although the first version is not a key in the hash. Its undef value will be converted to 0, so that you will have that string replaced with "0".
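
        A minimal sketch of the difference, assuming the multiplier table from jdporter's reply. The helper name convert is hypothetical, and the // operator needs Perl 5.10 or later; it only makes the undef-to-0 conversion explicit.

        ```perl
        use strict;
        use warnings;

        # Multiplier table as in jdporter's reply (lower-case keys only).
        my %mult = (
            k => 1000,    kb => 1024,
            m => 1000**2, mb => 1024**2,
            g => 1000**3, gb => 1024**3,
        );

        # Hypothetical helper: run the substitution with a given suffix pattern.
        sub convert {
            my ($text, $suffix_re) = @_;
            # A suffix that is not a hash key yields undef, treated here as 0.
            $text =~ s{(\d+)($suffix_re)}{$1 * ($mult{lc $2} // 0)}e;
            return $text;
        }

        print convert('3gB',  qr/[kKmMgG]b?B?/),  "\n";  # "gB" is lower-cased to the key "gb"
        print convert('3gbB', qr/[kKmMgG]b?B?/),  "\n";  # "gbB" is not a key, so the result is 0
        print convert('3gbB', qr/[kKmMgG][bB]?/), "\n";  # only "gb" is consumed, the "B" stays
        ```

        With the character class, the stray trailing "B" is left in the string instead of silently turning the whole value into 0.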
Re: converting "30k" string to integer
by davorg (Chancellor) on May 15, 2003 at 14:20 UTC

    Something like this perhaps:

    #!/usr/local/bin/perl

    use strict;
    use warnings;

    my %conv = (k => 1000, kb => 1024, g => 1000000, gb => 1024 * 1024);

    # Note that the keys are sorted to put the longer
    # matches first in the regex
    my $keys = join '|', sort { length $b <=> length $a } keys %conv;

    while (<DATA>) {
      chomp;    # drop the newline so input and result print on one line
      print "$_ -> ";
      s/(\d+)($keys)/$1 * $conv{$2}/e;
      print "$_\n";
    }

    __END__
    30k
    30kb
    10g
    10gb

    This gives:

    $ ./test.pl
    30k -> 30000
    30kb -> 30720
    10g -> 10000000
    10gb -> 10485760

    Update: Yes, I know that some of the multipliers are wrong. Correcting them is left as an exercise for the reader :)

    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: converting "30k" string to integer
by Mr. Muskrat (Canon) on May 15, 2003 at 14:20 UTC
Re: converting "30k" string to integer
by hardburn (Abbot) on May 15, 2003 at 14:22 UTC

    I'd store multipliers in a hash, which is keyed by the end of the string:

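    my %MULTIPLIERS = (
        k  => 1024,
        kb => 1024,
        g  => 1024 * 1024 * 1024,
        gb => 1024 * 1024 * 1024,
    );

    my $num = '30k'; # or 24g or 29kb, whatever
    my $key = '';
    # Strip the suffix from $num and remember it (lower-cased) as the hash key
    if ($num =~ s/\A (\d+) ([A-Za-z]{0,2}) \z/$1/x) {
        $key = lc $2;
    }
    # Unknown or missing suffixes fall back to a multiplier of 1
    my $mult = exists $MULTIPLIERS{$key} ? $MULTIPLIERS{$key} : 1;
    $num *= $mult;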

    Update: added lowercase key so that 'g' and 'G' evaluate to the same hash key.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: converting "30k" string to integer
by dragonchild (Archbishop) on May 15, 2003 at 14:19 UTC
    Try breaking up "30k" into "30" and "k" (using a regex), then converting "k" into "1000" (using a hash), then multiplying "30" and "1000" (using arithmetic).

    I've done the hard work (the design). The easy stuff (implementation details) is left as an exercise for the reader.
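
    One way the three steps could be sketched, assuming decimal multipliers for a bare letter and binary ones for a trailing "b", as the original question asks. The names %multiplier and expand are hypothetical, and the choice to return undef for unrecognised input is an assumption of this sketch.

    ```perl
    use strict;
    use warnings;

    # Assumed table: "k"/"g" are decimal, "kb"/"gb" are binary.
    my %multiplier = (
        k => 1000,    kb => 1024,
        g => 1000**3, gb => 1024**3,
    );

    # Hypothetical helper; returns undef when the input is not recognised.
    sub expand {
        my ($str) = @_;

        # Step 1: break the string into digits and an optional suffix (regex).
        my ($num, $suffix) = $str =~ /\A(\d+)([a-z]*)\z/i or return;
        return $num if $suffix eq '';

        # Step 2: convert the suffix into a multiplier (hash lookup).
        my $m = $multiplier{lc $suffix};
        return unless defined $m;

        # Step 3: multiply (arithmetic).
        return $num * $m;
    }

    for my $in (qw(30k 30KB 10g 42 7x)) {
        my $out = expand($in);
        print "$in -> ", defined $out ? $out : 'unrecognised', "\n";
    }
    ```

    Keeping the lookup separate from the regex means adding "m" or "t" later is a one-line change to the hash.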

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: converting "30k" string to integer
by hv (Prior) on May 15, 2003 at 14:28 UTC

    I'd go for something like this (untested):

    my %abbr = (
      'k' => 1e3,
      'kb' => 1 << 10,
      'g' => 1e9,
      'gb' => 1 << 30,
    );
    my $match = join '|', keys %abbr;

    ... and later

    $text =~ s/(\d+)($match)\b/$1 * $abbr{lc $2}/gie;

    Note that since the pattern requires a word boundary following the abbreviation, it doesn't matter that the alternatives might try to match "k" before "kb". (Otherwise it would have needed a way to avoid that, eg by sorting the keys in reverse ASCII order.)
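
    A small sketch of that point, reusing the table above. The helper expand_with and the two alternation strings are hypothetical names introduced only for the comparison; the anchor is passed in as a string so the same substitution can be run with and without \b.

    ```perl
    use strict;
    use warnings;

    my %abbr = (k => 1e3, kb => 1 << 10, g => 1e9, gb => 1 << 30);

    # Alternation tries branches left to right, so put the short keys first
    # on purpose to show the worst case, then the long keys first.
    my $short_first = join '|', sort { length $a <=> length $b } keys %abbr;
    my $long_first  = join '|', sort { length $b <=> length $a } keys %abbr;

    # Hypothetical helper: same substitution, with a configurable trailing anchor.
    sub expand_with {
        my ($text, $alternation, $anchor) = @_;
        $text =~ s{(\d+)($alternation)$anchor}{$1 * $abbr{lc $2}}gie;
        return $text;
    }

    print expand_with('30kb', $short_first, ''),   "\n";  # "k" wins, the "b" is left behind
    print expand_with('30kb', $short_first, '\b'), "\n";  # \b forces backtracking to "kb"
    print expand_with('30kb', $long_first,  ''),   "\n";  # ordering alone also works
    ```

    Either the word boundary or the longest-first ordering is enough on its own; using both is harmless.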

    If you need to cope with more than just integers, it'd probably be worth looking in Regexp::Common for a way to match the numbers you want.

    Hugo
Re: converting "30k" string to integer
by CombatSquirrel (Hermit) on May 15, 2003 at 14:57 UTC
    My attempt uses the fact that "gb" means "1024 ** 3", whereas "g" means "1000 ** 3", so there is just a difference in the bases, whereas the exponents are the same. I store the single-letter exponent identifiers in a hash as keys with their corresponding exponents as values, do the same with the base identifiers ("" for decimal and "b" for binary), and then do the rest with a regex:
    #!perl -ws

    my %exponents = ( 'k' => 1, 'm' => 2, 'g' => 3, 't' => 4 );
    my %bases = ( '' => 1000, 'b' => 1024 );

    while (<DATA>) {
       chomp;
       my $old = $_;
       s<(\d+)([a-zA-Z])([a-zA-Z]?)\b>
        < (defined $exponents{lc($2)} and defined $bases{lc($3)}) ?  # suffix recognized ?
            $1 * ($bases{lc($3)} ** $exponents{lc($2)})              # substitute if yes
          : "$1$2$3"                                                 # otherwise just leave it
        >eg;
       print "`$old' became `$_'\n";
    }

    __DATA__
    My dog makes $30k a year.
    I own a 20gb hard drive.
    The budget deficite is $10g.
    18h is 24d.
    The output is as expected (the last data line does not change).
Re: converting "30k" string to integer
by robobunny (Friar) on May 15, 2003 at 18:48 UTC
    when i read your title, i first thought you meant that you had a 30 kilobyte string that you wanted to convert to an integer. i don't think perl supports 100000 bit math...
Re: converting "30k" string to integer
by DrHyde (Prior) on May 16, 2003 at 16:01 UTC
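    #!/usr/bin/perl -w

    use strict;

    # Parentheses are needed so that 'kmgt' is part of the assigned list
    my($amt, $units, $availableunits) = (@ARGV[0,1], 'kmgt');

    # lc $1 so that an upper-case unit is still found by index()
    $units =~ s/^([$availableunits])(b)?$/
        (1000*($2?1.024:1))**index("X$availableunits",lc $1)
      /exi;

    print $amt * $units;
    print "\n";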
    Give two arguments on the command line, the first being a number, the second being k, m, g or t, with an optional trailing b.