Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

One quick question... if:
$string =~ s/[^0-9+]//g;
strips everything but numbers out, will this:
$string =~ s/[^a-zA-Z0-9+]//g;
strip everything except numbers and letters? That is, will it remove characters like .=+/\?^*()@!$% and whitespace? Thanks

20090206 Janitored by Corion: Added formatting, code tags, as per Writeup Formatting Tips

Replies are listed 'Best First'.
Re: stripping everything but numbers and letters
by linuxer (Curate) on Feb 06, 2009 at 11:35 UTC

    As you have + inside your character class, it will not be replaced.

    my $s = 'abc123+'; $s =~ s/[^0-9+]//g;

    Maybe you wanted a quantifier, which should be placed directly after the character class.

    $s =~ s/[^0-9]+//g;

    You can also use tr///:

    $s = 'abc123+';
    # remove any non-digit
    $s =~ tr/0-9//cd;
    # or remove any non-alphanumeric
    $s =~ tr/A-Za-z0-9//cd;

    See perldoc perlop for details (Search for 'tr/SEARCHLIST/REPLACEMENTLIST/cds' ).
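
A minimal sketch comparing the variants discussed above on one sample string. The input value and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $input = 'abc 123+ x.y';

# '+' inside the negated class is a literal, so it is kept with the digits.
(my $keeps_plus = $input) =~ s/[^0-9+]//g;

# '+' after the class is a quantifier, so only digits survive.
(my $digits_re = $input) =~ s/[^0-9]+//g;

# tr/// with /c (complement) and /d (delete) removes non-digits without a regex.
(my $digits_tr = $input) =~ tr/0-9//cd;

# Same idea, keeping ASCII letters as well.
(my $alnum_tr = $input) =~ tr/A-Za-z0-9//cd;

print "$keeps_plus | $digits_re | $digits_tr | $alnum_tr\n";
# prints "123+ | 123 | 123 | abc123xy"
```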

    Update

    1. removed /s from tr/// code; thanks jwkrahn

      Using the /s option with /d makes no sense.   You can't Squash something that has been Deleted.

        Thanks. Code is modified accordingly.

      As you have + inside your character class, it will not be replaced

      Works for me:
      my $var = 'abcdef56+67jkkjk';
      $var =~ s/[0-9+]//g;
      print "$var\n";
      Gives:
      abcdefjkkjk

        Did you notice that my code uses a negated character class?

        If the character class is not negated, the + is replaced; if the class is negated, it is not.
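
To make the negated versus non-negated difference concrete, here is a small sketch that runs both classes on the string from the reply above. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $var = 'abcdef56+67jkkjk';

# Non-negated class: digits and '+' are the characters that get removed.
(my $plain = $var) =~ s/[0-9+]//g;

# Negated class: everything EXCEPT digits and '+' gets removed.
(my $negated = $var) =~ s/[^0-9+]//g;

print "$plain\n";      # prints "abcdefjkkjk"
print "$negated\n";    # prints "56+67"
```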

Re: stripping everything but numbers and letters
by Anonymous Monk on Feb 06, 2009 at 11:37 UTC
    Yes, just like s/[^[:alnum:]]//g;. But if you're dealing with Unicode, you'll want s/[^\p{N}\p{L}]//g.
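
A rough sketch of why the Unicode properties matter, assuming the string holds decoded characters. The sample text is invented; it mixes ASCII with two Greek letters and an Arabic-Indic digit.

```perl
use strict;
use warnings;

# Hypothetical sample: ASCII text plus Greek alpha/beta (U+03B1, U+03B2)
# and Arabic-Indic digit three (U+0663).
my $text = "abc \x{3b1}\x{3b2} 12\x{663}!";

# ASCII ranges only: the Greek letters and the non-ASCII digit are stripped too.
(my $ascii_only = $text) =~ s/[^a-zA-Z0-9]//g;

# Unicode properties: any letter (\p{L}) or number (\p{N}) is kept.
(my $unicode_aware = $text) =~ s/[^\p{N}\p{L}]//g;

binmode(STDOUT, ':encoding(UTF-8)');   # avoid "Wide character" warnings on output
print "$ascii_only\n";                  # prints "abc12"
print "$unicode_aware\n";
```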
Re: stripping everything but numbers and letters
by leocharre (Priest) on Feb 06, 2009 at 15:21 UTC

    What's wrong with \W ?

    $string=~s/\W//g
    Because underscore is also a word character?
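
A quick sketch of the underscore issue, with a made-up sample string. Adding _ to a class alongside \W is one possible way to strip it as well.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $s = 'foo_bar-42 baz';

# \W leaves the underscore in place, because _ counts as a word character.
(my $with_underscore = $s) =~ s/\W//g;

# Listing _ next to \W in a class removes it too.
(my $without_underscore = $s) =~ s/[\W_]//g;

print "$with_underscore\n";      # prints "foo_bar42baz"
print "$without_underscore\n";   # prints "foobar42baz"
```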
Re: stripping everything but numbers and letters
by cdarke (Prior) on Feb 06, 2009 at 11:39 UTC
    Note that with a + inside the negated class, + will be kept rather than stripped out; is that intentional, as a sign? If so, what about . (decimal point) and - (minus)?

    There are various character classes, and shortcuts like \s (whitespace) and \d (decimal). Of the POSIX character classes [:alnum:] might be most relevant.

    To improve it you can always consider comments ;-)
      \d is digit, not number or decimal, but digit, 0-9.
Re: stripping everything but numbers and letters
by Anonymous Monk on Feb 06, 2009 at 11:25 UTC
    Tested it.
    It works, but what I meant to ask is whether there is a downside to doing it that way, or a better way to do it.

      I might use this, $string =~ s/[^\w\d.-]+//g, but it's nearly the same thing. I think it's just fine.

      -Paul

      I don't understand: When you said 'number' in your OP, did you mean 'decimal digit', i.e., '0' through '9' (as represented by the character set [0-9] or \d), or did you mean something like 'signed integer' or 'rational number', e.g., '-21' or '+4.2'?

      I notice that you and some other posters include '+', '.' and '-' in their character sets, which suggests a 'rational number' interpretation. If this is the case, you definitely need a different approach.
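
If the 'rational number' reading is the intended one, one possible different approach is to match the numbers instead of stripping everything else. This is only a sketch under an assumed definition of a number (optional sign, digits, optional decimal part); the sample text and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $text = 'temp -21 rose by +4.2 to 17';

# Assumed number format: optional sign, digits, optional fractional part.
# In list context with /g, the match returns every captured number.
my @numbers = $text =~ /([+-]?\d+(?:\.\d+)?)/g;

print join(', ', @numbers), "\n";   # prints "-21, +4.2, 17"
```

Stripping with a class like [^0-9+.-] would instead glue the surviving characters together, so separate numbers in the input could no longer be told apart.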

Re: stripping everything but numbers and letters
by Anonymous Monk on Feb 06, 2009 at 20:49 UTC
    $string =~ tr/a-zA-Z0-9//cd;