in reply to Regex matching words with numbers, but not numbers.

The problem just gets ugly quickly, as I like to allow words to contain minus signs, underscores, umlauts, and so on.

"and so on" is a rather nebulous specification, but I'd suggest using Unicode character classes, e.g. like this:

#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use utf8;
use open IO => ':encoding(UTF-8)', ':std';

# letters, punctuation and symbols all count as "word" characters
my $wordchars = qr/[\pL\pP\pS]/;
# a run of word characters with optional leading/trailing decimal digits
my $regex     = qr/\p{Nd}*$wordchars+\p{Nd}*/;

while (<DATA>) {
    chomp;
    (my $string = $_) =~ s/$regex//g;
    $string = join " ", split " ", $string;   # just to make the output more readable
    say "'$_' became '$string'";
}

__DATA__
foo 1foo foo2 3foo4 foo5bar 87 foo 1foo; foo_2 foo-bar() 87 - _ !@#$% augu mín sáu þig 12345
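
To make it easier to see what survives the substitution, the same pattern can be wrapped in a small helper. This is only a sketch: remaining_numbers is a hypothetical name, not something from the original question, and it assumes that tokens are separated by whitespace.

```perl
use strict;
use warnings;

# Same character classes as in the script above.
my $wordchars = qr/[\pL\pP\pS]/;
my $regex     = qr/\p{Nd}*$wordchars+\p{Nd}*/;

# Hypothetical helper: delete every match of $regex and return the
# whitespace-separated tokens that are left over.
sub remaining_numbers {
    my ($line) = @_;
    (my $copy = $line) =~ s/$regex//g;
    return split " ", $copy;
}

my @left = remaining_numbers("foo 1foo foo2 3foo4 foo5bar 87");
print "@left\n";    # prints "87"
```

One consequence of including \pP: a token such as 3.14 contains a punctuation character, so it is treated as a word and removed as well. If decimal numbers should survive, the character class would need to be narrowed.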

See perluniprops for more on Unicode properties. Also see Unicode::Tussle for a bunch of useful scripts for Unicode wrangling, by Tom Christiansen; uniprops is particularly useful.
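
When deciding which properties to put into the character class, it can help to check where individual characters fall. A minimal sketch, assuming only the four properties used above matter; char_class is a hypothetical helper name:

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper: report which of the properties used above
# a single character matches (checked in this order).
sub char_class {
    my ($ch) = @_;
    return 'letter'      if $ch =~ /\A\pL\z/;
    return 'punctuation' if $ch =~ /\A\pP\z/;
    return 'symbol'      if $ch =~ /\A\pS\z/;
    return 'digit'       if $ch =~ /\A\p{Nd}\z/;
    return 'other';
}

binmode STDOUT, ':encoding(UTF-8)';
for my $ch ('a', 'ü', '-', '_', '$', '+', '7', ' ') {
    printf "'%s' => %s\n", $ch, char_class($ch);
}
```

The minus sign and the underscore are both punctuation, while characters like $ and + are symbols, which is why the regex needs \pS in addition to \pP to cover input such as !@#$%.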