$string =~ /((\w|'|-)+)/g

But that's a bit un-regex-like. You should use a character class in this situation:

$string =~ /([-\w']+)/g

But this also matches dashes and apostrophes at the beginnings or ends of words, which may or may not be what you want. If not, you could require that a dash or apostrophe appear only between alphanumeric characters:

$string =~ /(\w+[-']?\w+)/g

But this has the unwanted effect that a word must be at least 2 characters long. So we can add an alternation, saying that we also allow a single-character word (or number) if we can't match a word consisting of alphanumerics with a dash or apostrophe between them:

$string =~ /(\w+[-']?\w+|\w)/g

A small complete test case:

#!/usr/local/bin/perl
use strict;
use warnings;
# Slurp mode: read the whole DATA section as one string
$/ = undef;
my $string = <DATA>;
# With /g in scalar context, each iteration continues after the previous match
while ($string =~ /(\w+[-']?\w+|\w)/g) {
    print "Word: <$1>\n";
}
__DATA__
This is a sentence with words that're different from other words. They have apostrophes in them (') and dashes, or dash-like characters (-).

Try running this code and compare the output with the sentence in the __DATA__ section.
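To see how the four patterns discussed above differ, it may help to run them side by side on one input. The following is only a sketch: the helper words_for, the labels, and the sample sentence are made up for illustration and are not part of the original test case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The four patterns from the discussion, in order of refinement.
# The labels are hypothetical names chosen for this sketch.
my @patterns = (
    [ 'alternation'      => qr/((\w|'|-)+)/      ],
    [ 'character class'  => qr/([-\w']+)/        ],
    [ 'inner dash only'  => qr/(\w+[-']?\w+)/    ],
    [ 'plus single char' => qr/(\w+[-']?\w+|\w)/ ],
);

# Collect capture group 1 of every match of $re in $text.
# A while loop is used instead of a list-context match so that the
# inner group of the first pattern does not end up in the result.
sub words_for {
    my ($re, $text) = @_;
    my @found;
    while ($text =~ /$re/g) {
        push @found, $1;
    }
    return @found;
}

# Hypothetical sample input with single-letter words and stray dashes.
my $sample = "I can't find a well-known -dash- here";

for my $p (@patterns) {
    my ($name, $re) = @$p;
    printf "%-17s %s\n", $name, join ' ', map { "<$_>" } words_for($re, $sample);
}
```

On this input the first two patterns keep "-dash-" with its surrounding dashes, the third drops the single-letter words "I" and "a" and trims the match to "dash", and the fourth brings "I" and "a" back.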
The ultimate guide (in my opinion) on regular expressions is Jeffrey Friedl's Mastering Regular Expressions, 2nd Edition.
Arjen
In reply to Re: Regexp explanation by Aragorn, in thread Regexp explanation by Anonymous Monk