Re: Split a sentence into words

That reminds me of Abigail's (in)famous "determine if a number is prime using a regex" hack. Where I think akho's Re: Split a sentence into words falls short is that you ought to be using Perl's backtracking mechanism to make the words fit together. Just because the individual parts match doesn't mean the sentence matches as a whole.

Here is a dumb, straightforward approach:

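use strict;
use warnings;

# your data
my @vocabulary = qw(a abc abcd abd bc);
my $sentence = 'abdaabc';
# regex for words: one alternation, with metacharacters escaped
my $re = join '|', map { quotemeta($_) } @vocabulary;
# does it match in *any* way? (\z anchors at the true end of the string)
my $success = $sentence =~ /^(?:$re)*\z/;
# show result: 1 on success, 0 on failure
print $success || 0, "\n";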

That displays a rather disappointing, yet for a start encouraging, result:

1

Okay, so it matches, but we have no idea how. We should find a way to mark where it matches, and where it backtracks.

But after that, I'm lost. I've tried (??{ code() }) in various combinations, trying to capture the intermediate state of the regex, including $^R, pos, @- and @+, but I don't get any intuitive results... maybe somebody else can take it up from here?
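For whoever does pick it up, one direction that might be worth a try is the local() idiom described in perlre: an assignment made with local inside a (?{ ... }) block is undone when the engine backtracks over it, so a trail of matched words should only ever contain the words on the current path. What follows is a minimal sketch under a few assumptions, not a finished solution: it assumes Perl 5.18 or later (so that a literal code block can sit next to an interpolated variable and see lexicals), the names $trail and @words are made up for illustration, and it reports only the first segmentation the engine happens to find.

```perl
use 5.018;    # assumption: modern handling of literal (?{ ... }) blocks
use strict;
use warnings;

my @vocabulary = qw(a abc abcd abd bc);
my $sentence   = 'abdaabc';
my $re         = join '|', map { quotemeta($_) } @vocabulary;

# Hypothetical names. The trail has to be a package variable,
# because local() does not work on lexicals.
our $trail = [];
my @words;

my $success = $sentence =~ /
    ^
    (?:
        ($re)
        # Record the word just matched. The assignment is made with local(),
        # so the engine undoes it when it backtracks past this point.
        (?{ local $trail = [ @$trail, $1 ] })
    )*
    \z
    # Reached only when the whole match succeeds: copy the trail out
    # before the localized value is unwound.
    (?{ @words = @$trail })
/x;

print $success ? "@words\n" : "no match\n";
```

If it behaves as perlre describes, @words holds a list of vocabulary words that concatenate back to the sentence. Getting every possible segmentation rather than just the first would need something more, such as forcing a failure after recording each complete trail.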

The only other thing that yields interesting results is

use re 'debug';

but that only dumps a trace to STDERR for a human to read; it's not something you can use from within a script.
