in reply to Re: First word
in thread First word

Unfortunately, the \w character class does not match what natural languages consider "word characters": an apostrophe never matches, and in an undecoded byte string neither does a non-ASCII letter such as "ä". See also perlre.

Example:

/tmp>cat 1174444-mod.pl
#!/usr/bin/env perl
use strict;
use warnings;

my @AoA = (
    ['first word', 'greek latin'],
    ['alpha omega', 'beta test'],
    ["don't forget", "can't work", "won't fix" ],
    ["Kindergärten Kindergarten"]
);
my @firsties;
for my $outer (@AoA) {
    for my $inner (@$outer) {
        push @firsties, $inner =~ /^(\w+)/;
    }
}
print "@firsties\n";
/tmp>perl 1174444-mod.pl
first greek alpha beta don can won Kinderg
/tmp>perl -v

This is perl 5, version 18, subversion 1 (v5.18.1) built for x86_64-linux-thread-multi

Copyright 1987-2013, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

/tmp>
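
The truncated "Kinderg" comes from matching against undecoded bytes: the script has no "use utf8", so "ä" is two bytes and neither counts as \w. The following is only a sketch of one way around that, assuming the file is saved as UTF-8. The variable names and the letters-plus-apostrophe pattern are illustrative choices, not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                               # decode string literals in this file as UTF-8
binmode STDOUT, ':encoding(UTF-8)';     # encode output back to UTF-8

my @phrases = ("Kindergärten Kindergarten", "don't forget");

# On decoded character strings \w follows Unicode rules, so "ä" matches,
# but the apostrophe still ends the match.
my @with_w = map { /^(\w+)/ ? $1 : () } @phrases;

# Illustrative "word" pattern (an assumption, not from the original post):
# Unicode letters, combining marks, and apostrophes.
my @with_letters = map { /^([\p{L}\p{M}']+)/ ? $1 : () } @phrases;

print "@with_w\n";
print "@with_letters\n";
```

With "use utf8" in place, the first line printed should be "Kindergärten don" and the second "Kindergärten don't". Whether an apostrophe belongs in a "word" depends on the text being processed, so treat the second pattern as a starting point.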

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Re^3: First word
by Marshall (Canon) on Oct 22, 2016 at 12:47 UTC
    Well, OK, replace
    $inner =~ /^(\w+)/
    with:
    $inner =~ /^(\S+)/
    and we get:
    print "@firsties\n"; #first greek alpha beta don't can't won't Kindergärten
    Good point.
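    \w matches the characters that can be used in a Perl identifier, [0-9_A-Za-z], at least for undecoded byte strings like the ones here; on decoded character strings it also matches Unicode letters and digits.
    Sometimes, as in this case, \S (any non-whitespace character) is more useful.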
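
A minimal sketch of the \S approach wrapped in a helper, to show what it does and does not handle. The name first_token and the leading \s* (to tolerate padding) are assumptions added for illustration; the pattern in the post above is just /^(\S+)/.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the first whitespace-delimited token,
# or undef when the string is empty or contains only whitespace.
sub first_token {
    my ($text) = @_;
    return $text =~ /^\s*(\S+)/ ? $1 : undef;
}

my @samples = ("don't forget", "  padded start", "Hello, world");
my @tokens  = map { first_token($_) } @samples;

print join('|', @tokens), "\n";    # prints: don't|padded|Hello,
```

Note the last token: \S keeps adjacent punctuation, so "Hello," comes back with its comma. If that matters, the punctuation has to be stripped in a separate step.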