comment on

Note 'saxyzzy' and'Wen' in the output:

Win8 Strawberry 5.30.3.1 (64)  Tue 07/26/2022  6:46:09
C:\@Work\Perl\monks
>perl
use strict;
use warnings;

my $input = q/saxyzzy Monday Saturday Thursday Saturday Sat Sun Mon Tu
+e Wen Th/;
my $re =
    join "|",
    map ".*\\b($_\\S*)",
    map m/(..)/,
    qw/monday tuesday wednesday thursday friday saturday sunday/
    ;
print "\$re '$re' \n";
print join '.', $input =~ m/^(?|$re)\b(?{ print "$1\n" })(*FAIL)/i;
^Z
$re '.*\b(mo\S*)|.*\b(tu\S*)|.*\b(we\S*)|.*\b(th\S*)|.*\b(fr\S*)|.*\b(
+sa\S*)|.*\b(su\S*)'
Mon
Monday
Tue
Wen
Th
Thursday
Sat
Saturday
Saturday
saxyzzy
Sun
[download]

This can be avoided with more complete automatic pattern generation, but some false positives will remain, e.g., 'satu' for Saturday.

File sort_weekdays_2.pl:

# sort_weekdays_2.pl

# method relies on left-anchored nature of "short" day names,
# e.g., th/thu/thur/thurs/thursday.
# automatic pattern generation.  possibility of false-positive
# matches exists, e.g., 'wedne' for wednesday.

use 5.018;  # need reliable regex interaction with lexicals
use 5.010;  # need regex extended patterns (?|...) branch reset

use strict;
use warnings;

use Test::More 'tests' => 2;
use Test::NoWarnings;

use Data::Dump qw(dd);

# works - false positive matches possible.
my ($rx_days) =
    map qr{ (?i) $_ }xms,
    join ' | ',
  # map { dd $_;  $_; }  # for debug
    map ".*? \\b (${ \day_parts($_) })",  # works
  # map { dd $_;  $_; }  # for debug
    map { @$_ == 3 or die "bad day '$_'";  $_; }  # minimal validation
    map [ m{ \A ([motuwehfrsa]{2}) ([nesdritu]+?) (day) \z }xms ],  # 
+works
  # map [ m{ \A ([a-z]{2}) ([a-z]+?) (day) \z }xms ],  # works
    qw/monday tuesday wednesday thursday friday saturday sunday/
    ;
# print "\$rx_days $rx_days \n"; # for debug

my $test = <<'EOS';
Monday Saturday Thursday Saturday Sat Sun Mon Tue Th
nothing on this or the following line should extract
shdhsd mond s mondaytuesday mondayxtuesday xsun sunx xsunx saxyzzy Wen
mo tu we th fr sa su  mon tue wed thu fri sat sun

blank lines don't matter

mOnDaY TuE wEd ThUrS mO sUn

false positive matches are possible with automatic pattern generation
satu wednes

EOS

# day-names extracted in day order, then by order of
# appearance in the source string.
# (?{ ... }) interface to lexical variable only reliable for perl ver 
+5.18+
# this problem can be avoided by using package-global variables.
my @got;  # lexical unreliable in older perls
$test =~ m{ \A (?| $rx_days) \b (?{ push @got, $^N }) (*FAIL) }xms;
# dd \@got;  # for debug

is_deeply \@got, [ qw(
    Monday    Mon    mo    mon    mOnDaY    mO
    Tue    tu    tue    TuE
    we    wed    wEd    wednes
    Thursday    Th    th    thu    ThUrS
    fr    fri
    Saturday    Saturday    Sat    sa    sat    satu
    Sun    su    sun    sUn
    )], 'extraction sorted by day-order';

sub day_parts {

    my ($ar_parts,  # substrings to assemble to nested regex
        ) = @_;

    # starts like  [ 'tu', 'es', 'day' ]
    # intermediate ( 'tu', 'e', 's', 'day' )
    # ends like   tu (?: e (?: s (?: day)?)?)?

    my $start = $ar_parts->[0];
    my @parts = (split('', $ar_parts->[1]), $ar_parts->[2]);

    return "$start @{[ _day_parts(@parts) ]}";

    }

sub _day_parts { return @_ ? "(?: @{[ shift, _day_parts(@_) ]})?" : ()
+; }
[download]

False-positive day-name matches can be entirely avoided with a custom regex:

    my $rx_days = qr{ (?i)  # custom patterns
        .*? \K \b mo (?: n (?: day)?)?                 |
        .*? \K \b tu (?: e (?: s (?: day)?)?)?         |
        .*? \K \b we (?: d (?: nes day)?)?             |
        .*? \K \b th (?: u (?: r (?: s (?: day)?)?)?)? |
        .*? \K \b fr (?: i (?: day)?)?                 |
        .*? \K \b sa (?: t (?: ur day)?)?              |
        .*? \K \b su (?: n (?: day)?)?
        }xms;
[download]

My somewhat arbitrary use of \K in this regex means the extraction regex must be changed. This works:
$test =~ m{ \A $rx_days \b (?{ push @got, ${^MATCH} }) (*FAIL) }xmsp;
Use of the ${^MATCH} match variable (and /p modifier) can be avoided by use of the
substr $test, $-[0], $+[0]-$-[0]
expression.

Give a man a fish: <%-{-{-{-<

In reply to Re^5: Challenge: sort weekdays in week-order (grep) by AnomalousMonk
in thread Challenge: sort weekdays in week-order (elegantly and efficiently) by bliako

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.