Here are a couple of similar approaches that differ in using a regex feature introduced in Perl version 5.10. These regexes may be a bit beyond your current regex understanding; if so, this may serve as a spur. I think the main idea to take away is of building a unit testing framework in which different approaches (to what looks like your homework) can be tried. (The original file contains other variations, but I think the ones I've shown are the best.) See Test::More and friends.

File proper_article_1.pl:

use warnings; use strict; use Test::More # tests => ?? + 1 # Test::NoWarnings adds 1 test 'no_plan' ; use Test::NoWarnings; FUNT: for my $func_name (qw(proper_2a proper_2b)) { note "\n==== testing $func_name() ====\n\n"; *proper = do { no strict 'refs'; *$func_name; }; VECTOR: for my $ar_vector ( [ 'An log and an axe, a log and a axe', 'A log and an axe, a log and an axe', ], [ 'An log and an axe, a log and a axe a', 'A log and an axe, a log and an axe a', q{end in 'a'} ], [ 'An log and an axe, a log and a axe an', 'A log and an axe, a log and an axe an', q{end in 'an'} ], [ 'An log and an axe, a log and a axe a ', 'A log and an axe, a log and an axe a ', q{end in 'a' then whitespace (is this possible?)} ], [ 'An log and an axe, a log and a axe an ', 'A log and an axe, a log and an axe an ', q{end in 'an' then whitespace (is this possible?)} ], [ 'An log and an axe, a log and a axe a.', 'A log and an axe, a log and an axe a.', q{end in 'a' then non-whitespace/alpha} ], [ 'An log and an axe, a log and a axe an.', 'A log and an axe, a log and an axe an.', q{end in 'an' then non-whitespace/alpha} ], [ 'AN LOG AND AN AXE, A LOG AND A AXE', 'A LOG AND AN AXE, A LOG AND An AXE', # ^ delete ^ existing, ok ^ added 'leave case untouched if not adding' ], [ 'An man, an plan, an canal, Panama', 'A man, a plan, a canal, Panama', ], [ 'A man, a plan, a canal, Panama', 'A man, a plan, a canal, Panama', 'no change needed' ], [ 'Afganastan banana stand and aqua okra, tan pan.', 'Afganastan banana stand and aqua okra, tan pan.', 'no articles' ], ) { if (not ref $ar_vector) { note $ar_vector; next VECTOR; } my ($sentence, $expected, $msg) = @$ar_vector; is proper($sentence), $expected, $msg; } # end for VECTOR } # end for FUNT note "\n==== done testing functions ====\n\n"; # function(s) under test ########################################### sub proper_2a { # compatible with version 5.8.9 my ($sentence, ) = @_; my $n = qr{ [Nn] }xms; my $vowel = qr{ [AEIOUaeiou] }xms; # 'n', if present, must be preceded by # 'a' that is NOT preceded by a NON-whitespace character # (i.e., must be preceded by whitespace or start of string). $sentence =~ s{ (?<= (?<! \S) [Aa]) ($n?) (?= \s+ (?= .) ($vowel?) +) } { $2 ? ($1 ? $1 : 'n') : '' }xmsge; return $sentence; } sub proper_2b { # needs version 5.10+ for \K my ($sentence, ) = @_; my $Aa = qr{ [Aa] }xms; my $Nn = qr{ [Nn] }xms; my $vowel = qr{ [AEIOUaeiou] }xms; # 'n', if present, must be preceded by # 'a' that is NOT preceded by a NON-whitespace character # (i.e., must be preceded by whitespace or start of string). $sentence =~ s{ (?<! \S) $Aa \K ($Nn?) (?= \s+ (?= .) ($vowel?)) } { $2 ? ($1 ? $1 : 'n') : '' }xmsge; return $sentence; }
Output:
c:\@Work\Perl\monks\MadhAric>perl proper_article_1.pl # # ==== testing proper_2a() ==== # ok 1 ok 2 - end in 'a' ok 3 - end in 'an' ok 4 - end in 'a' then whitespace (is this possible?) ok 5 - end in 'an' then whitespace (is this possible?) ok 6 - end in 'a' then non-whitespace/alpha ok 7 - end in 'an' then non-whitespace/alpha ok 8 - leave case untouched if not adding ok 9 ok 10 - no change needed ok 11 - no articles # # ==== testing proper_2b() ==== # ok 12 ok 13 - end in 'a' ok 14 - end in 'an' ok 15 - end in 'a' then whitespace (is this possible?) ok 16 - end in 'an' then whitespace (is this possible?) ok 17 - end in 'a' then non-whitespace/alpha ok 18 - end in 'an' then non-whitespace/alpha ok 19 - leave case untouched if not adding ok 20 ok 21 - no change needed ok 22 - no articles # # ==== done testing functions ==== # ok 23 - no warnings 1..23


Give a man a fish:  <%-(-(-(-<


In reply to Re: Perl pattern matching question by AnomalousMonk
in thread Perl pattern matching question by MadhAric

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.