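Also, \w+ means a month name must be at least four characters long before it matches at all, so a bare "jan" is left alone. The substitution also doesn't handle upper or mixed case (the hash is keyed on lower-case names, but the lookup uses $1 exactly as captured), nor does it properly handle long month names as the OP seems to want.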

perl -wMstrict -e "my @months = (qw(jan feb mar apr may jun jul aug sep oct nov dec)); my $i = 1; my %month_num = map { $_ => $i++ } @months; my $month_re = join '|', @months; print qq(o/p: \n); for (@ARGV) { print qq( $_: ); s/($month_re)\w+/$month_num{$1} || $1/ei; print qq($_ \n) }" jan JaN january jane

o/p:
 jan: jan
 JaN: JaN
 january: 1
 jane: 1

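The following is better (\w* and lc take care of the length and case problems), but it still doesn't handle long month names properly (IMO): anything that merely starts with a month abbreviation, such as "jane", is replaced too.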

perl -wMstrict -e "my @months = (qw(jan feb mar apr may jun jul aug sep oct nov dec)); my $i = 1; my %month_num = map { $_ => $i++ } @months; my $month_re = join '|', @months; print qq(o/p: \n); for (@ARGV) { print qq( $_: ); s/($month_re)\w*/$month_num{lc $1} || $1/ei; print qq($_ \n) }" jan JaN january jane

o/p:
 jan: 1
 JaN: 1
 january: 1
 jane: 1
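To see why the prefix match over-reaches, here is a minimal sketch (not from the thread; the sub name prefix_replace is made up for illustration) that reduces the same idea to a single month. Any word that begins with the abbreviation is swallowed whole, whether or not it is a month name.

```perl
use strict;
use warnings;

# Hypothetical helper: the same approach as the one-liner, reduced to one month.
sub prefix_replace {
    my ($word) = @_;
    # Abbreviation followed by ANY run of word characters, case-insensitive.
    (my $copy = $word) =~ s/(jan)\w*/1/i;
    return $copy;
}

# Every one of these comes back as "1", including the two non-months.
print "$_ => ", prefix_replace($_), "\n" for qw(jan January jane janitor);
```

The missing pieces are a word boundary on each side and a tail that is limited to the rest of the real month name rather than \w*.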

Here's my suggestion. It's not a one-liner, but it seems to fill the requirement.

use warnings;
use strict;

MAIN: {
    my $month_rx = month_names_regex();

    while (<DATA>) {
        s{ ($month_rx) }{ "$1 (@{[ month_number($1) ]})" }xmseg;
        print;
    }
}

BEGIN {  # compile-time initialized closure for month-number regexes

    my %months;

    # start with long names of months, pair with month number strings.
    # month numbers: two digits with leading zero if needed.
    # all month names in common lower case.
    @months{ qw(
        january february march     april   may      june
        july    august   september october november december
    ) } = map { sprintf '%02d', $_ } 1 .. 12;

    # generate short month names, pair with long month numbers.
    my $months_re =
        join ' | ',
        map {
            my ($short_name, $regex) = mon_split($_);
            $months{$short_name} = $months{$_};
            $regex;
        }
        keys %months
        ;

    # return final long/short month name regex.
    sub month_names_regex { return qr{ \b (?: $months_re) \b }xmsi }

    # convert long/short month name to month number.
    sub month_number { my $name = shift; return $months{lc $name} }

    # printf "\$months_re is %s \n", month_names_regex();  # FOR DEBUG
    # print "$_ => $months{$_} \n" for keys %months;       # FOR DEBUG

    sub mon_split {
        my ($mon_name, ) = @_;

        my ($head, $tail) = $mon_name =~ m{ \A (\w{3}) (\w*) \z }xms;
        die "malformed month name $mon_name" unless $head;
        $tail = qr{ (?: $tail)? }xmsi if $tail;
        return (lc $head, qr{ $head $tail }xmsi);
    }

}  # end compile-time initialized closure for month-number regexes

__DATA__
jan january february feb mar march jUnE JuN JuLy JUL xMaY xmArx Marx xmay xmayx mayx xjunE xjunex xJUNEx xjun xJUNx jUnx and so on

Output:

jan (01) january (01) february (02) feb (02) mar (03) march (03) jUnE (06) JuN (06) JuLy (07) JUL (07) xMaY xmArx Marx xmay xmayx mayx xjunE xjunex xJUNEx xjun xJUNx jUnx and so on
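The script above builds, for each month, a three-letter head with the rest of the long name as an optional tail, and wraps the alternation in word boundaries. As a rough illustration of that shape only, here is a hand-written pattern for a single month; the names $jan_rx and tag_january are assumptions made up for this sketch and are not part of the script.

```perl
use strict;
use warnings;

# Assumed example: the head/optional-tail shape, written by hand for January.
my $jan_rx = qr{ \b jan (?: uary )? \b }xmsi;

# Hypothetical helper: append the month number after each match.
sub tag_january {
    my ($text) = @_;
    (my $copy = $text) =~ s{ ($jan_rx) }{$1 (01)}xmsg;
    return $copy;
}

# Only the whole words "jan" and "JANUARY" are tagged;
# "jane", "xjan" and "janu" fail a word boundary and are left alone.
print tag_january('jan JANUARY jane xjan janu'), "\n";
```

Because the tail is the literal remainder of the month name rather than \w*, partial or extended words never match, which is the behaviour the prefix-based one-liners lacked.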

In reply to Re^2: Quick search-and-replace month names for numbers by Anonymous Monk
in thread Quick search-and-replace month names for numbers by kangaroobin
