mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:
I'm trying to build a pattern that will extract acronyms. I've gotten fairly close, but it drops the final period of dotted acronyms, where it should be kept. Notice that the period is missing from the output: for example, the script prints "X.H.T.M.L" where it should print "X.H.T.M.L."
I wrote the pattern below in a moment of insight and now that moment has passed. How may I modify it to preserve the trailing periods?
#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    chomp;
    while ( s/(?'foo'
                (
                    (?=([[:upper:]]\.\s){2})[[:upper:]\.\s]{2,}
                  |
                    (?=([[:upper:]]\s){2})[[:upper:]\s]{2,}
                  |
                    (?=([[:upper:]]\.){2})[[:upper:]\.]{2,}
                  |
                    [[:upper:]]
                ){2,}
              )//x ) {
        my $acronym = $+{foo};
        print qq("$acronym"\n);
    }
}

exit(0);

__DATA__
L F and LF and L.F. and L. F. and not L, F.
some HTML
some XML.
or X.H.T.M.L.
or X. H. T. M. L.
or even X H T M L
but not U and I, or You and I.
...
Or if there is an existing function or module which does that already, I can use that instead.
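For readers following along, here is a minimal sketch of one way to keep the trailing period. It is not the pattern from the replies in this thread. It assumes the period is lost because the outer `{2,}` needs a second pass through the group: when the dotted branch first swallows all of "X.H.T.M.L.", it has to backtrack until the bare `[[:upper:]]` branch can match the final "L", and that leaves the last period outside the match. The sketch avoids the outer repetition by letting each branch match a whole acronym on its own. The names `$acronym_re` and `find_acronyms` are hypothetical, and the rules for what counts as an acronym are guessed from the sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern, not taken from the thread.
# Each branch matches one complete acronym, so no outer {2,} is needed.
my $acronym_re = qr/
    (?<![[:alnum:]])          # do not start in the middle of a word
    (?:
        # dotted form: capital plus period, repeated, with an optional
        # single space that is only consumed when another "X." follows
        (?: [[:upper:]] \. (?: [ ] (?= [[:upper:]] \. ) )? ){2,}
      |
        # spaced form: single capitals separated by single spaces
        [[:upper:]] (?: [ ] [[:upper:]] )+ (?! [[:alnum:].,] )
      |
        # plain form: a run of two or more capitals
        [[:upper:]]{2,} (?! [[:alnum:]] )
    )
/x;

# Return every acronym found in a line of text, in order of appearance.
sub find_acronyms {
    my ($text) = @_;
    my @found;
    while ( $text =~ /($acronym_re)/g ) {
        push @found, $1;
    }
    return @found;
}

# Sample lines taken from the question's test data.
my @samples = (
    'L F and LF and L.F. and L. F. and not L, F.',
    'some XML.',
    'or X.H.T.M.L.',
    'or X. H. T. M. L.',
    'but not U and I, or You and I.',
);

for my $line (@samples) {
    print qq("$_"\n) for find_acronyms($line);
}
```

Unlike the original, this version collects matches with `m//g` instead of deleting them with `s///`, so the input line is left intact. A sentence-ending period after a plain acronym such as "XML." is deliberately not captured, which may or may not be what is wanted.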
Re: Regular expression for finding acronyms
by jcb (Parson) on Aug 16, 2019 at 04:33 UTC
by mldvx4 (Hermit) on Aug 16, 2019 at 05:05 UTC
by jcb (Parson) on Aug 16, 2019 at 05:12 UTC
by mldvx4 (Hermit) on Aug 16, 2019 at 06:57 UTC

Re: Regular expression for finding acronyms
by Marshall (Canon) on Aug 16, 2019 at 02:34 UTC