I'm trying to build a pattern that will extract acronyms. I've gotten fairly close, but it drops the final period in cases where that period should be kept as part of the acronym. For example, the output shows "X.H.T.M.L" where it should show "X.H.T.M.L."
I wrote the pattern below in a moment of insight and now that moment has passed. How may I modify it to preserve the trailing periods?
#!/usr/bin/perl

use strict;
use warnings;

while (<DATA>) {
    chomp;
    while (
        s/(?'foo'
            (
                (?=([[:upper:]]\.\s){2})[[:upper:]\.\s]{2,}
              | (?=([[:upper:]]\s){2})[[:upper:]\s]{2,}
              | (?=([[:upper:]]\.){2})[[:upper:]\.]{2,}
              | [[:upper:]]
            ){2,}
        )//x
    ) {
        my $acronym = $+{foo};
        print qq("$acronym"\n);
    }
}

exit(0);

__DATA__
L F and LF and L.F. and L. F. and not L, F. some HTML some XML. or X.H.T.M.L. or X. H. T. M. L. or even X H T M L but not U and I, or You and I. ...
Or if there is an existing function or module which does that already, I can use that instead.
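A likely cause, offered as a reading of the pattern rather than a verified diagnosis: the outer `{2,}` demands at least two repetitions of the group. When the third alternative has already consumed all of "X.H.T.M.L." in a single repetition, the engine has to backtrack so that the bare `[[:upper:]]` alternative can supply a second repetition. It gives back "L." and then re-matches only "L", which loses the final period. The sketch below is one possible rewrite under that assumption. It drops the outer quantifier and gives each style (dotted, spaced, plain) its own repetition. The names `$acronym_re` and `extract_acronyms` are hypothetical, and the sample line is a shortened piece of the test data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern, not the original one: each alternative carries its
# own repetition, so nothing forces the engine to give back a trailing period.
my $acronym_re = qr{
    \b
    (?:
        [[:upper:]]\. (?: \s? [[:upper:]]\. )+   # dotted:  L.F.  or  L. F.
      | [[:upper:]] (?: \s [[:upper:]] \b )+     # spaced:  L F
      | [[:upper:]]{2,} \b                       # plain:   LF
    )
}x;

# Hypothetical helper: returns all acronyms in a line, leaving the line intact.
sub extract_acronyms {
    my ($line) = @_;
    my @found = $line =~ /($acronym_re)/g;
    return @found;
}

# Shortened sample taken from the test data in the question.
my $sample = 'some XML. or X.H.T.M.L. or X. H. T. M. L. or even X H T M L but not U and I';
print qq("$_"\n) for extract_acronyms($sample);
```

On that sample the dotted forms should come back as "X.H.T.M.L." and "X. H. T. M. L." with their final periods, while "XML." yields just "XML" because a lone trailing period is treated as sentence punctuation. This is a sketch under those assumptions. Two adjacent single-capital words (for example "I A") would still be reported as a spaced acronym.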
In reply to Regular expression for finding acronyms by mldvx4