Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Abbreviate english words

by GrandFather (Saint)
on Jun 08, 2005 at 22:47 UTC ( [id://464885]=CUFP: print w/replies, xml ) Need Help??

This subroutine takes a list of words as parameters and returns a list or string containing abbreviated versions of the words. It leaves words of 4 characters or fewer unchanged.

Note: This doesn't just generate truncated versions of a word, it attempts to generate a sensible abbreviation.

Update: Applied Ovid's change. Grab first char before lower casing string to retain case. Removed unused variable. Removed bogus readmore tags.

Sample test code.

my %Words; while (<>) { my @WordList = /(?:([\w']+)\W*)/g; my $Word; foreach $Word (@WordList) { next if $Word =~ /[\d\W_]/; next if length ($Word) <= 4; $Word = lc $Word; $Words {$Word} = Abbreviate ($Word); } } print "$_ => $Words{$_}\n" foreach (sort keys %Words);

Sample output using original version of description text.

abbrev => abrv abbreviate => abrvt abbreviated => abrvtd aeiou => a application => aplctn characters => chrctrs containing => cntnng defined => dfnd excluding => excldng fewer => fwr foreach => frch input => inpt leaves => lvs length => lngth output => otpt parameters => prmtrs print => prnt readmore => rdmr result => rslt return => rtrn returns => rtrns sample => smpl section => sctn shift => shft string => strng subroutine => sbrtn substr => sbstr takes => tks title => ttl unchanged => unchngd using => usng versions => vrsns wantarray => wntry while => whl wordlist => wrdlst words => wrds
sub Abbreviate { my @Result; while (local $_ = shift) { last if ! defined $_; my $Abbrev; $Abbrev = substr $_, 0, 1, ""; if (length ($_) > 4) { tr/A-Z/a-z/; tr/a-z//cd; tr/aeiou//d; s/(.)\1+/$1/gi; s/ck/k/g; s/ptn/pn/g; s/tng/tg/g; s/thr/tr/g; s/vnt/vt/g; s/ltn/ln/g; s/lb/b/g; s/tt/t/g; } $Abbrev .= $_; push @Result, $Abbrev; } return wantarray ? @Result : join " ", @Result; }

Replies are listed 'Best First'.
Re: Abbreviate english words
by Ovid (Cardinal) on Jun 08, 2005 at 23:58 UTC

    Without commenting on the rest of the code, I'd just like to point out that your while loop control is better written as:

        while (local $_ = shift) {...}

    Without the local, you will be stepping on your parent's $_.

    Cheers,
    Ovid

    New address of my CGI Course.

      Ta. I did mean to put a "baby language Perl" warning in the description!

      Perl is Huffman encoded by design.

        No worries. We all have to start somewhere. I actually thought that was rather interesting snippet of code. Welcome to Perl!

        Cheers,
        Ovid

        New address of my CGI Course.

        Wow! Babies from a GrandFather ;-)
Re: Abbreviate english words
by zentara (Archbishop) on Jun 09, 2005 at 10:22 UTC
    This reminds me of "Shorthand", does anyone remember that? I think to make it useful, it would also need a "de_abbreviate" sub so the "management" can read it. :-)

    I'm not really a human, but I play one on earth. flash japh

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://464885]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others meditating upon the Monastery: (3)
As of 2024-03-29 07:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found