Hi,

A decade or so ago, I was a conlanger. I liked constructing a language spoken by an imaginary people whose culture, unlike J.R.R. Tolkien, I never fleshed out. Evidently I had plenty of mental capacity to spare for such things, and people called me quixotic. No more. Now I prefer programming, other hobbies and a real life, too. I remember that I favoured the grammar over developing a lexicon comprehensive enough to make the language speakable. Inflection, word order and features I liked in other languages were far more interesting. The best part, I found, was its 35 cases, which were actually two case systems interlocking like cogwheels.

I just programmed a word generator as a kind of tribute to that young nerd's passion. It respects how words can be shaped in a language and how they cannot. In English, for instance, there is no word like quirge, but it still seems more English than, er, dampfschiff, which, believe it or not, is actually a word in a spoken language. Such unwritten principles are characteristic of every language. Despite their being unwritten, you can approximate them with probabilistic weights and Perl's rand().

#!/usr/bin/perl
use v5.14;
use strict;
use utf8;
binmode STDOUT, ':utf8';

# Returns a closure that selects one candidate in a weighted randomized manner.
# A leading string contributes each of its characters with weight 1; an optional
# trailing hash maps a weight to a value or to an array of values.
sub wghtrandsel {
    my $multices = ref $_[-1] eq 'HASH' ? pop : {};
    my @simple   = split "", shift(@_) // "";
    # each %$multices: "each" on a bare hash reference no longer works in newer perls
    while ( my ($num, $mult) = each %$multices ) {
        my @values = ref $mult eq 'ARRAY' ? @$mult : $mult;
        push @simple, ($_) x $num for @values;
    }
    return sub {
        my $c = $simple[ int rand @simple ];
        return (ref $c ? $c->() : $c) // "";
    }
}

# Returns a closure that concatenates its parts, calling any code references.
sub chained {
    my @args = @_;
    return sub { join "", map { ref $_ ? $_->() : $_ } @args };
}

my $weak_vowel   = wghtrandsel("aaaeeeiiooouu");
my $strong_vowel = wghtrandsel("áóéàòèâôêãõẽíúîû");
# PM bug?                                 ^ => e + ~

my $weak_diphthong = wghtrandsel({
    3 => chained( wghtrandsel("ae"), wghtrandsel({ 5 => "i", 1 => "u" }) ),
    2 => chained( "o", wghtrandsel({ 5 => "u", 1 => "i" }) ),
    1 => [ "uy", "iw" ],
});

my $strong_diphthong = wghtrandsel({
    3 => chained( wghtrandsel("áé"), wghtrandsel({ 5 => "i", 2 => "u" }) ),
    # Both weight-1 entries share one key; two separate "1 =>" pairs would
    # overwrite each other in the hash.
    1 => [
        chained( "ó", wghtrandsel({ 5 => "u", 3 => "i" }) ),
        "úy", "úa", "íw", "ía",
    ],
});

my $nonplosive = wghtrandsel("cfjlmnrsvxz", { 1 => ['hn','hr'] });
my $plosive    = wghtrandsel("bdgkpqt");

my $initial_consonant = wghtrandsel({
    5 => $nonplosive,
    2 => wghtrandsel({ 1 => [qw[ch fh jh sh vh zh h]] }),
    3 => chained(
        $plosive,
        wghtrandsel({ 6 => undef, 1 => ["w", "y", "l", "r"] }),
    ),
    1 => ["y","w"]
});

my $stressed_syllable = chained(
    $initial_consonant,
    wghtrandsel({ 2 => $strong_vowel, 1 => $strong_diphthong }),
    wghtrandsel({ 3 => $nonplosive, 5 => undef }),
);

my $stressed_initial_syllable = wghtrandsel({
    3 => $stressed_syllable,
    1 => chained(
        wghtrandsel({ 3 => $strong_vowel, 2 => $strong_diphthong }),
        wghtrandsel({ 4 => undef, 1 => $nonplosive })
    ),
});

my $unstressed_syllable = chained(
    $initial_consonant,
    wghtrandsel({
        3 => chained( $weak_vowel, wghtrandsel({ 8 => undef, 1 => $nonplosive }) ),
        2 => $weak_diphthong,
    }),
);

my $word = wghtrandsel({
    1 => [ $stressed_initial_syllable, $unstressed_syllable ],
    4 => chained(
        wghtrandsel({
            5 => $stressed_initial_syllable,
            2 => chained($unstressed_syllable, $stressed_syllable),
        }),
        $unstressed_syllable
    ),
});

say $word->() for 1 .. 1000;

1;
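The weighting idea behind wghtrandsel can be shown in isolation. What follows is only a minimal sketch, not part of the script: the names make_pool and pick are hypothetical, and the weights are chosen to match the string "aaaeeeiiooouu" used for the weak vowels. Each candidate is repeated as often as its weight, so a uniform pick from the resulting pool is a weighted pick over the candidates.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): expand a list of
# letter => weight pairs into a flat pool in which every letter occurs
# as many times as its weight.
sub make_pool {
    my %weight = @_;
    return map { ($_) x $weight{$_} } sort keys %weight;
}

# Same proportions as the string "aaaeeeiiooouu".
my @pool = make_pool( a => 3, e => 3, i => 2, o => 3, u => 2 );

# A uniform pick from the pool is a weighted pick over the letters.
sub pick { return $pool[ int rand @pool ] }

# Count how often each letter comes up; the counts should roughly
# follow the 3:3:2:3:2 proportions, varying from run to run.
my %seen;
$seen{ pick() }++ for 1 .. 13_000;
printf "%s: %5d\n", $_, $seen{$_} // 0 for sort keys %seen;
```

The pool costs memory proportional to the sum of the weights, which is negligible for small integer weights like these; for large or fractional weights a cumulative-sum lookup would probably be the better choice.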

If you're interested in how the words the script emits with this inline configuration are pronounced, tell me. I just thought this is a Perl board, not one for linguists and conlangers, and I didn't want to bore you.

Even if you are not interested in conlanging, you might want to study this example of how curried closures can lead to efficient code, i.e. code that I am quite sure is efficient until someone proves me wrong. For those who do not know yet: closures are anonymous subroutines that hold on, for their own lifetime, to any lexical variables they use, which would otherwise have been garbage-collected on going out of their respective scopes. Higher-order is when you pass these subroutine references to other functions, which call them back when appropriate. I learned that from the book Higher-Order Perl by Mark Jason Dominus (available online, but if you can afford it, consider buying it).
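For readers new to the two terms, here is a minimal sketch that separates them. The names make_counter and call_times are hypothetical and do not appear in the word generator; the sketch only assumes a perl of version 5.10 or later for the // operator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Closure: the returned anonymous sub keeps $count alive even after
# make_counter has returned. (Hypothetical name, for illustration only.)
sub make_counter {
    my ($start) = @_;
    my $count = $start // 0;
    return sub { return $count++ };
}

# Higher-order function: takes a code reference and calls it back
# $n times, collecting the results. (Also a hypothetical name.)
sub call_times {
    my ($code, $n) = @_;
    return map { $code->() } 1 .. $n;
}

my $counter = make_counter(5);
my @values  = call_times($counter, 3);
print "@values\n";    # prints "5 6 7"

# Every call to make_counter creates a fresh, independent $count.
my $other = make_counter();
print $other->(), "\n";    # prints "0"
```

In the word generator, wghtrandsel and chained play both roles: each returns a closure over its own candidate list, and each accepts code references that it calls back later.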

Update: Corrected the name of the Higher-Order Perl author. I had also confused currying with higher-order functions, so I will have to re-read the very book I cited.

-- flowdy


In reply to [RFC: Cool Uses for Perl] A weighted randomized word generator for conlangers by flowdy
