John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

Check out Appendix E in the PGPfone Owner's Manual (e.g. here page 77).

This is like the NATO or Military alphabet, but has a word for each byte.

I want to use this in a program, and didn't find it already done, so I'm coding it up. This is an obvious thing to reuse, so I want to make it presentable. To that end, I'll make it a .pm file with a nice interface, rather than just a function that does exactly what this program needs.

I invite commentary at this point.

What is the interface? Most basic would be to accept a binary string and return a list of words. You can join that to print, or otherwise feed the list to a user interface. Hmm, maybe distinguish list from scalar context and give a single printable string in scalar context?
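A minimal sketch of the list-versus-scalar idea, using `wantarray`. The sub name `words_for` is hypothetical, and the two lists hold only four stand-in entries each (the real lists have 256 words apiece and should be checked against Appendix E), so only bytes 0 to 3 will map here.

```perl
use strict;
use warnings;

# Stand-in lists: just the first few entries, for illustration only.
my @two_syllable_words   = qw(aardvark absurd accrue acme);
my @three_syllable_words = qw(adroitness adviser aftermath aggregate);

# Hypothetical interface: a list of words in list context,
# a single space-joined string in scalar context.
sub words_for {
    my ($bytes) = @_;
    my $counter = 0;
    my @words = map {
        # Even positions use the two-syllable list, odd positions the other.
        my $list = ($counter++ & 1) ? \@three_syllable_words : \@two_syllable_words;
        my $word = $list->[$_];
        die "no word for byte $_ in this truncated list\n" unless defined $word;
        $word;
    } unpack 'C*', $bytes;
    return wantarray ? @words : join ' ', @words;
}

my @list   = words_for("\x01\x02\x03");   # list context
my $string = words_for("\x01\x02\x03");   # scalar context
print "$string\n";
```

The caller chooses the form just by how the result is assigned, so the same sub serves both a user interface that wants the list and a simple print.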

What should I call the module? I'm wondering if it belongs "under" something.

—John

(tye)Re: Biometric Word List -- in a pm file
by tye (Sage) on Jul 02, 2002 at 22:35 UTC

    Well, you could put them under Lingua::EN::BiometricWords just in case someone comes up with versions of this in other languages.

    In addition to the interface you already outlined, I'd suggest an interface to simply fetch the two lists of words and one to construct a byte string from a list of words (noting if the odd/even parity check failed).
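One way the inverse with a parity check might look, as a rough sketch. The name `bytes_for` and the shape of its return value are assumptions, and the lists are four-entry stand-ins for the real 256-word lists.

```perl
use strict;
use warnings;

# Stand-in lists: just the first few entries, for illustration only.
my @two_syllable_words   = qw(aardvark absurd accrue acme);
my @three_syllable_words = qw(adroitness adviser aftermath aggregate);

# Map each word to [parity, byte value]; parity 0 means the even-position list.
my %lookup;
for my $parity (0, 1) {
    my $list = $parity ? \@three_syllable_words : \@two_syllable_words;
    $lookup{ lc $list->[$_] } = [ $parity, $_ ] for 0 .. $#$list;
}

# Hypothetical interface: returns the byte string followed by a list of
# problems found (unknown words, or words taken from the wrong list).
sub bytes_for {
    my @words = @_;
    my ($bytes, @problems) = ('');
    for my $pos (0 .. $#words) {
        my $entry = $lookup{ lc $words[$pos] };
        if (!$entry) {
            push @problems, "position $pos: unknown word '$words[$pos]'";
            next;
        }
        my ($parity, $value) = @$entry;
        push @problems, "position $pos: '$words[$pos]' is from the wrong list"
            if $parity != ($pos & 1);
        $bytes .= chr $value;
    }
    return ($bytes, @problems);
}

my ($bytes, @problems) = bytes_for(qw(absurd aftermath acme));
print unpack('H*', $bytes), "\n";
print "$_\n" for @problems;
```

Because the two lists alternate, a dropped or transposed word puts a later word in a position of the wrong parity, which is what the second check reports.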

    And a nice extra would be an interface to ignore "other" words and construct the resulting byte string. Then the real fun is writing something to take a byte string and produce convincing English text that would produce that string when fed into the previous interface. (:
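A sketch of the "ignore other words" decoder, under one possible reading: a word counts only if it belongs to the list expected at the current position, and everything else is skipped. The name `bytes_hidden_in` is hypothetical, and the lists are four-entry stand-ins for the real ones.

```perl
use strict;
use warnings;

# Stand-in lists: just the first few entries, for illustration only.
my @two_syllable_words   = qw(aardvark absurd accrue acme);
my @three_syllable_words = qw(adroitness adviser aftermath aggregate);

# $value_of[0] maps even-position words to bytes, $value_of[1] odd-position words.
my @value_of = ({}, {});
@{ $value_of[0] }{@two_syllable_words}   = 0 .. $#two_syllable_words;
@{ $value_of[1] }{@three_syllable_words} = 0 .. $#three_syllable_words;

# Hypothetical interface: scan running text and keep only the words that
# fit the list expected next; all other words are ignored.
sub bytes_hidden_in {
    my ($text) = @_;
    my $bytes = '';
    for my $word ($text =~ /[a-z]+/gi) {
        my $value = $value_of[ length($bytes) & 1 ]{ lc $word };
        $bytes .= chr $value if defined $value;
    }
    return $bytes;
}

my $bytes = bytes_hidden_in("An absurd idea: the aftermath was the acme of folly.");
print unpack('H*', $bytes), "\n";
```

Requiring the expected parity makes the cover text easier to write, since a listed word that turns up at the wrong position is simply treated as filler.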

            - tye (but my friends call me "Tye")
      I've heard of the Lingua:: namespace mostly from Damian, I think, but I'm not familiar with what's there already. None of it comes with ActiveState. Is there a CPAN view that shows a tree, rather than just searching or listing alphabetically by category?

      It sounds like a good spot, though. Thanks.

      Other languages: It already came up, earlier today. That would be interesting!

      Fetch the two lists: I just make them EXPORT_OK and documented global lists. If there are selectable word lists, then maybe it shouldn't be under EN, but should choose the language at run-time?

      This is the code, BTW, starting on line 528 in the file after the word lists.

      sub _mapone {
          my ($code, $counter) = @_;
          my $list = (($counter & 1) == 0) ? \@two_syllable_words : \@three_syllable_words;
          my $result = $$list[$code];
          die unless defined $result;  # internal error -- can't happen.
          return $result;
      }

      sub list {
          my $x = shift;
          # delete leading zeros
          $x =~ s/^\0+//;
          my $counter = 0;
          return map { _mapone ($_, $counter++) } (unpack ("C*", $x));
      }

      Good point about having an inverse function! The principal use is to compare two lists, not copy a list (I use the NATO alphabet for the latter). I figured a useful "inverse" would be heavy on the user interface. E.g. start typing the first few letters and it jumps to the right spot in a scrolling list.

      Interesting game re constructing sentences. I put in the zipcode at work, as a 4-byte long, and thought I got an error since the output started with "absurd...". The zipcode is "absurd cellulose assume". I put those into Google and eyeball one that's in the right order. So... find existing text that encodes the message, and then refer to it?!

      You also might like regional variations on the word lists, not just whole different languages. Here in Texas, some words have 3 syllables when in other parts of the country they only have 1 or 2.