oko1 has asked for the wisdom of the Perl Monks concerning the following question:

Greeting, oh Holy Brethren in Perl! I've just run into a challenge, and I'm looking for a bit of... let's call it context, since I don't even know whether I'm making a mountain out of a molehill or if I'm underestimating the scope of the thing.

Problem: I've often thought of writing a script that would take a phone number and show the words in it (at least 3 letters in length) - no particular reason, just for the sake of playing around with it. Today, I had a bit of time to spare and dove into it... only to be brought to a screeching halt. Seems the thing isn't quite as simple as it appears at first glance; there are a few rules that need to be followed, plus some visual perception-type stuff and "what makes sense" type of stuff, too.

1) 0s and 1s don't count - they have no letter values - but
they do define where the words start/end. (Optionally, you 
could have a policy of using them in a "leet-speak" manner 
and letting them serve as 'o's and 'i's respectively.)

2) Sure, you could just slam all the digits in a contiguous 
(meaning, no 0s or 1s) number together and convert them to 
/[abc][def][ghi][jkl][mno][pqrs][tuv]/ for '234-5678',
which you would then match against a dictionary - but then, 
how do you get 'film' out of that (valid match for 3456)? 
You can't do /[abc]?[def]?[ghi]?[jkl]?/, etc., because that 
would match, e.g. 'ail' (245) - which isn't a valid combo.

3) Building an iteratively-exhaustive set of regexes to 
cover all valid positional combinations - e.g. 234-5670 
would mean looking at the character combinations for

234567|23456|34567|2345|3456|4567|234|345|456|567

- seems like a really hacky, ugly approach (I mean, we're 
*programming*, right? Supposed to let the computer do this 
kind of work for us and all that?)

Given all of the above, I've been trying to figure out an approach that makes sense and has some readable structure to it... and I've been failing miserably. When I got to the point of actually considering how to build approach #3, I gave up and decided to ask the help of my fellow coders.

Thanks in advance for any help offered.


--
"Language shapes the way we think, and determines what we can think about."
-- B. L. Whorf

Re: Phone number to word conversion
by JavaFan (Canon) on Nov 12, 2010 at 00:02 UTC
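    use 5.010;
    use strict;
    use warnings;

    # Keypad letters, indexed by digit (0 and 1 have none).
    my @map = qw [0 1 abc def ghi jkl mno pqrs tuv wxyz];
    my $words = `cat /usr/share/dict/words`;

    while (<DATA>) {
        chomp;
        s/[^0-9]+//g;    # strip everything that is not a digit
        s/[01]+/ /g;     # 0s and 1s separate the runs of usable digits
        foreach my $_ (split) {
            # Collect every contiguous substring of at least two digits.
            my @c;
            for (my $i = 0; $i < length; $i++) {
                for (my $j = 2; $i + $j <= length; $j++) {
                    push @c, substr $_, $i, $j;
                }
            }
            # Turn each substring into a run of character classes and
            # match the alternation against whole lines of the word list.
            my $pat = join '|', map {my $_ = $_; s/(.)/[$map[$1]]/g; $_} @c;
            say for $words =~ /^($pat)$/mg;
        }
    }

    __DATA__
    234-5678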

      Sweet!!! I just love this:

      my @c;
      for (my $i = 0; $i < length; $i++) {
          for (my $j = 2; $i + $j <= length; $j++) {
              push @c, substr $_, $i, $j;
          }
      }

      *Very* pretty - thank you so much! Exactly the kind of thing I was asking for.

      Incidentally, Perl complains about your "abuse" of $_:

      Attempt to free unreferenced scalar: SV 0x9730848, Perl interpreter: 0x970a008 at /tmp/perm2 line 17, <DATA> line 1.

      but that's easily fixed. Again, thank you - that's a really nifty way to build that permutation list!


        Incidentally, Perl complains about your "abuse" of $_:
        Not if you upgrade away from your old 5.10.0. Even an upgrade to 5.10.1 will fix that issue for you.
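
    For readers on newer perls: lexical `$_` (`my $_`) was removed in Perl 5.24, so the code above no longer compiles there. The following is a minimal sketch of the same substring idea using named variables. The function names (`digit_runs`, `run_to_regex`, `words_in_number`), the three-digit minimum taken from the original question, and the tiny in-line word list are assumptions made for illustration; they are not part of JavaFan's reply.

```perl
use strict;
use warnings;

# Keypad letters indexed by digit; 0 and 1 carry no letters.
my @keys = ('', '', qw(abc def ghi jkl mno pqrs tuv wxyz));

# Return every contiguous run of at least $min digits, after splitting
# the number on 0s and 1s (which act as word boundaries).
sub digit_runs {
    my ($number, $min) = @_;
    $number =~ s/[^0-9]+//g;    # drop dashes, spaces, etc.
    my @runs;
    for my $chunk (grep { length } split /[01]+/, $number) {
        for my $start (0 .. length($chunk) - $min) {
            for my $len ($min .. length($chunk) - $start) {
                push @runs, substr $chunk, $start, $len;
            }
        }
    }
    return @runs;
}

# Turn a digit run into an anchored pattern, one character class per digit.
sub run_to_regex {
    my ($run) = @_;
    my $classes = join '', map { "[$keys[$_]]" } split //, $run;
    return qr/\A$classes\z/;
}

# Return the dictionary words spelled by some run of the number.
sub words_in_number {
    my ($number, $min, @dictionary) = @_;
    my (%seen, @found);
    for my $run (digit_runs($number, $min)) {
        my $re = run_to_regex($run);
        push @found, grep { $_ =~ $re && !$seen{$_}++ } @dictionary;
    }
    return @found;
}

# Hypothetical stand-in dictionary; a real run would read a word list file.
my @dictionary = qw(film beg world corn ail);
print "$_\n" for words_in_number('234-5670', 3, @dictionary);
```

    With this stand-in list, 'beg' (234) and 'film' (3456) are reported, while 'ail' (245) is not, because 2, 4 and 5 are not adjacent in the number.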
Re: Phone number to word conversion
by Anonymous Monk on Nov 11, 2010 at 23:19 UTC
    What I would do is build permutations and test them against a dictionary. I would improve this code by excluding special phone numbers and numbers which can't form English words.

    For example 555 or 123 (any permutation of 123) cannot comprise English words.

    7672676 aka popcorn used to be a special number (at the beep, the time will be 2:30 pm, PST, beeep) and it was 767 followed by any four numbers, not just corn.
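
    One way to test candidates against a dictionary without generating letter combinations at all is to go in the other direction: encode each dictionary word as its keypad digits once, then look up every digit run of the phone number in that index. This is a swapped-in technique, not something proposed in the post above, and the names `word_to_digits`, `build_index` and `lookup` as well as the small word list are hypothetical. A minimal sketch:

```perl
use strict;
use warnings;

# Letter-to-digit table for a standard keypad.
my @keys = ('', '', qw(abc def ghi jkl mno pqrs tuv wxyz));
my %digit_for;
for my $d (2 .. 9) {
    $digit_for{$_} = $d for split //, $keys[$d];
}

# Encode a word as keypad digits; returns undef for words containing
# characters that have no key (apostrophes, accents, ...).
sub word_to_digits {
    my ($word) = @_;
    my $w = lc $word;
    return if $w eq '' || $w =~ /[^a-z]/;
    return join '', map { $digit_for{$_} } split //, $w;
}

# Build an index: digit string => list of words that spell it.
sub build_index {
    my (@words) = @_;
    my %index;
    for my $word (@words) {
        my $digits = word_to_digits($word);
        push @{ $index{$digits} }, $word if defined $digits;
    }
    return \%index;
}

# Look up every contiguous run of at least $min digits; 0s and 1s
# split the number into separate chunks.
sub lookup {
    my ($index, $number, $min) = @_;
    $number =~ s/[^0-9]+//g;
    my @found;
    for my $chunk (split /[01]+/, $number) {
        for my $start (0 .. length($chunk) - $min) {
            for my $len ($min .. length($chunk) - $start) {
                my $run = substr $chunk, $start, $len;
                push @found, @{ $index->{$run} || [] };
            }
        }
    }
    return @found;
}

# Hypothetical word list; a real run would load a dictionary file.
my $index = build_index(qw(popcorn corn pop world beg));
print "$_\n" for lookup($index, '767-2676', 3);
```

    The index is built once and each lookup is a hash access, so the cost per phone number depends on the number of digit runs rather than on the size of the dictionary.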

      So it seems I was groping in the right direction, then - permutation is a reasonable approach. Awesome. Thanks so much for the help!


      "123 can't comprise an English word" ??

      'beg' is a counter example.

        'beg' maps to '234', not '123'.
        I believe the OP is referring to the words that can be spelled out from a telephone keypad or dial (if those still exist), as used when sending SMS text messages, or in the more memorable telephone number/word combinations given out in commercials.

        So "beg" would be 234.

      7672676 aka popcorn used to be a special number (at the beep, the time will be 2:30 pm, PST, beeep) and it was 767 followed by any four numbers, not just corn.
      According to Wikipedia, that was the exchange used for the speaking clock in Northern California. Other parts of the US used different exchanges.
Re: Phone number to word conversion
by aquarium (Curate) on Nov 12, 2010 at 02:43 UTC
    I did this (for fun) a long time ago when I was working with a telco. It was a Unix/Linux one-liner in shell. Without reproducing it exactly: map each digit to a character class for that position containing the corresponding letters (e.g. [abc] if the digit is 2), then grep the resulting regex against the system word list (typically /usr/share/dict/words on Linux). For example, 96753 can become the word WORLD:
    grep "^[wxyz][mno][pqrs][jkl][def]$" /usr/share/dict/words
    Obviously you could do the same thing in Perl.
    the hardest line to type correctly is: stty erase ^H
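
    A rough Perl equivalent of the grep approach described above might look like the sketch below. It only handles the whole-number case, assumes the input consists solely of the digits 2-9, and uses hypothetical names (`digits_to_pattern`, `whole_number_words`) and an in-line word list in place of a dictionary file.

```perl
use strict;
use warnings;

# Keypad letters indexed by digit; 0 and 1 carry no letters.
my @keys = ('', '', qw(abc def ghi jkl mno pqrs tuv wxyz));

# Build one character class per digit, e.g. 96753 => [wxyz][mno][pqrs][jkl][def].
# Assumes the input contains only the digits 2-9.
sub digits_to_pattern {
    my ($digits) = @_;
    return join '', map { "[$keys[$_]]" } split //, $digits;
}

# Return the words that match the whole digit string, ignoring case.
sub whole_number_words {
    my ($digits, @words) = @_;
    my $pattern = digits_to_pattern($digits);
    return grep { /\A$pattern\z/i } @words;
}

print digits_to_pattern('96753'), "\n";
print "$_\n" for whole_number_words('96753', qw(world would words worlds));
```

    To run it against a real word list, the words could be read from a file such as /usr/share/dict/words (where present) and chomped before being passed in.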

      That doesn't work unless the entire string matches a word. So, if you had a number (to use your example) of 967-5355, "world" would be a valid substring within it - but your "grep" line, above, wouldn't show it. Which was the whole reason for my initial post. :)


        If you also want substring matching, merely remove the start and end anchors in the regex. If you instead want all permutations of the number (digits in any order) mapping onto words, it's a little more complicated.
Re: Phone number to word conversion
by Limbic~Region (Chancellor) on Nov 12, 2010 at 21:43 UTC

      Limbic~Region: Thanks for the fun links! I hadn't realized that these kinds of challenges were being posted, or I'd have given them a shot myself. I used to participate in the (sadly defunct) "Perl Quiz of the Week" list that Mark-Jason Dominus ran at one time, and really enjoyed it.

      Please PM me whenever you do get around to posting a solution; I'd be very interested in seeing it.

