volcan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all. I'm a perl noob and having difficulty with a simple program to combine the textual phone lists.

I'm trying to combine multiple lists into one master list. Since different lists will have parentheses around the area code, or use dashes, etc. and be formatted differently, I'm stripping the phone numbers of those extraneous symbols and just ending with the client names and phone numbers, e.g., Wanda Smith (332)432-9887 should become Wanda Smith 3324329887.

Duplicates should show the name followed by the phone numbers separated by a comma.

After that I want to have a list of those clients who were on more than one list, and show their phone numbers.

My code is doing most of what I want (using a hash of a hash), but I'm having trouble making a list of those clients who are on more than one list. I want to print out those clients' names and phone numbers (separated by a comma).

Here are the 3 test files I'm using (separated by blank lines):

Wanda Smith (332)432-9887
Freddie Sonere 442-9089
Bob Wilson 332-454-9932

Bob Cartwright (821)987-0089
Felix Harris 433.344.8711
Bob Wilson (888)821- 2248

Sam Burr 455-0327
Gloria Simpson (821)544-9341
Wanda Smith 444-9721

and here's my code so far for making the master list:

#combine client lists into one list
#strip all extraneous symbols
#! usr/bin/perl -w
while (<>) {
    chomp($inp = $_);
    $client = $inp;
    $client =~ tr/0-9().-//d;
    $num = $inp;
    $num =~ tr/a-zA-Z().-//d;
    $num =~ tr/ //d;
    #make big list
    $bigList{$client}{$num}++;
}

for $one (sort keys %bigList) {
    print "$one ", join ', ', sort {$a <=> $b} keys %{$bigList {$one}};
    print "\n";
}

Thank you very much to anyone who can help me. :)
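
The stripping step described in the question (turning "Wanda Smith (332)432-9887" into "Wanda Smith 3324329887") could also be done by splitting each line once with a regex instead of running `tr` over two copies of the line. The following is only a sketch: `parse_entry` is a hypothetical helper, not something from the posted code, and it assumes that names never contain digits or an opening parenthesis. One side effect is that dots and hyphens inside a name are kept, whereas `tr/0-9().-//d` would delete them.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split one list line into a
# trimmed name and a digits-only phone number.
# Assumption: the name is everything before the first digit or "(".
sub parse_entry {
    my ($line) = @_;
    my ($name, $rest) = $line =~ /^([^\d(]+)(.*)$/ or return;
    $name =~ s/^\s+|\s+$//g;            # trim leading and trailing whitespace
    (my $num = $rest) =~ tr/0-9//cd;    # delete every character that is not a digit
    return ($name, $num);
}

my ($name, $num) = parse_entry('Wanda Smith (332)432-9887');
print "$name $num\n";    # prints "Wanda Smith 3324329887"
```

A line that does not start with name-like text makes `parse_entry` return an empty list, so a caller can skip it with `next unless defined $name`.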

Re: Combining client phone lists
by GrandFather (Saint) on Apr 20, 2007 at 04:46 UTC

    I strongly recommend using strictures (use strict; use warnings;), although that is not what is biting you here. In fact your code performs as I would expect (although normalizing whitespace in names would be a good idea - 2 entries for Bob Wilson):

    use strict;
    use warnings;

    my %bigList;

    while (<DATA>) {
        chomp $_;
        next unless length;
        my $inp = $_;
        my $client = $inp;
        my $num = $inp;
        $client =~ tr/0-9().-//d;
        $num =~ tr/ a-zA-Z().-//d;
        #make big list
        $bigList{$client}{$num}++;
    }

    for my $one (sort keys %bigList) {
        print "$one ", join ', ', sort {$a <=> $b} keys %{$bigList{$one}};
        print "\n";
    }

    __DATA__
    Wanda Smith (332)432-9887
    Freddie Sonere 442-9089
    Bob Wilson 332-454-9932
    Bob Cartwright (821)987-0089
    Felix Harris 433.344.8711
    Bob Wilson (888)821- 2248
    Sam Burr 455-0327
    Gloria Simpson (821)544-9341
    Wanda Smith 444-9721

    Prints:

    Bob Cartwright 8219870089
    Bob Wilson 3324549932
    Bob Wilson 8888212248
    Felix Harris 4333448711
    Freddie Sonere 4429089
    Gloria Simpson 8215449341
    Sam Burr 4550327
    Wanda Smith 4449721, 3324329887

    DWIM is Perl's answer to Gödel
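
The two Bob Wilson entries come from the leftover whitespace: after `tr/0-9().-//d`, the line "Bob Wilson 332-454-9932" leaves the key "Bob Wilson" plus one trailing space, while "Bob Wilson (888)821- 2248" leaves two trailing spaces because of the space inside the number. A minimal sketch of the whitespace normalization suggested above follows; `normalize_name` is a hypothetical helper name, and collapsing inner runs of spaces is an extra assumption beyond what the reply asks for.

```perl
use strict;
use warnings;

# Hypothetical helper: trim both ends and collapse inner whitespace so that
# the same name always produces the same hash key.
sub normalize_name {
    my ($name) = @_;
    $name =~ s/^\s+|\s+$//g;
    $name =~ s/\s+/ /g;
    return $name;
}

my %bigList;
for my $line ('Bob Wilson 332-454-9932', 'Bob Wilson (888)821- 2248') {
    (my $client = $line) =~ tr/0-9().-//d;    # leaves one or two trailing spaces
    (my $num    = $line) =~ tr/0-9//cd;       # keep digits only
    $bigList{ normalize_name($client) }{$num}++;
}

for my $client (sort keys %bigList) {
    print "$client ", join(', ', sort { $a <=> $b } keys %{ $bigList{$client} }), "\n";
}
# prints "Bob Wilson 3324549932, 8888212248"
```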
      Thanks for the tip about "normalizing whitespace." The input is supposed to be three different input files (to simulate three salesmen's phone lists) and not all part of one file.

      I probably didn't explain that well enough. When I run the code, I'm not getting Bob Wilson but once. I don't know why he'd show up twice in yours...perl sure is a tricky lang.

      Here's the output I'm getting:

      Bob Cartwright 8219870089
      Bob Wilson 3324549932, 8888212248
      Felix Harris 4333448711
      Freddie Sonere 4429089
      Gloria Simpson 8215449341
      Sam Burr 4550327
      Wanda Smith 4449721, 3324329887


      How do I spy on the input to recognize when a client name has been read with more than one phone number and output a list of those?

      For example, if I have Gloria Simpson showing up in one list w/ her home number and in another list with her cell number, how do I make a list of her (and others w/ multiple ph. numbers) like this:

      Gloria Simpson 888324543, 7338733
      Someone Else 44409323, 4332111


      'Preciate your help.
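
One possible way to get the list asked for here is to leave the reading loop alone and filter the finished hash: a client has more than one phone number exactly when the inner hash has more than one key. This is only a sketch. The sample `%bigList` below is hand-built in the same name => { number => count } shape as in the thread, and `@multi` is a name introduced for illustration.

```perl
use strict;
use warnings;

# Sample data shaped like %bigList in the thread: name => { number => count }.
my %bigList = (
    'Bob Wilson'     => { '3324549932' => 1, '8888212248' => 1 },
    'Gloria Simpson' => { '8215449341' => 1 },
    'Wanda Smith'    => { '3324329887' => 1, '4449721'    => 1 },
);

# keys in numeric context yields the count of distinct numbers per client.
my @multi = grep { keys %{ $bigList{$_} } > 1 } sort keys %bigList;

for my $client (@multi) {
    print "$client ", join(', ', sort { $a <=> $b } keys %{ $bigList{$client} }), "\n";
}
# prints:
# Bob Wilson 3324549932, 8888212248
# Wanda Smith 4449721, 3324329887
```

Because the filter runs after all input has been read, it does not matter which file a number came from or in what order the files were given.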

        Ok, three files, Bob Wilson white space fixed and it still works as I'd expect and, as far as I can tell, in the fashion you desire. Where's the problem?

        use strict;
        use warnings;

        open TEMP, '>', 'temp1.txt';
        print TEMP <<FILE;
        Wanda Smith (332)432-9887
        Freddie Sonere 442-9089
        Bob Wilson 332-454-9932
        FILE
        close TEMP;

        open TEMP, '>', 'temp2.txt';
        print TEMP <<FILE;
        Bob Cartwright (821)987-0089
        Felix Harris 433.344.8711
        Bob Wilson (888)821- 2248
        FILE
        close TEMP;

        open TEMP, '>', 'temp3.txt';
        print TEMP <<FILE;
        Sam Burr 455-0327
        Gloria Simpson (821)544-9341
        Wanda Smith 444-9721
        FILE
        close TEMP;

        my %bigList;

        local @ARGV = qw(temp1.txt temp2.txt temp3.txt);
        while (<>) {
            chomp $_;
            next unless length;
            my $inp = $_;
            my $client = $inp;
            my $num = $inp;
            $client =~ tr/0-9().-//d;
            $client =~ s/\s*$//;
            $num =~ tr/ a-zA-Z().-//d;
            #make big list
            $bigList{$client}{$num}++;
        }

        for my $one (sort keys %bigList) {
            print "$one ", join ', ', sort {$a <=> $b} keys %{$bigList{$one}};
            print "\n";
        }

        Prints:

        Bob Cartwright 8219870089
        Bob Wilson 3324549932, 8888212248
        Felix Harris 4333448711
        Freddie Sonere 4429089
        Gloria Simpson 8215449341
        Sam Burr 4550327
        Wanda Smith 4449721, 3324329887

        DWIM is Perl's answer to Gödel
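
The original question also asks for clients who were "on more than one list", which is not quite the same as having more than one number: someone could appear on two lists with the same number. A rough sketch of tracking the source of each entry follows. `clients_on_multiple_lists` is a hypothetical helper that takes (source label, line) pairs so it can run without files; inside a `while (<>)` loop the label could be `$ARGV`, which Perl sets to the name of the file currently being read.

```perl
use strict;
use warnings;

# Hypothetical helper: given [source, line] pairs, return a hash reference
# mapping each client seen in more than one source to their sorted numbers.
sub clients_on_multiple_lists {
    my (@records) = @_;
    my (%numbers, %sources);
    for my $rec (@records) {
        my ($source, $line) = @$rec;
        (my $client = $line) =~ tr/0-9().-//d;    # same stripping as the thread
        $client =~ s/^\s+|\s+$//g;                # trim
        $client =~ s/\s+/ /g;                     # collapse inner whitespace
        (my $num = $line) =~ tr/0-9//cd;          # digits only
        next unless length($client) && length($num);
        $numbers{$client}{$num}++;
        $sources{$client}{$source}++;
    }
    my %multi;
    for my $client (grep { keys %{ $sources{$_} } > 1 } keys %sources) {
        $multi{$client} = [ sort { $a <=> $b } keys %{ $numbers{$client} } ];
    }
    return \%multi;
}

my $multi = clients_on_multiple_lists(
    [ 'temp1.txt', 'Wanda Smith (332)432-9887' ],
    [ 'temp1.txt', 'Bob Wilson 332-454-9932' ],
    [ 'temp2.txt', 'Bob Wilson (888)821- 2248' ],
    [ 'temp3.txt', 'Sam Burr 455-0327' ],
    [ 'temp3.txt', 'Wanda Smith 444-9721' ],
);
print "$_ ", join(', ', @{ $multi->{$_} }), "\n" for sort keys %$multi;
# prints:
# Bob Wilson 3324549932, 8888212248
# Wanda Smith 4449721, 3324329887
```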
Re: Combining client phone lists
by Corion (Patriarch) on Apr 20, 2007 at 07:04 UTC
Re: Combining client phone lists
by Krambambuli (Curate) on Apr 20, 2007 at 08:53 UTC
    Just a side note: I've just checked, and telephone-number patterns are still on the to-do list (from the docs:
    Future releases of the module will also provide patterns for the following:
      * email addresses
      * HTML/XML tags
      * more numerical matchers,
      * mail headers (including multiline ones),
      * more URLS
      * telephone numbers of various countries
      * currency (universal 3 letter format, Latin-1, currency names)
      * dates
      * binary formats (e.g. UUencoded, MIMEd)
    )
    but when working on regexps it's still worth taking a look at Regexp::Common. With a bit of luck, what you need is already done, and to a high standard.