sophix has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys,

I am trying to parse a text file containing customer information in the following format:

AB1\tA{Daniel Wright}\sA{Jack Smith}\sB{Jane Goodwin}
QW1\tA{Samantha Patton}\sC{Timothy Eeckles}
AR2\tA{Jane Goodwin}

Each line may contain a different number of customers, which has made it difficult for me to write a regex that captures all of them.

I would like to put all the names into an array and then print them out along with the number of times a customer's name is mentioned in the file:

Jane Goodwin 2
Daniel Wright 1
Jack Smith 1
Samantha Patton 1
Timothy Eeckles 1

I used the following regex to extract the names and put them into an array, but it captures only the first customer on each line.

if ($line =~ m/{(.*?)}/g) {
    $customer_name = $1;
    push (@Customer_List, $customer_name);
}

How can I modify this regex to match more than one instance, i.e., multiple customer names?

Thank you

Replies are listed 'Best First'.
Re: Matching multiple {hits}
by roboticus (Chancellor) on Feb 13, 2011 at 19:13 UTC

    sophix:

    I've not tried it, but I think it's as easy as:

    my @names = ($line =~ m/{(.*?)}/g);
    if (@names) {
        push @Customer_list, @names;
    }

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      and that's the same as

      push @Customer_list, $line =~ m/{(.*?)}/g;

      Hi roboticus,

      It works very nicely, thank you very much. So I guess I should also have used an array when matching, not only when pushing the matched names onto an array.
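What changed between the question's code and the reply is the context in which `m//g` is evaluated. The sketch below is only an illustration: the helper names `names_list_context` and `names_scalar_context` are made up for this example, and the brace is escaped as `\{` for safety on newer Perl versions. It shows that a list assignment collects every capture at once, while a `while` loop in scalar context walks through the matches one by one.

```perl
use strict;
use warnings;

# Hypothetical helper: in list context, m//g returns every capture at once.
sub names_list_context {
    my ($line) = @_;
    my @names = $line =~ m/\{(.*?)\}/g;
    return @names;
}

# Hypothetical helper: in scalar context, each evaluation of m//g finds
# the next match, so a while loop is needed to visit all of them.
sub names_scalar_context {
    my ($line) = @_;
    my @names;
    while ($line =~ m/\{(.*?)\}/g) {
        push @names, $1;
    }
    return @names;
}

# First sample line from the question, with \t and \s kept as literal text.
my $line = 'AB1\tA{Daniel Wright}\sA{Jack Smith}\sB{Jane Goodwin}';

print join(', ', names_list_context($line)),   "\n";
print join(', ', names_scalar_context($line)), "\n";
```

Both calls should print the same three names; which form to use is mostly a matter of whether you need to do extra work per match.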

Re: Matching multiple {hits}
by CountZero (Bishop) on Feb 13, 2011 at 20:36 UTC
    In essence, one line is all you need (once you have set up all the variables and such):
    use Modern::Perl;
    use Data::Dumper;

    my $text = join '', <DATA>;
    my %customers;
    $customers{$_}++ for $text =~ m/{(.*?)}/g;
    say Dumper(\%customers);

    __DATA__
    AB1\tA{Daniel Wright}\sA{Jack Smith}\sB{Jane Goodwin}
    QW1\tA{Samantha Patton}\sC{Timothy Eeckles}
    AR2\tA{Jane Goodwin}
    Output:
    $VAR1 = {
              'Jane Goodwin' => 2,
              'Daniel Wright' => 1,
              'Timothy Eeckles' => 1,
              'Jack Smith' => 1,
              'Samantha Patton' => 1
            };

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Matching multiple {hits}
by ikegami (Patriarch) on Feb 13, 2011 at 20:16 UTC
    «if (//g)» not only makes no sense conceptually ("Check if it matches, then check again to make sure"?), it's buggy. You shouldn't see that in your program.
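One way to see the bug described here: in scalar context, a successful `m//g` leaves a match position on the variable (readable via `pos`), so the next `m//g` test on the same variable resumes from there instead of starting over. The sketch below is an illustration under that reading; the helper names `repeated_if_with_g` and `repeated_if_without_g` are hypothetical, and the brace is escaped as `\{` for safety on newer Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: run the same "if (m//g)" test several times on one
# variable. Each success moves pos($line) forward; a failure resets it.
sub repeated_if_with_g {
    my ($line, $times) = @_;
    my @seen;
    for my $attempt (1 .. $times) {
        if ($line =~ m/\{(.*?)\}/g) {
            push @seen, $1;
        }
        else {
            push @seen, '(no match)';
        }
    }
    return @seen;
}

# Hypothetical helper: the same test without /g always starts at the
# beginning of the string, so it finds the first name every time.
sub repeated_if_without_g {
    my ($line, $times) = @_;
    my @seen;
    for my $attempt (1 .. $times) {
        push @seen, $line =~ m/\{(.*?)\}/ ? $1 : '(no match)';
    }
    return @seen;
}

my $line = 'AB1\tA{Daniel Wright}\sA{Jack Smith}';

print join(' | ', repeated_if_with_g($line, 3)),    "\n";
print join(' | ', repeated_if_without_g($line, 3)), "\n";
```

With `/g`, the three tests give a different answer each time, including a failed match on a line that clearly contains names. Without `/g`, the test is stable. So a plain existence check reads better without `/g`, and collecting all matches calls for list context or a `while` loop.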

Re: Matching multiple {hits}
by Anonyrnous Monk (Hermit) on Feb 13, 2011 at 19:20 UTC
    #!/usr/bin/perl -w
    use strict;

    my %customers;

    while (my $line = <DATA>) {
        while ($line =~ m/{(.*?)}/g) {
            my $customer_name = $1;
            $customers{ $customer_name }++;
        }
    }

    use Data::Dumper;
    print Dumper \%customers;

    __DATA__
    AB1\tA{Daniel Wright}\sA{Jack Smith}\sB{Jane Goodwin}
    QW1\tA{Samantha Patton}\sC{Timothy Eeckles}
    AR2\tA{Jane Goodwin}

    $VAR1 = {
              'Jane Goodwin' => 2,
              'Daniel Wright' => 1,
              'Timothy Eeckles' => 1,
              'Jack Smith' => 1,
              'Samantha Patton' => 1
            };
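The replies dump the counts with Data::Dumper, while the question asked for plain "name count" lines. A minimal sketch of that last step follows. It assumes the order in the question's sample output means "most frequent first, ties broken alphabetically", which the question does not state. The helper names `count_customers` and `report_lines` are hypothetical, and the brace is escaped as `\{` for safety on newer Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: count every name found inside {...} across all lines.
sub count_customers {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        $count{$_}++ for $line =~ m/\{(.*?)\}/g;
    }
    return %count;
}

# Hypothetical helper: build "Name count" lines, highest count first,
# with ties broken alphabetically (an assumed ordering).
sub report_lines {
    my (%count) = @_;
    return map  { "$_ $count{$_}" }
           sort { $count{$b} <=> $count{$a} or $a cmp $b }
           keys %count;
}

# Sample data from the question, with \t and \s kept as literal text.
my @lines = (
    'AB1\tA{Daniel Wright}\sA{Jack Smith}\sB{Jane Goodwin}',
    'QW1\tA{Samantha Patton}\sC{Timothy Eeckles}',
    'AR2\tA{Jane Goodwin}',
);

print "$_\n" for report_lines(count_customers(@lines));
```

In a real script the lines would come from a filehandle rather than a hard-coded list; the hash-based counting is the same as in the replies above.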