Ppeoc has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks! Please help this perl novice here. I have a sample csv file in the following format
Person1, Name = Lydia, Age = 20, Gender = F
Person2, Name = Carol, Age = 54, Profession = Student, Gender = F, Height = 4'8
Person3, Name = Andy, Age = 37, Location = USA, Gender = M, Weight = 117
Person4, Name = Nick, Age = 28, Gender = M
I need to parse the values of Name, Age and Gender of one particular person. (In reality, I have 20 such parameters to parse among other junk.) The parameters for each person are in no particular order. I am not sure how to compare @terms with @array to get the values Carol, 54 and F. Any suggestions would really help me. Thank you fellow monks!
my @terms = qw(Name Age Gender);
my $match = "Person2";
while (<$file>) {
    chomp;
    if (/$match/){
        @array= split(/,/);
        #What next???
    }
}

Replies are listed 'Best First'.
Re: Comparing an array with a regex array of strings?
by kcott (Archbishop) on Dec 17, 2015 at 06:36 UTC

    G'day Ppeoc,

    Whenever you're working with CSV data (or similar, e.g. tab-separated), reach for Text::CSV. (If you also have Text::CSV_XS installed, Text::CSV will run faster; note, though, that the XS module isn't a requirement for Text::CSV.)

    It looks like you're having to parse some fairly <insert expletive> data. Obviously, Name, Age, etc. would be better as headers (and not repeated with every value). I'll assume you've inherited this; however, if you created it this way, consider reformatting.

    The first thing I did was to create a regex from @terms. You can see how I did this programmatically (in the code below). That will work for your 20-odd parameters but, for the three you provided, will look like this:

    (Name|Age|Gender) \s+ = \s+ ( \S+ )

    When matched, the parameter will be in $1 and the value in $2.

    Then, using Text::CSV to parse the data, it was an easy task to check each array element for matches and print the results.

    Here's my test:

    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    use Text::CSV;

    my @terms = qw{Name Age Gender};
    my $re = '(' . join('|', @terms) . ') \s+ = \s+ ( \S+ )';

    my $csv = Text::CSV->new();

    while (my $row = $csv->getline(\*DATA)) {
        print "*** $row->[0] ***";
        for (@$row) {
            print "$1: $2" while /$re/gx;
        }
    }

    __DATA__
    Person1, Name = Lydia, Age = 20, Gender = F
    Person2, Name = Carol, Age = 54, Profession = Student, Gender = F, Height = 4'8
    Person3, Name = Andy, Age = 37, Location = USA, Gender = M, Weight = 117
    Person4, Name = Nick, Age = 28, Gender = M

    Output:

    *** Person1 ***
    Name: Lydia
    Age: 20
    Gender: F
    *** Person2 ***
    Name: Carol
    Age: 54
    Gender: F
    *** Person3 ***
    Name: Andy
    Age: 37
    Gender: M
    *** Person4 ***
    Name: Nick
    Age: 28
    Gender: M

    Note that this works with the sample data you posted. If it isn't representative of the real data, you may need to make changes (probably just to the regex) to what I have here.
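As one illustration of the kind of regex change that might be needed, here is a minimal sketch, assuming the real data may omit the spaces around "=" or contain values with internal spaces. The `parse_field` helper, the `quotemeta` precaution, and the sample fields are all assumptions made for this example, not part of the original code; unlike the loop above, it expects one "key = value" pair per field.

```perl
use strict;
use warnings;

my @terms = qw{Name Age Gender};

# quotemeta guards against regex metacharacters in a term (an added precaution).
my $alt = join '|', map { quotemeta } @terms;

# ^\s*    anchors the key at the start of the field, so "Surname" is not
#         mistaken for "Name";
# \s*     allows missing spaces around "=";
# (.*\S)  keeps internal spaces in the value and drops trailing ones.
my $re = qr/^ \s* ($alt) \s* = \s* (.*\S)/x;

# Hypothetical helper: returns (key, value) for a wanted field, else an empty list.
sub parse_field {
    my ($field) = @_;
    return $field =~ $re ? ($1, $2) : ();
}

# Made-up sample fields, chosen to exercise the looser pattern.
for my $field ('Name=Mary Ann ', ' Age = 54', 'Surname = Smith') {
    my ($key, $value) = parse_field($field) or next;
    print "$key: $value\n";
}
```

Whether this is looser or stricter than you need depends entirely on what the real file looks like.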

    — Ken

Re: Comparing an array with a regex array of strings?
by NetWallah (Canon) on Dec 17, 2015 at 04:38 UTC
    You would normally store the result in a HASH.

    In this case, you could "split" each element of the array on " = ", to separate them.
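A minimal sketch of that idea, using only core Perl. The `parse_person` helper and the `%person` hash are illustrative names, not from the original post, and the sketch assumes that every field after the first has the form "key = value" and that no value contains a comma.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line into (id, hashref of key => value).
sub parse_person {
    my ($line) = @_;
    chomp $line;

    # The first field is the person id; the rest are "key = value" pairs.
    my ($id, @pairs) = split /\s*,\s*/, $line;

    my %person;
    for my $pair (@pairs) {
        # Limit of 2 keeps any later "=" inside the value.
        my ($key, $value) = split /\s*=\s*/, $pair, 2;
        $person{$key} = $value;
    }
    return ($id, \%person);
}

my @terms = qw(Name Age Gender);
my $match = 'Person2';

my $line = "Person2, Name = Carol, Age = 54, Profession = Student, Gender = F, Height = 4'8";
my ($id, $person) = parse_person($line);

# A hash slice pulls out just the wanted terms, in @terms order.
print join(', ', @{$person}{@terms}), "\n" if $id eq $match;
```

Because the lookup goes through the hash, the order in which the parameters appear on the line does not matter.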

    We can help you further after you try these suggestions: show us what you tried, let us know how it works for you, and tell us what result you expect.

            "I can cast out either one of your demons, but not both of them." -- the XORcist

Re: Comparing an array with a regex array of strings?
by vinoth.ree (Monsignor) on Dec 17, 2015 at 05:43 UTC

    The following code builds a hash of hashes: the first column (Person*) is the outer key, and the remaining key/value pairs form the inner hash. You can then look up the person you want by the outer key and read off the values.

    use strict;
    use warnings;
    use Data::Dumper;

    my %temp=();
    my %main_struc=();
    my $match='Person2';

    while (my $line = <DATA>) {
        chomp $line;
        my @fields = split ("," , $line,2);
        my @temp=split(',',$fields[1]);
        foreach my $data (@temp){
            my($key,$value)=split('=',$data);
            $key =~ s/^\s+//;
            $key =~ s/\s+$//;
            $main_struc{$fields[0]}{$key}=$value
        }
    }

    print Dumper \%main_struc;
    print "@{$main_struc{$match}}{'Name','Age','Gender'}"

    All is well. I learn by answering your questions...

      There's no need to strip off leading and trailing spaces if you split using a regex with optional whitespace around your delimiters. I get around the need for a %temp hash by using cascading maps.

      use strict;
      use warnings;

      use Data::Dumper;

      open my $csvFH, q{<}, \ <<EOD or die $!;
      Person1, Name = Lydia, Age = 20, Gender = F
      Person2, Name = Carol, Age = 54, Profession = Student, Gender = F, Height = 4'8
      Person3, Name = Andy, Age = 37, Location = USA, Gender = M, Weight = 117
      Person4, Name = Nick, Age = 28, Gender = M
      EOD

      my %people =
          map { $_->[ 0 ], { map { split m{\s*=\s*} } split m{\s*,\s*}, $_->[ 1 ] } }
          map { chomp; [ split m{\s*,\s*}, $_, 2 ] }
          <$csvFH>;

      print Data::Dumper->Dumpxs( [ \ %people ], [ qw{ *people } ] );

      I hope this is of interest.

      Cheers,

      JohnGG

      Thank you! This is a very nice way to do it
Re: Comparing an array with a regex array of strings? -- oneliner
by Discipulus (Canon) on Dec 17, 2015 at 09:29 UTC
    Follow the wiser advice, Ppeoc, but be aware of the one-liner solution (pay attention to the Win32 double quotes).
    It is possible, and simple, if you hardcode your search:
    perl -lanF", " -e "print join(',',map{/^(?:Name|Age|Gender) = (.+)/?$1:()}@F) if shift @F eq 'Person1'" persons.txt
    Lydia,20,F
    It is also possible to parametrize the one-liner with a BEGIN block that pops two parameters off @ARGV, leaving only the file name there for perl -lan to read. A custom delimiter for autosplit is specified with -F", ".
    perl -lanF", " -e "BEGIN{map{$_=pop @ARGV}$rx,$who}print join(',',map{/^(?:$rx) = (.+)/?$1:()}@F) if shift @F eq $who" persons.txt Person1 "Name|Age|Gender"
    Lydia,20,F
    All these gems and much more are shown in perlrun.

    Hth and have fun!
    L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.