in reply to Comparing an array with a regex array of strings?

G'day Ppeoc,

Whenever you're working with CSV data (or similar, e.g. tab-separated), reach for Text::CSV. (If you also have Text::CSV_XS installed, Text::CSV will run faster; note, however, that the XS module isn't a requirement for Text::CSV.)

It looks like you're having to parse some fairly <insert expletive> data. Obviously, Name, Age, etc. would be better as headers (and not repeated with every value). I'll assume you've inherited this; however, if you created it this way, consider reformatting.

The first thing I did was to create a regex from @terms. You can see how I did this programmatically (in the code below). That will work for your 20-odd parameters but, for the three you provided, will look like this:

(Name|Age|Gender) \s+ = \s+ ( \S+ )

When matched, the parameter will be in $1 and the value in $2.
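As a standalone illustration of that step, here is a minimal sketch that builds the pattern from a list of names and tries it on one field. The helper name build_param_re and the use of quotemeta() are my own additions rather than part of the test script; quotemeta() is only a precaution in case a parameter name ever contains a regex metacharacter.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build one alternation from the wanted parameter names.
# build_param_re is a hypothetical helper name; quotemeta() is an
# assumption-driven precaution, not something the posted data needs.
sub build_param_re {
    my @terms = @_;
    my $alternation = join '|', map { quotemeta($_) } @terms;
    # /x lets the pattern be spaced out for readability.
    return qr{ ($alternation) \s+ = \s+ (\S+) }x;
}

my $re = build_param_re(qw{Name Age Gender});

# A single field, as it might come back from the CSV parser.
my $field = ' Name = Lydia';

if ($field =~ $re) {
    print "$1: $2\n";    # parameter in $1, value in $2
}
```

Compiling with qr// once, up front, also means the pattern isn't rebuilt from a string on every match.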

Then, using Text::CSV to parse the data, it was an easy task to check each array element for matches and print the results.

Here's my test:

    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    use Text::CSV;

    my @terms = qw{Name Age Gender};
    my $re = '(' . join('|', @terms) . ') \s+ = \s+ ( \S+ )';

    my $csv = Text::CSV->new();

    while (my $row = $csv->getline(\*DATA)) {
        print "*** $row->[0] ***";
        for (@$row) {
            print "$1: $2" while /$re/gx;
        }
    }

    __DATA__
    Person1, Name = Lydia, Age = 20, Gender = F
    Person2, Name = Carol, Age = 54, Profession = Student, Gender = F, Height = 4'8
    Person3, Name = Andy, Age = 37, Location = USA, Gender = M, Weight = 117
    Person4, Name = Nick, Age = 28, Gender = M

Output:

    *** Person1 ***
    Name: Lydia
    Age: 20
    Gender: F
    *** Person2 ***
    Name: Carol
    Age: 54
    Gender: F
    *** Person3 ***
    Name: Andy
    Age: 37
    Gender: M
    *** Person4 ***
    Name: Nick
    Age: 28
    Gender: M

Note that this works with the sample data you posted. If it isn't representative of the real data, you may need to make changes (probably just to the regex) to what I have here.
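To give an idea of the kind of regex change I mean, here is a hedged sketch of a looser pattern. It assumes (and this is only a guess about your real data) that values might contain spaces, that the spaces around "=" might be missing, and that a parameter such as "FirstName" might exist, which an unanchored pattern would also match. The fields in the loop are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @terms = qw{Name Age Gender};
my $names = join '|', map { quotemeta($_) } @terms;

# Anchored to the whole field: optional spaces around '=', and the
# value runs to the end of the field with trailing spaces dropped.
my $loose_re = qr{ \A \s* ($names) \s* = \s* (.*?) \s* \z }x;

# Hypothetical fields, not taken from the posted data.
for my $field (' Name = Mary Ann', ' Age=20', ' FirstName = Bob') {
    print "$1: $2\n" if $field =~ $loose_re;
}
```

Because Text::CSV has already split the line on commas, anchoring each field with \A and \z is safe here; the "FirstName" field is skipped since the name must start the field.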

— Ken