NewMonk2Perl has asked for the wisdom of the Perl Monks concerning the following question:

Good afternoon, PerlMonks gurus. I cannot figure out a way to extract all the data between the parentheses and also all the data before the parentheses. I then want to rearrange it and print it to a file as shown below:

#!/usr/bin/perl
use warnings;
use strict;

open my $NEW_FAM, '>', 'C:/Scripts/TEST/History_FAM_Test.txt' or die "Could not open target file. $!";

my $line1 = "Positive for Depression ( mother ; sister ), Type 2 Diabetes ( father ; mother ; grandparents ) and Anxiety ( mother) .";
my $line2 = "Cancer ( mother, grandmother )";

push(my @test, $line1);
push(@test, $line2);

for my $test (@test) {
    if ($test =~ /\(/) {
        print "Has () ... \n";
        $test =~ s/Positive\sfor\s//;
        my @var3 = $test =~ /\((.*)\)/g;
        # Extract and then separate into different rows based on the family members inside the parenthesis
        #print into the file with pipes as separators
    }
}

Below is how the output in the file should look:

Output:
mother || Depression
sister || Depression
father || Type 2 Diabetes
mother || Type 2 Diabetes
grandparents || Type 2 Diabetes
mother || Anxiety #Notice the 'and' is stripped out
mother || Cancer
grandmother || Cancer

Thanks in advance!!!

Replies are listed 'Best First'.
Re: Extracting multiple match and reorganizing order with Regex
by jbuck (Novice) on Apr 19, 2016 at 00:01 UTC

    Hello,

    You should take advantage of the regex capture variables, $1, $2, etc.

    Try something like:

    foreach my $test (@test) {
        my ($condition,$persons) = ($2,$3) if($test =~ /(positive for )?(\w+).*\((.*)\)/i); # $1 capture will either be "positive for" or empty
        $persons =~ s/\s+//g; # omit spaces
        $persons =~ s/,/;/g; # changes any commas to semi-colons to have a common separator
        my @persons = split /;/, $persons;
        foreach my $person (@persons) {
            print "$person\t|| $condition\n";
        }
    }

      Thanks for the reply! It looks like your code omits a person from the 1st set of parentheses and skips the 2nd and 3rd sets of parentheses in the first list. I am assuming it is the 3rd line?
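
      A possible explanation, offered as a sketch rather than a definitive fix: in the pattern above, the greedy .* before \( runs forward to the last opening parenthesis on the line, (\w+) captures only a single word, and without /g the pattern is applied once per line. On the first sample line that would pair "Depression" with only the "mother" from the final group. The example below shows one way to walk every group with a /g loop and a character class that cannot cross a parenthesis. The helper name extract_rows is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns "person || condition"
# rows for one input line, in the order they appear.
sub extract_rows {
    my ($line) = @_;
    my @rows;

    # [^(),] and [^()] cannot cross a parenthesis, so each pass of the /g
    # loop stops at the nearest group instead of running to the last one.
    while ($line =~ /\s*([^(),]+?)\s*\(\s*([^()]*?)\s*\)/g) {
        my ($condition, $persons) = ($1, $2);

        # Drop the lead-in words "Positive for" and "and".
        $condition =~ s/^(?:Positive\s+for|and)\s+//i;

        # Accept either ; or , as the separator between family members.
        push @rows, "$_ || $condition" for split /\s*[;,]\s*/, $persons;
    }
    return @rows;
}

my $line = "Positive for Depression ( mother ; sister ), Type 2 Diabetes ( father ; mother ; grandparents ) and Anxiety ( mother) .";
print "$_\n" for extract_rows($line);
```

      For the sample line this prints the six rows in the order requested in the question, from "mother || Depression" through "mother || Anxiety".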

Re: Extracting multiple match and reorganizing order with Regex
by stevieb (Canon) on Apr 19, 2016 at 00:56 UTC

    There are quite a few assumptions being made so it leads to excess code, especially since I don't have much time to really focus and golf it down, but give this a whirl:

    use warnings;
    use strict;

    my $line1 = "Positive for Depression ( mother ; sister ), Type 2 Diabetes ( father ; mother ; grandparents ) and Anxiety ( mother) .";
    my $line2 = "Cancer ( mother, grandmother )";

    my %issues;

    for my $line (($line1, $line2)){
        $line =~ s/(?:Positive for | and )//g;
        $line =~ s/,/;/g;

        # remove a semi-colon, if it immediately follows a closing
        # parens. This ; is the comma after "; sister )" that we
        # turned into a semi-colon
        $line =~ s/(?<=\));//g;

        if (my @entries = $line =~ /.*?\(.*?\)/g){
            for (@entries){
                /(.*)\((.*)\)/;
                my $issue = $1;
                my $people = $2;
                $issue =~ s/^\s+//g; # squash leading whitespace
                $people =~ s/\s+//g; # squash all whitespace
                @{ $issues{$issue} } = split /;/, $people;
            }
        }
    }

    for my $issue (keys %issues){
        for my $person (@{ $issues{$issue} }){
            print "$person || $issue\n";
        }
    }
    __END__
    mother || Cancer
    grandmother || Cancer
    father || Type 2 Diabetes
    mother || Type 2 Diabetes
    grandparents || Type 2 Diabetes
    mother || Depression
    sister || Depression
    mother || Anxiety

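    The output above is grouped correctly but not in the order the question asked for, which is consistent with Perl not guaranteeing any particular order for keys on a hash. If the order matters, one common approach is to keep an array of first-seen keys next to the hash. The sketch below assumes the issue names and people have already been extracted. The names add_issue and @order are hypothetical and not part of the reply above.

```perl
use strict;
use warnings;

my (%issues, @order);

# Hypothetical helper (not from the thread): store the people for an issue
# and remember the order in which each issue was first seen.
sub add_issue {
    my ($issue, @people) = @_;
    push @order, $issue unless exists $issues{$issue};
    push @{ $issues{$issue} }, @people;   # append, so repeated issues are kept
}

add_issue('Depression',      qw(mother sister));
add_issue('Type 2 Diabetes', qw(father mother grandparents));
add_issue('Anxiety',         'mother');
add_issue('Cancer',          qw(mother grandmother));

# Iterate over @order instead of keys %issues to get input order back.
for my $issue (@order) {
    print "$_ || $issue\n" for @{ $issues{$issue} };
}
```

    Appending with push, rather than assigning to the array, also keeps earlier entries if the same issue shows up on more than one line.
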
Re: Extracting multiple match and reorganizing order with Regex
by brilant_blue (Beadle) on Apr 19, 2016 at 01:15 UTC

    Script:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use diagnostics;
    use feature 'say';

    while (<DATA>) {
        chomp;
        if ($_ ne q{}) {
            s/Positive\sfor\s//;
            while (/
                (?:and\s)?
                ((?:\w+\s)+)    # Remember disease name
                \(
                ([^\)]+)        # Remember list of family members
                \)
            /xg) {
                my ($disease, @persons) = ($1, map { s/^\s+//; s/\s+$//; $_; } split /;|,/, $2);
                $disease =~ s/\s$//;
                foreach my $person (@persons) {
                    say "$person || $disease";
                }
            }
        }
    }
    close DATA or die 'Could not close DATA: ', $!;

    __DATA__
    Positive for Depression ( mother ; sister ), Type 2 Diabetes ( father ; mother ; grandparents ) and Anxiety ( mother) .
    Cancer ( mother, grandmother )

    Output:

    mother || Depression
    sister || Depression
    father || Type 2 Diabetes
    mother || Type 2 Diabetes
    grandparents || Type 2 Diabetes
    mother || Anxiety
    mother || Cancer
    grandmother || Cancer

      Sweet, works perfectly!! What does the code below mean:

      q{}

      Is that q for quoting? If so, what does the {} do? Thanks for everyone's help!

        q{}, as mentioned in perlop, is a single-quoted string, in this case an empty one. It's the same as saying $_ ne '', and similar to $_ !~ /^$/.
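
        A small sketch of why the regex form is only "similar": /^$/ also matches a string that consists of a single newline, because $ can match just before a trailing newline, whereas the string comparison does not treat that as empty. In the script above the line has already been chomped, so the two tests behave the same there. The variable names below are illustrative only.

```perl
use strict;
use warnings;

my $empty   = q{};    # q{} builds the same empty string as ''
my $newline = "\n";   # what an unchomped blank line looks like

print "q{} eq '': ", (q{} eq '' ? "yes" : "no"), "\n";
print "length of q{}: ", length($empty), "\n";

# The string comparison sees "\n" as non-empty, but /^$/ still matches it,
# so the negated match reports false for the same value.
print 'newline ne q{}: ',   ($newline ne q{}  ? "true" : "false"), "\n";
print 'newline !~ /^$/: ',  ($newline !~ /^$/ ? "true" : "false"), "\n";
```

        The last two lines print "true" and "false" respectively, which is the one case where the two checks disagree.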