PerlMonks  

Comparing string to array elements

by R56 (Sexton)
on Jan 11, 2017 at 18:02 UTC ( [id://1179411]=perlquestion )

R56 has asked for the wisdom of the Perl Monks concerning the following question:

Hey all! Hoping someone can help me solve a little problem.

I'm parsing a block of free text in a file, and I have a pre-loaded array of specific terms.

When I parse that block of text into a string, I want to cross-reference each word in that string with the terms in the array and print any word that matches.

if($text =~ /upperboundary(.+)lowerboundary/s){
    if(grep {$_ eq $1} @terms){
        print OUT "$1\t";
    }
}

This doesn't work, and I searched for a while on how to do it, to no avail. Can someone point me in a better direction? Thanks in advance!

Replies are listed 'Best First'.
Re: Comparing string to array elements
by LanX (Saint) on Jan 11, 2017 at 18:11 UTC
    do you have test data?

    the code looks good to me,

    try to show a minimal working (and intended) example reading text from DATA and having @terms populated.

    personally I would code it with something like

    $clause= join "|", @terms;
    print $1 while m/UP($clause)LB/g;

    (untested)

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

      Thank you for the reply Rolf! I'm guessing I should've put all the code instead of a little snippet, as it isn't very long.

      use strict;
      use warnings;
      use File::Slurp;

      my $countfile = 1;

      open (IN, 'C:\terms.txt');
      my @terms = <IN>;
      close IN;

      open (OUT,'>>C:\result.txt');

      while ($countfile < 10000) {
          my $text = read_file('C:\file' . $countfile . '.xml') or die "Can't open file!";
          if($text =~ /upperboundary(.+?)lowerboundary/s){
              if(grep {$_ eq $1} @terms) {
                  print OUT "$1\t";
              }
          }
          close IN;
          $countfile++;
          open (IN, 'C:\file' . $countfile . '.xml') or die "End of files!";
      }
      close OUT;

      I can see the array populated if I print it, but there must be some other thing that's destroying the output at the middle...
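
One likely culprit, assuming the terms file holds one term per line: lines read with `<IN>` keep their trailing newline, so `eq` against a bare word never succeeds. A minimal sketch of the fix follows. The in-memory string standing in for the terms file and the helper name `load_terms` are hypothetical, used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the terms file: one term per line.
my $terms_data = "maple\noak\npine\n";

# Read the terms from an in-memory filehandle and strip trailing newlines.
sub load_terms {
    my ($data) = @_;
    open my $in, '<', \$data or die "Can't open terms: $!";
    my @terms = <$in>;
    close $in;
    chomp @terms;    # without this, each term still ends in "\n"
    return @terms;
}

my @terms = load_terms($terms_data);
my $word  = 'oak';
print "found $word\n" if grep { $_ eq $word } @terms;
```

With a real file, the same `chomp @terms;` line would go right after the array is filled.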

        Ok, I was overlooking something. I was comparing a whole block of free text to an array of terms, so no wonder it didn't return anything.

        Could the solution be to split that free text into an array, compare it to the terms I already have, and print the matches?

        my @adv = split /\s+/, $1;

        How would I check whether each word in @adv matches one in @terms, and print it if it does?
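
One common way to do this comparison is to load the terms into a hash and test each word against it. The sketch below is only an illustration: the helper name `matching_words` and the sample data are hypothetical, matching is exact and case-sensitive, and punctuation attached to a word is not stripped.

```perl
use strict;
use warnings;

# Return the words of $text that also appear in the term list,
# in order of first appearance, each reported once.
sub matching_words {
    my ($text, @terms) = @_;
    my %is_term = map { $_ => 1 } @terms;    # hash gives constant-time lookups
    my %seen;
    return grep { $is_term{$_} && !$seen{$_}++ } split ' ', $text;
}

my @terms = qw(oak pine maple);
my $block = "tall pine and old oak near another pine";
print join("\t", matching_words($block, @terms)), "\n";
```

The hash avoids scanning the whole term list once per word, which matters when thousands of files are processed.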

Re: Comparing string to array elements
by LanX (Saint) on Jan 11, 2017 at 18:17 UTC
    tested
    @terms = qw/.A. .C./;
    my $clause= join "|", map quotemeta ,@terms;

    while (my $line = <DATA>){
        print $1 while $line =~m/UP($clause)LB/g;
    }

    __DATA__
    UP.A.LB
    UP.B.LB
    UP.C.LB

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

Re: Comparing string to array elements
by Lotus1 (Vicar) on Jan 11, 2017 at 21:52 UTC

    I noticed you used the s modifier in your match and that you seem to be trying to match across the whole file.

    This example might not be the most efficient way, but it is a start.

    use warnings;
    use strict;

    my @terms = qw( Maple Oak walnut pine juniper );
    my $text;

    {
        #slurp the whole file
        local $/;
        $text = <DATA>;
    }

    if ( $text =~ /upperboundary(.+)lowerboundary/s ) {
        $text = $1;
        print $text, "\n\n";
    }
    else {
        die "no match";
    }

    foreach my $term (@terms) {
        if ( $text =~ /($term)/is ) {
            print "$1\n";
        }
    }

    __DATA__
    upperboundary
    The book Hiking the Red, A Complete Trail Guide To Kentucky's Red River Gorge
    made it a point to describe the habitat and the diverse species of trees
    that one is likely to encounter at the Red. According to the authors, the species
    of trees found in the Daniel Boone National Forest includes beech, sugar
    maples, white pines, hemlock, several types of oak, and hickory. These trees
    provide habitat for an estimated 67 different species of reptiles and
    amphibians, 46 species of mammals, and 100 species of birds.
    lowerboundary

    Edit: The text is from http://www.redrivergorge.com/dbnf.html.
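
As a possible variation on the per-term loop, the terms can be joined into a single alternation, as in the quotemeta reply earlier in the thread, so the text is scanned once. This is a sketch under assumptions: the helper name `terms_regex` and the shortened sample text are hypothetical, and like the loop it matches substrings (so "pine" is found inside "pines").

```perl
use strict;
use warnings;

# Build one case-insensitive pattern from the terms; quotemeta escapes any
# regex metacharacters a term might contain.
sub terms_regex {
    my @terms = @_;
    my $alternation = join '|', map { quotemeta } @terms;
    return qr/($alternation)/i;
}

my @terms = qw( Maple Oak walnut pine juniper );
my $text  = "sugar maples, white pines, hemlock, several types of oak";
my $re    = terms_regex(@terms);

# In list context with /g, the match returns every captured term.
my @found = $text =~ /$re/g;
print "$_\n" for @found;
```

Unlike the loop, this reports matches in the order they occur in the text and reports a term each time it appears.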

      Hi Lotus,

      I used the s modifier because I needed the .+ to match newlines too. Is that the way to go?

        Yes, I used the s modifier in both of the regexes in my example. I changed my node a little to make it clear that I was saying my example might not be the most efficient. I wasn't referring to the use of the s modifier.
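
To illustrate the point about the s modifier, here is a small sketch with made-up sample text: without /s the dot does not match a newline, so a capture between boundaries on different lines fails, while with /s it succeeds.

```perl
use strict;
use warnings;

# Hypothetical sample: the boundaries sit on different lines.
my $text = "upperboundary\nfirst line\nsecond line\nlowerboundary";

# Without /s, "." stops at a newline, so the boundaries are never bridged.
my $without_s = $text =~ /upperboundary(.+)lowerboundary/  ? 1 : 0;

# With /s, "." also matches "\n" and the capture spans all the lines.
my $with_s    = $text =~ /upperboundary(.+)lowerboundary/s ? 1 : 0;

print "without /s: $without_s, with /s: $with_s\n";
```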

Node Type: perlquestion [id://1179411]
Approved by toolic