cdherold has asked for the wisdom of the Perl Monks concerning the following question:

monks,

Using a foreach loop works fine, but when I try to put in a nested "if" statement the program doesn't seem to work.

I'm trying to screen through the contents of a webpage to determine if a company is listed. If it is listed, I want to pull out the information starting at the company name ($company) until the tag </tr>.

I need to determine if the company is listed before doing the extraction.

The basic code without determining if the company is listed or not before extracting works fine.

This works ...

$content = get($url);
@companies = ("AOL Time Warner", "Genetech","Broadwing");
foreach $company (@companies){
    $content =~ /$company(.*?)<\/tr>/gsmi;
    $new_coverage = $1;
    print "$new_coverage <p>"
}

BUT when I try to put the if statement within the foreach loop, it doesn't seem to work.

This doesn't work ...

$content = get($url);
@companies = ("AOL Time Warner", "Genetech","Broadwing");
foreach $company (@companies){
    if ($content =~ /$company/gsmi){
        $content =~ /$company(.*?)<\/tr>/gsmi;
        $new_coverage = $1;
        print "$new_coverage <p>";
    }
}

Any ideas on why this is happening?

thanks

Replies are listed 'Best First'.
Re: foreach loop with nested
by John M. Dlugosz (Monsignor) on Aug 02, 2001 at 02:23 UTC
    I think the /g modifier is messing you up. In scalar context, a /g match remembers where it stopped, and the next /g match against the same string resumes from that position. You never finish the iteration, so the second match looks for the next occurrence, which fails if there is only one. The if line you added "eats" the match, and the next line sees no more matches available.

    try

    if ($content =~ /$company(.*?)<\/tr>/smi) {
        $new_coverage = $1;
        # etc
    }
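
    To see the effect in isolation, here is a minimal sketch. The one-row $content string is made up for illustration and is not the real page; the variable names $first, $second and $pos_after_first are likewise hypothetical. It only shows how two scalar-context /g matches against the same string interact through pos().

```perl
use strict;
use warnings;

# Hypothetical sample data: a single table row mentioning the company once.
my $content = "<tr><td>Broadwing</td><td>upgraded</td></tr>";
my $company = "Broadwing";

# First scalar-context /g match succeeds and leaves pos() just past the name.
my $first = $content =~ /$company/gi ? 1 : 0;
my $pos_after_first = pos($content);

# Second /g match resumes from pos(), finds no further occurrence of the
# company name, fails, and resets pos().
my $second = $content =~ /$company(.*?)<\/tr>/gsi ? 1 : 0;

print "first=$first pos=$pos_after_first second=$second\n";
```

    Dropping /g from both matches (or matching only once, as in the snippet above) makes each match start from the beginning of the string again.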
      john,

      that /g modifier was the problem ... thanks very much.

      chris

Re: foreach loop with nested
by japhy (Canon) on Aug 02, 2001 at 01:37 UTC
    Why are you using the /gmsi modifiers? Do you know what any of them do? I can't be sure from your code, but I think you can remove the /g and the /m safely.
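
    For reference, a small sketch of what /i, /s and /m each change, using a made-up two-line $text (an assumption for illustration, not the poster's data). It suggests that /s is the one that matters here when the wanted text spans several lines, while /m has no effect on a pattern without ^ or $.

```perl
use strict;
use warnings;

# Hypothetical sample: the company name and the closing tag are on different lines.
my $text = "AOL Time Warner\nrated buy</tr>";

# /i makes the match case-insensitive.
my $i_ok = $text =~ /aol time warner/i ? 1 : 0;

# Without /s, "." does not match a newline, so the capture cannot span lines.
my $no_s = $text =~ /Warner(.*?)<\/tr>/ ? 1 : 0;

# With /s, "." matches a newline as well.
my $with_s = $text =~ /Warner(.*?)<\/tr>/s ? 1 : 0;

# /m only changes what ^ and $ match; this pattern uses neither.
my $with_m = $text =~ /Warner(.*?)<\/tr>/m ? 1 : 0;

print "i=$i_ok no_s=$no_s s=$with_s m=$with_m\n";
```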

    _____________________________________________________
    Jeff japhy Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: foreach loop with nested
by arturo (Vicar) on Aug 02, 2001 at 02:41 UTC

    In general, the construction

    $content =~ /foo(.*?)bar/; $new_coverage = $1;

    is cumbersome and error-prone -- if that match fails, $1 is not reset; it still holds whatever the last successful match left in it. So do it in one step -- capture the match into a variable, and see if it's defined (this version will grab each match and display it):

    while (my ($new_coverage) = ($content =~ /$company(.*?)<\/tr>/ig ) ) { print "$new_coverage\n"; }

    Update: I wrote that a little quickly. runrig pointed out, quite correctly, that this loop never terminates: the list-context match returns every capture on each pass, so the condition stays true whenever there is at least one match. What will work is

    if (my @new_coverage = ($content =~ /$company(.*?)<\/tr>/isg) ) { print "$_\n" foreach @new_coverage; }

    Props, of course, to runrig.

    I'm guessing that you're snagging a web page with news about these companies, *and* that there may be more than one content item per company (otherwise the /g modifier and the while loop don't make sense). If this is what you're doing, there might be a need to add the /s to that regex as well (because you may want . to match a newline).
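
    A runnable sketch of the two safe ways to collect several items per company, assuming a made-up $content with two rows (one of them spanning a newline, which is why /s is there). The arrays @all and @one_by_one are hypothetical names, and \Q...\E is added on the assumption that a company name might contain regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical sample: two rows for the same company, the second spanning lines.
my $content = "<tr>Broadwing: upgrade</tr>\n<tr>Broadwing:\n downgrade</tr>";
my $company = "Broadwing";

# List context with /g: one call returns every capture.
my @all = $content =~ /\Q$company\E(.*?)<\/tr>/isg;

# Scalar context with /g: each pass of the loop finds the next match,
# and the loop ends when no further match exists.
my @one_by_one;
while ($content =~ /\Q$company\E(.*?)<\/tr>/isg) {
    push @one_by_one, $1;
}

print scalar(@all), " item(s)\n";
print "[$_]\n" for @all;
```

    The while form is the one to reach for when each match needs processing as it is found; the list form is shorter when all that is wanted is the list of captures.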

    perl -e 'print "How sweet does a rose smell? "; chomp ($n = <STDIN>); $rose = "smells sweet to degree $n"; *other_name = *rose; print "$other_name\n"'
Re: foreach loop with nested
by runrig (Abbot) on Aug 02, 2001 at 03:55 UTC
    Another way (and I hope you're using strict and warnings if this is part of a larger script):

    my @companies = ("AOL Time Warner", "Genetech", "Broadwing");
    my $companies = join('|', map(quotemeta, @companies));
    # The </tr> terminator matters: a non-greedy (.*?) at the very end of a pattern matches the empty string.
    $companies = qr/($companies)(.*?)<\/tr>/is;
    while ($content =~ /$companies/g) {
        print "Company: $1 new coverage: $2\n";
    }
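
    A self-contained sketch of this single-pass alternation idea, run against made-up sample rows (the $content string and the %coverage hash are assumptions for illustration). Note that the capture needs a terminator such as </tr> after (.*?); a non-greedy group left at the end of a pattern would match the empty string.

```perl
use strict;
use warnings;

# Hypothetical sample page: two of the three companies are listed.
my $content = join "\n",
    "<tr><td>AOL Time Warner</td><td>buy</td></tr>",
    "<tr><td>Broadwing</td><td>hold</td></tr>";

my @companies = ("AOL Time Warner", "Genetech", "Broadwing");

# Escape each name, then build one alternation so the page is scanned once.
my $alternation = join '|', map { quotemeta } @companies;
my $re = qr/($alternation)(.*?)<\/tr>/is;

my %coverage;
while ($content =~ /$re/g) {
    $coverage{$1} = $2;    # $1 is the company found, $2 the text up to </tr>
    print "Company: $1 new coverage: $2\n";
}
```

    Companies that do not appear on the page simply never match, so no separate "is it listed?" test is needed.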