esoteric neonate has asked for the wisdom of the Perl Monks concerning the following question:

I need the two foreach loops to work together. In short, what I'm trying to do is read the contents of a file, change the case to lowercase, and then, as each line is read, remove every word in @ignore from it. So far there are two problems:

  • I need @ignore to stay intact so it removes all ignored words for every line
  • s/$_//; is giving an uninitialized error in substitution
    foreach $line (@words) {
      $line =~ tr/A-Z/a-z/;
      foreach $line (@ignore) {
        s/$_//;
      }
    }
    Re: Nested foreach
    by broquaint (Abbot) on Apr 25, 2003 at 15:34 UTC
      With a regular expression and a well placed map your nested loop can be reduced to a single line
      ## build a regular expression to match all the words in @ignore
      my $ignore = '(?:\b'. join('\b|\b', map quotemeta, @ignore). '\b)';
      ## assuming @words contains the contents of the file;
      ## map lc returns copies, so assign the result back before substituting
      @words = map lc, @words;
      s/$ignore//g for @words;
      Now it's just a matter of writing @words back to the original file. See quotemeta, join, map and lc for info on the functions used in the above snippet, perlsyn for info on the for modifier and perlre for further details of the regex used.
      HTH

      _________
      broquaint

        my $ignore = '(?:\b'. join('\b|\b', map quotemeta, @ignore). '\b)';
        That's a lot of B's. (I was making a motorboat sound just thinking about that regex. {grin})

        Wouldn't this be simpler?

        my $ignore = '\b(?:' . join("|", map "\Q$_", @ignore) . ')\b';
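A minimal sketch of how that single-alternation pattern might be applied. The contents of @ignore and @words here are made-up sample data, and the whitespace cleanup at the end is an extra step that nobody in the thread asked for.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @ignore = qw(a and the);
my @words  = ('The cat AND the hat', 'A band on the sand');

# One alternation, anchored once at each end with \b.
my $ignore = '\b(?:' . join('|', map "\Q$_", @ignore) . ')\b';
my $re     = qr/$ignore/;

for my $line (@words) {
    $line = lc $line;      # lowercase first so the pattern can match
    $line =~ s/$re//g;     # /g removes every ignored word on the line
    $line =~ s/\s+/ /g;    # collapse the whitespace left behind
    $line =~ s/^ | $//g;   # trim a leading or trailing space
}

print "$_\n" for @words;
```

Because of the \b anchors, the "a" and "and" inside "band" and "sand" are left alone.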

        -- Randal L. Schwartz, Perl hacker
        Be sure to read my standard disclaimer if this is a reply.

          Wouldn't this be simpler?
          Er, yes, yes it would ;)

          mental note to self - refactor

          _________
          broquaint

    Re: Nested foreach
    by perlplexer (Hermit) on Apr 25, 2003 at 15:35 UTC
      @ignore should stay intact because you're not modifying it. There are a couple of other problems with the code, though.
      - You're reusing $line
      - You're using $_ although it never gets initialized

      I think this is what you intended to do (untested)
      $_ = qr/\Q$_\E/ for @ignore; # precompile regexes
      foreach my $line (@words){
        $line =~ tr/A-Z/a-z/; # or lc()
        foreach my $ignore (@ignore){
          $line =~ s/$ignore//g;
        }
      }
      --perlplexer
        I'm still getting the same problem after I began using a different variable instead of $line. Why does it still refuse to ignore my words?
        my @words;
        my %search;
        my $first;
        my $second;
        my $line;
        my @ignore = qq(a and the this i me us our ok abc);
        my $file = "abc.txt";
        my $count = "0";

        open (FILE, $file) or die "Error $!";
        @words = <FILE>;
        chomp(@words);
        close FILE;
        my @search= @words;

        foreach my $line (@words) {
          $line =~ tr/A-Z/a-z/;
          foreach my $ignore (@ignore) {
            $ignore =~ s/\b$ignore\b//;
          }
        }

        foreach my $line (@words) {
          # splitting words on a white space but allowing contractions and hyphens
          while ($line =~ /([[:alpha:]]+(?:'[[:alpha:]]+)?)/g) {
            if (exists ($search{$1})) {
              $search{$1}++;
            } else {
              $search{$1}="";
              $search{$1}++;
            }
          }
        }

        while (($first,$second)=each(%search)) {
          print "$first -- $second\n";
        }
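Two things in the code above look like the likely cause, though this is a reading of the post rather than a confirmed answer from the thread. First, qq(...) builds one double-quoted string, so @ignore holds a single element; qw(...) is what splits on whitespace. Second, the substitution is bound to $ignore rather than $line, so it edits the ignore entry itself (which is also how @ignore would fail to stay intact). A small sketch, using made-up lines in place of abc.txt:

```perl
use strict;
use warnings;

# qq(...) is a double-quoted string, so this array has ONE element.
my @one_string = qq(a and the this);
# qw(...) splits on whitespace, so this array has FOUR elements.
my @ignore = qw(a and the this);

print scalar(@one_string), " vs ", scalar(@ignore), "\n";

# Hypothetical stand-in for the lines read from abc.txt.
my @words = ('This is the END', 'Me and a friend');

foreach my $line (@words) {
    $line = lc $line;
    foreach my $ignore (@ignore) {
        # Bind to $line (the text), not to $ignore (the list entry).
        $line =~ s/\b\Q$ignore\E\b//g;
    }
}

print "[$_]\n" for @words;
```

After the loop, @ignore is unchanged and the ignored words are gone from each line, although the surrounding spaces remain.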
    Re: Nested foreach
    by queue (Beadle) on Apr 25, 2003 at 15:35 UTC
      You don't need to use $_, just use $line from the first foreach and use a different variable name for the second foreach:
      foreach $line (@words) {
        $line =~ tr/A-Z/a-z/;
      
        foreach $swear (@ignore) {
          $line =~ s/$swear//;
        }
      }
      

      But there is no need to do it nested. Just use two different loops. And I don't know about the relative merits of each, but if you want to lowercase a string, you can just use the Perl function lc.
    Re: Nested foreach
    by samurai (Monk) on Apr 25, 2003 at 15:39 UTC
      Lowercasing each line is simple. $line = lc $line; tr/// works fine, too.

      Put your ignore words in a hash, like so:

      my %ignores = map { $_ => 1 } qw/bar foo baz quux/;

      Then, as you read through the file break your $lines into $word's and:

      next if $ignores{$word};

      Hashes are fast and convenient. Use them.
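A rough sketch of that hash approach put together, assuming words are separated by whitespace. The input lines and the %count hash are invented for the example.

```perl
use strict;
use warnings;

my %ignores = map { $_ => 1 } qw/bar foo baz quux/;

# Hypothetical input lines.
my @lines = ('Foo met BAR at the quux', 'baz is not bazaar');

my %count;
for my $line (@lines) {
    # split ' ' splits on runs of whitespace and skips leading blanks
    for my $word (split ' ', lc $line) {
        next if $ignores{$word};   # one hash lookup per word
        $count{$word}++;
    }
}

print "$_ -- $count{$_}\n" for sort keys %count;
```

Since the lookup is on whole words, "bazaar" is counted even though "baz" is ignored, so no \b anchors are needed.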

      --
      perl: code of the samurai

        Lowercasing each line is simple. $line = lc $line; tr/// works fine, too.

        Functionally, yes. But, lc is self-commenting. tr is not. Use the self-commenting one.
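One further difference, beyond readability, is that tr/A-Z/a-z/ only touches the 26 ASCII letters, while lc also handles other characters that Perl knows how to lowercase. A small sketch, with example strings chosen only for illustration:

```perl
use strict;
use warnings;

my $ascii = 'Hello World';
(my $via_tr = $ascii) =~ tr/A-Z/a-z/;
my $via_lc = lc $ascii;

# U+03A9 GREEK CAPITAL LETTER OMEGA; its lowercase form is U+03C9.
my $omega = "\x{3A9}";
(my $omega_tr = $omega) =~ tr/A-Z/a-z/;   # not in A-Z, so left as-is
my $omega_lc = lc $omega;                 # lowercased by lc

print "ASCII: tr and lc agree\n"  if $via_tr   eq $via_lc;
print "Omega: tr and lc differ\n" if $omega_tr ne $omega_lc;
```

For plain ASCII input the two give the same result, so for a file like the one in the question either would work.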

        ------
        We are the carpenters and bricklayers of the Information Age.

        Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

        Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

    Re: Nested foreach
    by dpuu (Chaplain) on Apr 25, 2003 at 15:43 UTC
      You want to use a different loop variable in your inner-loop; and then bind the substitution to $line:
      foreach my $line (@words) {
        $line =~ tr/A-Z/a-z/;
        foreach my $ignore (@ignore) {
          $line =~ s/\b$ignore\b//;
        }
      }
      I've also added \b assertions around the word that you're ignoring, to prevent strange modifications in the middle of words.
      --Dave
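To illustrate what the \b assertions guard against, here is a small comparison on an invented sample string. The /g flag is added here so that every occurrence is affected.

```perl
use strict;
use warnings;

my $ignore = 'the';

# Without \b the pattern also matches inside longer words.
(my $without = 'other theme the') =~ s/$ignore//g;

# With \b only the standalone word is removed.
(my $with = 'other theme the') =~ s/\b$ignore\b//g;

print "[$without]\n";
print "[$with]\n";
```

The first substitution mangles "other" and "theme", while the second leaves both intact and removes only the final standalone "the".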