Re: Nested foreach
by broquaint (Abbot) on Apr 25, 2003 at 15:34 UTC
## build a regular expression to match all the words in @ignore
my $ignore = '(?:\b'. join('\b|\b', map quotemeta, @ignore). '\b)';
## assuming @words contains the contents of the file
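## lowercase in place first: map returns copies, so substituting on its
## output would leave @words untouched
@words = map lc, @words;
s/$ignore//g for @words;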
Now it's just a matter of writing @words back to the original file. See quotemeta, join, map and lc for info on the functions used in the above snippet, perlsyn for info on the for modifier, and perlre for further details of the regex used.
HTH
_________ broquaint
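For anyone who wants to try the single-regex approach above end to end, here is a minimal sketch. The sample @ignore and @words contents are made up for illustration, and it assumes @words is lowercased in place before the substitution so the changes land in the array itself.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the ignore list and the file contents.
my @ignore = qw(a and the);
my @words  = ('The cat and a dog', 'Another band');

# One alternation, with \b around the whole group so only whole words match.
my $ignore = '\b(?:' . join('|', map quotemeta, @ignore) . ')\b';

# Lowercase each line in place, then strip every ignored word from it.
$_ = lc($_) for @words;
s/$ignore//g for @words;

print "[$_]\n" for @words;
```

Note that only the words themselves are removed; the whitespace around them stays, so you may want to squeeze runs of spaces afterwards.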
my $ignore = '\b(?:' . join("|", map "\Q$_", @ignore) . ')\b';
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
Re: Nested foreach
by perlplexer (Hermit) on Apr 25, 2003 at 15:35 UTC
@ignore should stay intact because you're not modifying it. There are a couple of other problems with the code, though:
- You're reusing $line
- You're using $_ although it never gets initialized
I think this is what you intended to do (untested):
$_ = qr/\Q$_\E/ for @ignore; # precompile regexes
foreach my $line (@words){
    $line =~ tr/A-Z/a-z/; # or lc()
    foreach my $ignore (@ignore){
        $line =~ s/$ignore//g;
    }
}
--perlplexer
I'm still getting the same problem after I began using a different variable instead of $line. Why does it still refuse to ignore my words?
my @words;
my %search;
my $first;
my $second;
my $line;
my @ignore = qq(a and the this i me us our ok abc);
my $file = "abc.txt";
my $count = "0";
open (FILE, $file) or die "Error $!";
@words = <FILE>;
chomp(@words);
close FILE;
my @search= @words;
foreach my $line (@words) {
    $line =~ tr/A-Z/a-z/;
    foreach my $ignore (@ignore) {
        $ignore =~ s/\b$ignore\b//;
    }
}
foreach my $line (@words) {
    # splitting words on a white space but allowing contractions and hyphens
    while ($line =~ /([[:alpha:]]+(?:'[[:alpha:]]+)?)/g) {
        if (exists ($search{$1})) {
            $search{$1}++;
        } else {
            $search{$1}="";
            $search{$1}++;
        }
    }
}
while (($first,$second)=each(%search)) {
    print "$first -- $second\n";
}
Change
$ignore =~ s/\b$ignore\b//;
to
$line =~ s/\b$ignore\b//g;
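One more thing that looks worth checking in the posted script: @ignore is built with qq(...), which is just a double-quoted string, so the array ends up holding a single element containing the whole phrase. qw(...) splits on whitespace and gives one element per word. A quick demonstration (the names @as_string and @as_list are only for illustration):

```perl
use strict;
use warnings;

# qq() is double-quote interpolation: one string, so one array element.
my @as_string = qq(a and the this);

# qw() splits on whitespace: one element per word.
my @as_list = qw(a and the this);

printf "qq gives %d element(s), qw gives %d\n",
    scalar @as_string, scalar @as_list;
```

With the qq version the inner loop runs once and looks for the literal phrase "a and the this ...", which is unlikely to appear in the file.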
Re: Nested foreach
by queue (Beadle) on Apr 25, 2003 at 15:35 UTC
You don't need to use $_, just use $line from the first foreach and use a different variable name for the second foreach:
foreach my $line (@words) {
    $line =~ tr/A-Z/a-z/;
    foreach my $swear (@ignore) {
        $line =~ s/$swear//;
    }
}
But there is no need to do it nested. Just use two different loops. And I don't know about the relative merits of each, but if you want to lowercase a string, you can just use the Perl function lc.
Re: Nested foreach
by samurai (Monk) on Apr 25, 2003 at 15:39 UTC
Lowercasing each line is simple. $line = lc $line; tr/// works fine, too.
Put your ignore words in a hash, like so:
my %ignores = map { $_ => 1 } qw/bar foo baz quux/;
Then, as you read through the file, break each $line into words and, for each $word:
next if $ignores{$word};
Hashes are fast and convenient. Use them.
--
perl: code of the samurai
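The hash lookup described above could be combined with the word counting from the posted script roughly like this. It is only a sketch: the ignore list and the contents of @lines are invented sample data, and the word pattern is borrowed from the original script.

```perl
use strict;
use warnings;

# Hypothetical ignore list and input lines, for illustration only.
my %ignores = map { $_ => 1 } qw(a and the);
my @lines   = ('The cat and the hat', "A cat's hat");

my %count;
for my $line (@lines) {
    # Same word pattern as the posted script: letters plus an optional contraction.
    for my $word (lc($line) =~ /([[:alpha:]]+(?:'[[:alpha:]]+)?)/g) {
        next if $ignores{$word};   # skip ignored words
        $count{$word}++;           # autovivifies, so no exists() check is needed
    }
}

print "$_ -- $count{$_}\n" for sort keys %count;
```

Since the ignored words are skipped at counting time, nothing has to be deleted from the lines themselves.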
Lowercasing each line is simple. $line = lc $line; tr/// works fine, too.
Functionally, yes. But, lc is self-commenting. tr is not. Use the self-commenting one.
------ We are the carpenters and bricklayers of the Information Age. Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement. Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
Re: Nested foreach
by dpuu (Chaplain) on Apr 25, 2003 at 15:43 UTC
You want to use a different loop variable in your inner-loop; and then bind the substitution to $line:
foreach my $line (@words) {
    $line =~ tr/A-Z/a-z/;
    foreach my $ignore (@ignore) {
        $line =~ s/\b$ignore\b//;
    }
}
I've also added \b assertions around the word that you're ignoring, to prevent strange modifications in the middle of words.
--Dave
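To see what the \b assertions buy you, here is a small comparison. The sample string and the variable names $loose and $bounded are made up for the example.

```perl
use strict;
use warnings;

my $ignore = 'the';

# Without \b the pattern also matches inside longer words.
(my $loose = 'the other one') =~ s/$ignore//g;

# With \b only the standalone word is removed.
(my $bounded = 'the other one') =~ s/\b$ignore\b//g;

print "[$loose]\n";     # prints "[ or one]"
print "[$bounded]\n";   # prints "[ other one]"
```

The unanchored version turns "other" into "or", which is the kind of strange mid-word modification the assertions prevent.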