in reply to Match first word in line fails
    use strict;
    use warnings;

    my @badwords = qw/crud crap poo wanker/;

    while (<DATA>) {
        my $line = $_;
        my $plainline = $line;
        $plainline =~ s/["'.,!?:;\-()[\]{}|\\\/]/ /g; #replace all punctuation with a space
        my @sentence = split(/ /,$plainline);
        print join '|', @sentence,"\n";
        foreach my $word (@sentence) {
            chomp($word);
            print "\tChecking *$word*\n";
            my $whichword = 0; # to track which badword was found
            foreach my $badword (@badwords) {
                if (lc($word) eq $badword) {
                    print "\t\tFound $badword\n";
                    #my $newword = replaceword($whichword); #get a cleaner word to replace the naughty word
                    #$line =~ s/($word)/$newword/i;
                }
                $whichword++;
            }
        }
        #$cleanbook .= $line;
    }
    __DATA__
    "Wanker!" said I.
    "Crap," says Travis.
    This is horse poo. Poo, I tell you!
    "Poo upon all the crud and crap the wanker could see."
And the output of this is:
    |Wanker|||said|I|
    |
        Checking **
        Checking *Wanker*
            Found wanker
        Checking **
        Checking **
        Checking *said*
        Checking *I*
        Checking **
    |Crap|||says|Travis|
    |
        Checking **
        Checking *Crap*
            Found crap
        Checking **
        Checking **
        Checking *says*
        Checking *Travis*
        Checking **
    This|is|horse|poo||Poo||I|tell|you|
    |
        Checking *This*
        Checking *is*
        Checking *horse*
        Checking *poo*
            Found poo
        Checking **
        Checking *Poo*
            Found poo
        Checking **
        Checking *I*
        Checking *tell*
        Checking *you*
        Checking **
    |Poo|upon|all|the|crud|and|crap|the|wanker|could|see||
    |
        Checking **
        Checking *Poo*
            Found poo
        Checking *upon*
        Checking *all*
        Checking *the*
        Checking *crud*
            Found crud
        Checking *and*
        Checking *crap*
            Found crap
        Checking *the*
        Checking *wanker*
            Found wanker
        Checking *could*
        Checking *see*
        Checking **
        Checking **

Maybe the error is in your replaceword sub?
As you can see, you end up with lots of empty items in your word list: you replace each punctuation character with a space and then split on a single space, so every run of consecutive spaces produces empty "words".
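One way to avoid the empty items is to split on runs of whitespace instead of on a single space. This is only a sketch: the `words_of` helper is a hypothetical name, not something from the original code, and it reuses the same punctuation list with the opening bracket escaped for readability.

    use strict;
    use warnings;

    # Hypothetical helper (not in the original post): return the
    # non-empty words of a line, with punctuation stripped.
    sub words_of {
        my ($line) = @_;
        # Same substitution as in the code above: punctuation becomes a space.
        (my $plain = $line) =~ s/["'.,!?:;\-()\[\]{}|\\\/]/ /g;
        # split ' ' (a string holding one space, not the regex / /) is
        # special-cased by Perl: it splits on runs of whitespace and skips
        # leading whitespace, so no empty fields are returned.
        return split ' ', $plain;
    }

    my @words = words_of(qq{"Wanker!" said I.\n});
    print join('|', @words), "\n";    # prints Wanker|said|I

Since the trailing newline counts as whitespace here, the chomp inside the inner loop would no longer be needed with this approach.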
CountZero
"If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law
Re^2: Match first word in line fails
by Qiang (Friar) on Jan 31, 2005 at 02:58 UTC