in reply to How do I read in a document, remove the stop words and then write the result to a new file?
The getStopWords function returns a reference to a hash whose keys are the stop words. Dereference it, then run one s///gi substitution per key over the whole document to remove every stop word it contains:
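    use strict;
    use warnings;
    use Lingua::StopWords qw(getStopWords);

    {
        # Slurp the whole document into one string.
        open my $infile, '<', 'fulltext.txt' or die "Cannot read fulltext.txt: $!";
        my $fulltext = do { local $/; <$infile> };

        # Remove each stop word, together with any spaces before it.
        $fulltext =~ s/ *\b\Q$_\E\b//gi for keys %{ getStopWords('en') };

        open my $outfile, '>', 'nostopwords.txt' or die "Cannot write nostopwords.txt: $!";
        print $outfile $fulltext;
    }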
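As a variation on the loop above, here is a minimal sketch that joins the stop words into one alternation, so the text is scanned once instead of once per word. The short @stop_words list is a hypothetical stand-in for keys %{ getStopWords('en') } so the sketch runs without the module, and strip_stop_words is just an illustrative name, not part of Lingua::StopWords.

```perl
use strict;
use warnings;

# Hypothetical stand-in for keys %{ getStopWords('en') }.
my @stop_words = qw(the a of and is);

# Try longer words first so a word is never cut short by a shorter
# alternative that is its prefix; quotemeta escapes regex metacharacters.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @stop_words;
my $stop_re = qr/\b(?:$alternation)\b/i;

# Illustrative helper: remove every stop word plus any spaces before it.
sub strip_stop_words {
    my ($text) = @_;
    $text =~ s/ *$stop_re//g;
    return $text;
}

print strip_stop_words("The cat is king of the hill and a friend"), "\n";
```

For the real list, replacing @stop_words with keys %{ getStopWords('en') } should be the only change needed.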
The file handles are lexical variables (my) declared inside the bare block, so there's no need to close them explicitly: each file is closed automatically when its handle goes out of scope at the end of the block.
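One caveat with relying on the implicit close: its return value is never checked, so an error that only shows up when the output buffer is flushed can go unnoticed. If that matters for your data, a sketch along these lines closes the output handle explicitly and checks the result. The write_text helper and the temporary directory are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write into a scratch directory so the sketch leaves nothing behind.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/nostopwords.txt";

# Illustrative helper: write $text to $file, dying on any I/O failure,
# including one reported only when the buffer is flushed at close.
sub write_text {
    my ($file, $text) = @_;
    open my $out, '>', $file or die "Cannot open $file: $!";
    print {$out} $text       or die "Cannot write to $file: $!";
    close $out               or die "Cannot close $file: $!";
    return;
}

write_text($path, "cat king hill\n");
```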
Hope this helps!