It is inefficient to rewrite your entire target file once for each "drop word". Luckily, there is a better algorithm: read your drop-words into a hash, use the hash as a lookup table, then run through the words in your 'temp.txt' file just once. Whenever a word in 'temp.txt' exists in the hash, drop that line and move on to the next. Any line that contains no drop-word gets printed to a new file. Since a hash lookup takes roughly constant time, each file is read only once no matter how many drop-words there are.

use strict;
use warnings;
use autodie;
use List::MoreUtils qw( any );

my %drop_words;

open my $words_ifh, '<', 'words.txt';

while( <$words_ifh> ) {
    $drop_words{ ( split /\s+/, $_, 2 )[0] } = 1;
}

close $words_ifh;

open my $temp_ifh, '<', 'temp.txt';
open my $result_ofh, '>', 'temp_mod.txt';

while( <$temp_ifh> ) {
    chomp;
    next if any { exists $drop_words{$_} } split /\s+/;
    print {$result_ofh} $_, "\n";
}

close $temp_ifh;
close $result_ofh;

If you would rather not depend on the non-core module List::MoreUtils, you can get the same result by changing line 21 (the "next if any { ... }" statement) to look like this:

next if defined first { exists $drop_words{$_} } split /\s+/;

...and replacing line 4 (the "use List::MoreUtils" line) with use List::Util qw(first); which is a core module.
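To make the List::Util variant easier to try out, here is a minimal sketch that runs the same test against in-memory data instead of files. The sample words and lines, and the helper name keep_line, are made up for illustration and are not part of the code above. It also shows why the "defined" is there: first returns the matching word itself, and a word such as "0" is false in boolean context.

```perl
use strict;
use warnings;
use List::Util qw( first );

# Hypothetical in-memory stand-ins for words.txt and temp.txt.
my %drop_words = map { $_ => 1 } qw( foo 0 );
my @lines = ( 'alpha beta', 'gamma foo delta', 'count 0 items', 'epsilon' );

# Hypothetical helper: true when no word on the line is a drop-word.
sub keep_line {
    my ( $line, $drop ) = @_;

    # first returns the matching word (or undef if none matched), so test
    # with defined: a matching word like "0" would otherwise look false.
    return !defined( first { exists $drop->{$_} } split ' ', $line );
}

my @kept = grep { keep_line( $_, \%drop_words ) } @lines;
print "$_\n" for @kept;
```

With this sample data only 'alpha beta' and 'epsilon' survive; 'count 0 items' is dropped solely because of the defined test. On a reasonably recent Perl there is a third option: List::Util itself exports any as of version 1.33 (bundled with Perl 5.20 and later), so use List::Util qw( any ); would let the original line 21 stay unchanged.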


Dave


In reply to Re: how to avoid opening and closing files by davido
in thread how to avoid opening and closing files by sagar_qwerty
