Let me describe a quick way of doing this...

0. assumption - file 1 is never modified; words are only ever deleted from file 2;

1. read the first file into a hash table, having each word as the hash key;

2. create a third file;

3. while scanning the second file, check the hash table built in step 1 for the existence of each word;
if the word exists, do not print it to the third file;
if the word does not exist, print it to the third file;

4. replace file 2 with the third file.

#!/usr/bin/perl -w
use strict;
use IO::File;

# Step 1: load every word from file 1 into a hash for fast lookup.
my %hash = ();
my $f = IO::File->new("file1.txt", "r") or die "can not open file 1";
while (my $line = <$f>) {
    chomp $line;
    for my $word (split /\s*,\s*/, $line) {
        $hash{$word}++;
    }
}

# Steps 2 and 3: copy file 2 to file 3, skipping words seen in file 1.
my $f3 = IO::File->new("file3.txt", "w") or die "can not create file 3";
my $f2 = IO::File->new("file2.txt", "r") or die "can not open file 2";
while (my $line = <$f2>) {
    chomp $line;
    my @words = ();
    for my $word (split /\s*,\s*/, $line) {
        if (! exists $hash{$word}) {
            push @words, $word;
        }
    }
    # Write the surviving words once per input line.
    print $f3 join(",", @words), "\n";
}

# Close all three file handles.
undef $f; undef $f2; undef $f3;

# then replace file 2 with file 3...
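
Step 4 is only a comment in the script above. Here is a minimal sketch of one way to finish it, assuming both files live on the same filesystem so that Perl's built-in rename can move file 3 over file 2. The helper name replace_file is hypothetical and not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post):
# move the filtered file over the file it was built from.
sub replace_file {
    my ($filtered, $target) = @_;

    # rename overwrites $target if it exists; it fails (returns false)
    # if the two paths are on different filesystems.
    rename $filtered, $target
        or die "can not rename $filtered to $target: $!";
    return 1;
}

# Only act when called with two file names, e.g.
#   perl replace.pl file3.txt file2.txt
replace_file(@ARGV) if @ARGV == 2;
```

Call it only after the file 3 handle has been closed, so that all buffered output is on disk before the rename.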

In reply to Re: comparing and deleting some words from file by Roger
in thread comparing and deleting some words from file by perlbeginner10
