perlbeginner10 has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys, I am a beginner in Perl. Please help me with this problem: I want to compare two (similar) files, and if some words appear in both files, delete them from one of the files. For example, File 1 contains: cancer, lung cancer, heart, abdomen, stomach... and File 2 contains: Jim, John, abdomen, Jack... I want to delete abdomen from File 2, and keep scanning for other words in the files. Thanks.

Replies are listed 'Best First'.
Re: comparing and deleting some words from file
by Roger (Parson) on Nov 09, 2005 at 07:42 UTC
    Let me describe a quick way of doing this...

    0. assumption - you will never modify file 1, because you will only delete from file 2;

    1. read the first file into a hash table, having each word as the hash key;

    2. create a third file;

    3. while scanning the second file, check the hash table built in step 1 for the existence of each word;
    if the word exists, do not print it to the third file;
    if the word does not exist, print it to the third file;

    4. replace file 2 with the third file.
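As a small in-memory illustration of steps 1 and 3, here is a sketch that uses the sample words from the question in place of real files. The variable names (@file1_words, @file2_words, %seen, @kept) are made up for this example and are not part of the reply's own script.

```perl
use strict;
use warnings;

# Sample data taken from the question (no real files involved).
my @file1_words = ("cancer", "lung cancer", "heart", "abdomen", "stomach");
my @file2_words = ("Jim", "John", "abdomen", "Jack");

# Step 1: every word of file 1 becomes a hash key.
my %seen = map { $_ => 1 } @file1_words;

# Step 3: keep only the words of file 2 that are not keys of %seen.
my @kept = grep { !exists $seen{$_} } @file2_words;

print join(",", @kept), "\n";   # prints "Jim,John,Jack"
```

Each lookup in %seen takes roughly constant time, so file 2 is scanned only once no matter how many words file 1 contains.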

    #!/usr/bin/perl -w
    use strict;
    use IO::File;

    # Step 1: read file 1 and store every word as a hash key
    my %hash = ();
    my $f = IO::File->new("file1.txt", "r") or die "can not open file 1";
    while (my $line = <$f>) {
        chomp $line;
        for my $word (split /\s*,\s*/, $line) {
            $hash{$word}++;
        }
    }

    # Steps 2 and 3: copy file 2 to file 3, skipping words found in file 1
    my $f3 = IO::File->new("file3.txt", "w") or die "can not create file 3";
    my $f2 = IO::File->new("file2.txt", "r") or die "can not open file 2";
    while (my $line = <$f2>) {
        chomp $line;
        my @words = ();
        for my $word (split /\s*,\s*/, $line) {
            if (! exists $hash{$word}) {
                push @words, $word;
            }
        }
        # write the surviving words of this line once the inner loop is done
        print $f3 join(",", @words), "\n";
    }

    # undef closes the file handles
    undef $f;
    undef $f2;
    undef $f3;

    # then replace file 2 with file 3...
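Step 4 is left open in the script above. One possible way to do it is Perl's built-in rename, sketched below. The helpers replace_file and write_file are hypothetical names introduced for this example, and the demo runs in a scratch directory so that no real files are touched. rename behaves differently across systems (for example, it usually does not work across file system boundaries), so treat this as a sketch, not a definitive solution.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: create a small file with the given text.
sub write_file {
    my ($path, $text) = @_;
    open(my $fh, ">", $path) or die "cannot write $path: $!";
    print $fh $text;
    close($fh) or die "cannot close $path: $!";
}

# Hypothetical helper: move $new over $old, so $old now has $new's contents.
sub replace_file {
    my ($new, $old) = @_;
    rename $new, $old or die "cannot replace $old with $new: $!";
    return 1;
}

# Demo in a temporary directory that is removed when the script exits.
my $dir   = tempdir(CLEANUP => 1);
my $file2 = File::Spec->catfile($dir, "file2.txt");
my $file3 = File::Spec->catfile($dir, "file3.txt");

write_file($file2, "Jim,John,abdomen,Jack\n");   # original file 2
write_file($file3, "Jim,John,Jack\n");           # filtered copy
replace_file($file3, $file2);                    # file 2 now holds the filtered words
```

Writing to a third file first and renaming afterwards means file 2 is only replaced once the filtered copy has been written completely.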
Re: comparing and deleting some words from file
by Amar (Sexton) on Nov 09, 2005 at 10:26 UTC
    Hi,
    Assumption: the words are separated by commas only.

    The code below is kept basic, since the asker is a Perl beginner.
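    #!c:/perl/bin/perl.exe
    use strict;

    my ($file1, $file2, $file1_contents, $file2_contents,
        @file1_words, @file2_words, $word, $file2_new_contents);

    $file1 = "C:/file1.txt";
    $file2 = "C:/file2.txt";

    open(FH1, "<$file1") || die "$!\n";
    open(FH2, "<$file2") || die "$!\n";

    # Read file 1 and split it into words on commas
    $file1_contents .= $_ while (<FH1>);
    $file1_contents =~ s/^\s+|\s+$//g;   # trim the ends only, so "lung cancer" stays intact
    @file1_words = split(/\s*,\s*/, $file1_contents);

    # Read file 2 and split it on commas (whitespace is kept for the output)
    $file2_contents .= $_ while (<FH2>);
    @file2_words = split(/,/, $file2_contents);

    close(FH1);
    close(FH2);

    # Drop every word of file 2 that also appears in file 1;
    # \Q...\E makes the word match literally
    foreach $word (@file1_words) {
        @file2_words = grep { !/^\s*\Q$word\E\s*$/ } @file2_words;
    }

    # Join the remaining words with commas (no trailing comma)
    $file2_new_contents = join(",", @file2_words);

    open(FH2, ">$file2") || die "$!\n";
    print FH2 $file2_new_contents;
    close(FH2);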
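One point to watch with the grep approach: a word interpolated into a regex is read as a pattern, so characters such as "." or "+" take on their regex meaning unless the word is quoted with \Q...\E. The short sketch below uses made-up sample words (not from the original files) to show the difference.

```perl
use strict;
use warnings;

# Made-up sample data for illustration only.
my @words = ("Jim", "a.c", "abc", "Jack");
my $word  = "a.c";

# Unquoted: "." matches any character, so "abc" is removed as well.
my @unquoted = grep { !/^\s*$word\s*$/ } @words;

# Quoted: \Q...\E makes "." literal, so only "a.c" is removed.
my @quoted = grep { !/^\s*\Q$word\E\s*$/ } @words;

print join(",", @unquoted), "\n";   # prints "Jim,Jack"
print join(",", @quoted), "\n";     # prints "Jim,abc,Jack"
```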

    Hope it is useful
    amar