Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I want to search for the strings listed in one file within another text file. The first file has just one string on each line, as a list, and I want to find the count of each string in the second file. Thank you. afshin

Replies are listed 'Best First'.
Re: search text file
by Anonymous Monk on Jul 26, 2011 at 09:59 UTC
Re: search text file
by Anonymous Monk on Jul 26, 2011 at 09:52 UTC

    I have written this code, but it doesn't work.

    #!/usr/bin/perl -w
    use strict;
    my $domain;
    open(DOMAINLIST,'<domainlist');
    my @list=<DOMAINLIST>;
    my $i=0;
    my $count=0;
    open(RESULT,'<result');
    while($i<scalar(@list)){
        $domain=$list[$i];
        chomp $domain;
        while(<RESULT>){
            for (my $line=$_){
                chomp $line;
                if ($line=~/$domain/) {
                    ++$count;
                    print $domain;
                }
            }
        }
        ++$i;
    }
    print "$count";
    close RESULT;
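      • You have your loops reversed. The inner while(<RESULT>) reads the file to the end for the first domain, so every later domain sees no lines. It should be
        while(<RESULT>){ ... # check against @list }
      • open can fail, so use autodie
      • You chomp $domain on every pass through the loop; it is simpler to chomp @list once, right after reading it
      • You should use eq in combination with lc to compare domains, or use quotemeta on $domain (or the equivalent /\Q$domain\E/), since regular expressions aren't simple strings, they're a mini-language. For example, an unescaped . matches any character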
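The last point is easy to check from the command line. This is only a minimal sketch with made-up sample data (the host names and the temporary file are assumptions, not from the original post); grep's -F option plays roughly the role that \Q...\E plays in a Perl pattern.

```shell
# Hypothetical sample data: one real hit and one look-alike line.
tmpdir=$(mktemp -d)
printf '%s\n' 'mail.example.com' 'mailXexample.com' > "$tmpdir/result"

# As a regular expression, each "." matches any single character,
# so the look-alike line is counted too.
regex_hits=$(grep -c 'mail.example.com' "$tmpdir/result")

# With -F the pattern is a fixed string, so only the literal text matches.
fixed_hits=$(grep -cF 'mail.example.com' "$tmpdir/result")

echo "regex: $regex_hits, fixed: $fixed_hits"
rm -rf "$tmpdir"
```

The regex count includes the look-alike line, while the fixed-string count does not.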

        Thanks for your help. I changed my code, but it still doesn't work and I get the wrong result. It seems this code doesn't search the whole file for each of my domains.

        #!/usr/bin/perl -w
        use strict;
        open(DOMAINLIST,'<domainlist') or die,$!;
        my @list=<DOMAINLIST>;
        chomp @list;
        open(RESULT,'<result') or die,$!;
        while(<RESULT>){
            my $domain;
            my $i=0;
            my $count=0;
            for (my $line=$_){
                chomp $line;
                while($i<scalar(@list)){
                    $domain=$list[$i];
                    chomp $domain;
                    if (/\Q$domain\E/) {
                        ++$count;
                    }
                    print "$domain\n";
                    print "$count\n";
                    ++$i;
                }
            }
        }
        close RESULT;

      Try it like this:

      #!/usr/bin/perl
      use warnings;
      use strict;

      open DOMAINLIST, '<', 'domainlist' or die "Cannot open 'domainlist' because: $!";
      chomp( my @list = <DOMAINLIST> );
      close DOMAINLIST;

      open RESULT, '<', 'result' or die "Cannot open 'result' because: $!";
      my $count = 0;    # must be declared under "use strict"
      while ( my $line = <RESULT> ) {
          for my $domain ( @list ) {
              # \Q...\E makes the domain match literally; /g finds every occurrence
              ++$count while $line =~ /\Q$domain\E/g;
          }
      }
      close RESULT;
      print "$count\n";

        Thank you for your help. I solved it as follows:

        #!/usr/bin/perl
        use warnings;
        use strict;

        open DOMAINLIST, '<', 'domainlist' or die "Cannot open 'domainlist' because: $!";
        chomp( my @list = <DOMAINLIST> );
        close DOMAINLIST;

        open RESULT, '<', 'result' or die "Cannot open 'result' because: $!";
        # Hash holding a separate count for each domain.
        my %count;
        while ( my $line = <RESULT> ) {
            foreach my $domain ( @list ) {
                # "while" with /g counts every occurrence on the line. An "if"
                # with /g would count at most one per line and leave pos($line)
                # set, so the next domain would be searched from mid-line.
                $count{$domain}++ while $line =~ /\Q$domain\E/g;
            }
        }
        close RESULT;

        foreach my $domain ( keys %count ) {
            print "$domain=$count{$domain}\n";
        }

        I want the count of each domain in @list separately. Thank you.

Re: search text file
by ambrus (Abbot) on Jul 27, 2011 at 10:26 UTC

    This shell command almost works, but not quite: it actually counts the number of lines each string matches, so if a string can occur more than once in a line you'll get a wrong answer.

    ( while read; do grep -cFe "$REPLY" secondfile; done ) < firstfile

      To count all the matches from the command line:

      grep -oF -f firstfile secondfile | sort | uniq -c
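To see the difference between counting matching lines (-c) and counting matches (-o), here is a small sketch. The sample strings and the temporary directory are made up for illustration; only the file names firstfile and secondfile come from the thread.

```shell
# Hypothetical sample data; the file names follow the thread.
tmpdir=$(mktemp -d)
printf '%s\n' 'foo' 'bar' > "$tmpdir/firstfile"
printf '%s\n' 'foo foo bar' 'foo baz' > "$tmpdir/secondfile"

# -c counts matching lines: "foo" appears on 2 lines.
line_count=$(grep -cF 'foo' "$tmpdir/secondfile")

# -o prints each match on its own line, so sort | uniq -c
# counts occurrences per string: "foo" occurs 3 times.
match_counts=$(grep -oF -f "$tmpdir/firstfile" "$tmpdir/secondfile" | sort | uniq -c)

echo "lines containing foo: $line_count"
echo "$match_counts"
rm -rf "$tmpdir"
```

Note that a string from firstfile that never occurs in secondfile does not show up in the uniq -c output at all, rather than being listed with a count of 0.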

        Ah, good idea using grep -o. That does indeed find multiple matches in a single line.

        That, however, won't work correctly if some of the matches overlap. E.g. if the second file has abcdef and the first file has the two strings abcd and cdef, grep will only find the abcd part. As a workaround, you could run grep once for each string in the first file. Thus, we get (I think)

        ( while read; do grep -oFe "$REPLY" secondfile | wc -l; done ) < firstfile
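The overlap case can be reproduced with the abcdef example from this reply. This is a sketch, not a definitive solution: the temporary directory, the variable names, and the "string count" output format are assumptions added for illustration.

```shell
# Sample data taken from the example above.
tmpdir=$(mktemp -d)
printf '%s\n' 'abcd' 'cdef' > "$tmpdir/firstfile"
printf '%s\n' 'abcdef' > "$tmpdir/secondfile"

# All strings in one grep: the match for abcd consumes "cd",
# so cdef is never reported.
combined=$(grep -oF -f "$tmpdir/firstfile" "$tmpdir/secondfile")

# One grep per string: each string is searched independently.
# tr strips the padding that some wc implementations add.
per_string=$(
  while read -r s; do
    n=$(grep -oFe "$s" "$tmpdir/secondfile" | wc -l | tr -d ' ')
    printf '%s %s\n' "$s" "$n"
  done < "$tmpdir/firstfile"
)

echo "combined: $combined"
echo "$per_string"
rm -rf "$tmpdir"
```

Even the per-string version reports only non-overlapping matches of a single string, so a string that overlaps with itself (for example aa in aaa) is still counted once.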