Re: search text file
by Anonymous Monk on Jul 26, 2011 at 09:59 UTC
Re: search text file
by Anonymous Monk on Jul 26, 2011 at 09:52 UTC
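#!/usr/bin/perl
use strict;
use warnings;

# Read the search strings, one per line.
open(my $domainlist, '<', 'domainlist') or die "Cannot open 'domainlist': $!";
chomp(my @list = <$domainlist>);
close $domainlist;

open(my $result, '<', 'result') or die "Cannot open 'result': $!";
my $count = 0;
for my $domain (@list) {
    # Rewind 'result' so every domain is checked against the whole file,
    # not only the first one.
    seek($result, 0, 0) or die "Cannot rewind 'result': $!";
    while (my $line = <$result>) {
        # \Q...\E matches the domain literally (a '.' is not a wildcard).
        if ($line =~ /\Q$domain\E/) {
            ++$count;
            print "$domain\n";
        }
    }
}
close $result;
print "$count\n";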
#!/usr/bin/perl
use strict;
use warnings;

open(my $domainlist, '<', 'domainlist') or die "Cannot open 'domainlist': $!";
chomp(my @list = <$domainlist>);
close $domainlist;

open(my $result, '<', 'result') or die "Cannot open 'result': $!";
# Declared outside the loop so the total is not reset on every line.
my $count = 0;
while (my $line = <$result>) {
    for my $domain (@list) {
        # \Q...\E quotes regex metacharacters such as '.' in the domain.
        if ($line =~ /\Q$domain\E/) {
            ++$count;
            print "$domain\n";
        }
    }
}
close $result;
print "$count\n";
#!/usr/bin/perl
use warnings;
use strict;
open DOMAINLIST, '<', 'domainlist' or die "Cannot open 'domainlist' because: $!";
chomp( my @list = <DOMAINLIST> );
close DOMAINLIST;
open RESULT, '<', 'result' or die "Cannot open 'result' because: $!";
my $count = 0;    # total number of matches; must be declared under 'use strict'
while ( my $line = <RESULT> ) {
    for my $domain ( @list ) {
        # Count every occurrence on the line; \Q...\E matches the domain literally.
        ++$count while $line =~ /\Q$domain\E/g;
    }
}
close RESULT;
print "$count\n";
#!/usr/bin/perl
use warnings;
use strict;
open DOMAINLIST, '<', 'domainlist' or die "Cannot open 'domainlist' because: $!";
chomp( my @list = <DOMAINLIST> );
close DOMAINLIST;
open RESULT, '<', 'result' or die "Cannot open 'result' because: $!";
# Hash holding the number of occurrences of each domain.
my %count;
while ( my $line = <RESULT> ) {
    foreach my $domain ( @list ) {
        # Loop until the match fails so that every occurrence on the line is
        # counted and pos($line) is reset before the next domain is tried.
        $count{$domain}++ while $line =~ /\Q$domain\E/g;
    }
}
close RESULT;
foreach my $domain ( sort keys %count ) {
    print "$domain=$count{$domain}\n";
}
Re: search text file
by ambrus (Abbot) on Jul 27, 2011 at 10:26 UTC
This shell command almost works, but not quite: it actually counts the number of lines each string matches, so if a string can occur more than once in a line you'll get a wrong answer.
( while read; do grep -cFe "$REPLY" secondfile; done ) < firstfile
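To make the difference between counting lines and counting occurrences concrete, here is a small sketch. The sample file and its contents are made up for illustration, and `-o` is a GNU/BSD grep extension rather than something POSIX guarantees.

```shell
#!/bin/sh
# Hypothetical sample data: "foo" occurs three times, spread over two lines.
tmp=$(mktemp)
printf 'foo bar foo\nbaz foo\nqux\n' > "$tmp"

# grep -c counts matching lines, not individual matches.
lines=$(grep -cF -e 'foo' "$tmp")

# grep -o prints every match on its own line, so wc -l counts occurrences.
# tr strips the padding that some wc implementations add.
occurrences=$(grep -oF -e 'foo' "$tmp" | wc -l | tr -d ' ')

echo "matching lines: $lines"
echo "occurrences:    $occurrences"
rm -f "$tmp"
```

With this sample input the first figure is 2 and the second is 3, which is the gap described above.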
grep -oF -f firstfile secondfile | sort | uniq -c
Ah, good idea using grep -o. That does indeed find multiple matches in a single line.
However, it won't work correctly if some of the matches overlap. E.g., if the second file has abcdef and the first file has the two strings abcd and cdef, grep will only find the abcd part. As a workaround, you could run grep once for each string in the first file. Thus, we get (I think)
( while read; do grep -oFe "$REPLY" secondfile | wc -l; done ) < firstfile
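The overlap case can be reproduced with throwaway files. This is only a sketch: the temporary directory and the `pattern` variable are assumptions made for the demonstration (the command above uses `$REPLY`), and the pattern is printed next to each count so the output is easier to read.

```shell
#!/bin/sh
# Hypothetical sample files reproducing the abcdef / abcd / cdef case.
dir=$(mktemp -d)
printf 'abcd\ncdef\n' > "$dir/firstfile"
printf 'abcdef\n' > "$dir/secondfile"

# One grep with all patterns: reported matches never overlap, so after
# "abcd" is consumed only "ef" is left and "cdef" is missed.
combined=$(grep -oF -f "$dir/firstfile" "$dir/secondfile" | wc -l | tr -d ' ')

# One grep per pattern: each string is counted independently of the others.
per_pattern=$(
    while IFS= read -r pattern; do
        n=$(grep -oF -e "$pattern" "$dir/secondfile" | wc -l | tr -d ' ')
        printf '%s %s\n' "$pattern" "$n"
    done < "$dir/firstfile"
)

echo "combined matches: $combined"
echo "$per_pattern"
rm -rf "$dir"
```

The combined run reports a single match, while the per-pattern loop reports one match each for abcd and cdef. Occurrences of a single string that overlap with themselves (for example aa in aaa) would still be counted only once by either approach.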