in reply to search text file

I have written this code, but it doesn't work.

#!/usr/bin/perl -w
use strict;
my $domain;
open(DOMAINLIST,'<domainlist');
my @list=<DOMAINLIST>;
my $i=0;
my $count=0;
open(RESULT,'<result');
while($i<scalar(@list)){
    $domain=$list[$i];
    chomp $domain;
    while(<RESULT>){
        for (my $line=$_){
            chomp $line;
            if ($line=~/$domain/) {
                ++$count;
                print $domain;
            }
        }
    }
    ++$i;
}
print "$count";
close RESULT;

Re^2: search text file
by Anonymous Monk on Jul 26, 2011 at 10:08 UTC
    • You have your loops reversed. It should be
      while (<RESULT>) {
          ...   # check each line against @list
      }
    • open can fail, so use autodie.
    • You chomp $domain but forgot to chomp @list.
    • You should use eq in combination with lc to compare domains, or use quotemeta on $domain (or the equivalent /\Q$domain\E/), since regular expressions aren't simple strings; they're a mini-language.
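
To illustrate the last point, here is a minimal sketch of how an unescaped domain can match more than intended. The sample domain and line are made up for the example and do not come from the original files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original files.
my $domain = 'example.com';
my $line   = 'GET http://exampleXcom/index.html';

# Unescaped: "." is a regex metacharacter and matches any character.
my $raw    = $line =~ /$domain/     ? 1 : 0;

# Escaped with \Q...\E: "." only matches a literal dot.
my $quoted = $line =~ /\Q$domain\E/ ? 1 : 0;

# Plain string comparison, made case-insensitive with lc.
my $equal  = lc('Example.COM') eq lc($domain) ? 1 : 0;

print "raw=$raw quoted=$quoted equal=$equal\n";
```

This prints raw=1 quoted=0 equal=1: the unescaped pattern accepts "exampleXcom", while the quoted one does not.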

      Thanks for your help. I changed my code, but it still doesn't work and I get the wrong result. It seems this code doesn't search the whole file for each of my domains.

      #!/usr/bin/perl -w
      use strict;
      open(DOMAINLIST,'<domainlist') or die,$!;
      my @list=<DOMAINLIST>;
      chomp @list;
      open(RESULT,'<result') or die,$!;
      while(<RESULT>){
          my $domain;
          my $i=0;
          my $count=0;
          for (my $line=$_){
              chomp $line;
              while($i<scalar(@list)){
                  $domain=$list[$i];
                  chomp $domain;
                  if (/\Q$domain\E/) {
                      ++$count;
                  }
                  print "$domain\n";
                  print "$count\n";
                  ++$i;
              }
          }
      }
      close RESULT;

        Think about what you are doing. For every line (the outer loop), you check every domain (the inner loop) and count it if you find it in that line. You have only one counter ($count), and you reset it in the outer loop. That means your counter only counts occurrences in one line (then it gets reset), and it counts all domains together (since there is only one counter).

        You have two options:

        The not-really-good option is to reverse the order of the loops again (like you had it in the beginning) and search the result file for every domain one after another. The disadvantage of that method is that you have to reread that result file again and again (for each domain). If the result file is large your simple script will run for minutes or hours and put a lot of strain on your hard disk.

        The better option is to change your second script so that you have more than one counter. In Perl this is usually done with a hash. Here is the adapted script (I also corrected your use of die):

        #!/usr/bin/perl -w
        use strict;
        my %count;
        open(DOMAINLIST,'<domainlist') or die $!;
        my @list=<DOMAINLIST>;
        chomp @list;
        open(RESULT,'<result') or die $!;
        while(<RESULT>){
            my $domain;
            my $i=0;
            for (my $line=$_){
                chomp $line;
                while($i<scalar(@list)){
                    $domain=$list[$i];
                    chomp $domain;
                    if (/\Q$domain\E/) {
                        $count{$domain}++;
                    }
                    ++$i;
                }
            }
        }
        close RESULT;
        foreach my $domain (keys %count) {
            print "$domain\n";
            print "$count{$domain}\n";
        }

        Note that your script still has limitations. Every domain is only counted once per line of the result file.
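
A small sketch of that limitation, using a made-up line and domain. It compares a single if test with counting every match in the line; the list-assignment counting idiom shown here is one possible approach, not something the scripts above use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line and domain, for illustration only.
my $line   = 'a.example.org b.example.org c.example.net';
my $domain = 'example.org';

# A single test adds at most 1 per line, however often the domain appears.
my $once = 0;
$once++ if $line =~ /\Q$domain\E/;

# Assigning the /g match to an empty list and then to a scalar
# yields the number of matches found in the line.
my $all = () = $line =~ /\Q$domain\E/g;

print "once=$once all=$all\n";
```

For this sample line the output is once=1 all=2.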

        PS: You should indent your scripts. Makes them much more readable.

Re^2: search text file
by jwkrahn (Abbot) on Jul 26, 2011 at 11:15 UTC

    Try it like this:

    #!/usr/bin/perl
    use warnings;
    use strict;
    open DOMAINLIST, '<', 'domainlist' or die "Cannot open 'domainlist' because: $!";
    chomp( my @list = <DOMAINLIST> );
    close DOMAINLIST;
    open RESULT, '<', 'result' or die "Cannot open 'result' because: $!";
    my $count = 0;    # must be declared, or the script will not compile under strict
    while ( my $line = <RESULT> ) {
        for my $domain ( @list ) {
            ++$count while $line =~ /$domain/g;
        }
    }
    close RESULT;
    print "$count\n";

      Thank you for your help. I solved it as follows:

      #!/usr/bin/perl
      use warnings;
      use strict;
      open DOMAINLIST, '<', 'domainlist' or die "Cannot open 'domainlist' because: $!";
      chomp( my @list = <DOMAINLIST> );
      close DOMAINLIST;
      open RESULT, '<', 'result' or die "Cannot open 'result' because: $!";
      my $domain;
      # define a hash to count each domain
      my %count;
      while ( my $line = <RESULT> ) {
          foreach $domain ( @list ) {
              if ($line =~ /(\Q$domain\E)/g){
                  $count{$1}++;
              }
          }
      }
      close RESULT;
      foreach $domain(keys %count){
          print"$domain=$count{$domain}\n";
      }

        If you use if ($line =~ /(\Q$domain\E)/g){ you will only count one match of $domain per $line. If you want every match of $domain in each $line, then you need to use a while loop:

        my %count;
        while ( my $line = <RESULT> ) {
            foreach my $domain ( @list ) {
                while ( $line =~ /(\Q$domain\E)/g ) {
                    $count{ $1 }++;
                }
            }
        }

        Update: Or better yet, you don't really need capturing parentheses:

        my %count;
        while ( my $line = <RESULT> ) {
            foreach my $domain ( @list ) {
                while ( $line =~ /\Q$domain\E/g ) {
                    $count{ $domain }++;
                }
            }
        }

      I want the count of each domain in @list separately. Thank you.