in reply to Problem in counting the occurrences of a string in a text file

Hi,
It would help if you presented less data and your examples reflected the data you are using.

Some initial thoughts:
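1) The while loop around the Regex will be infinite if a match is found, because without the /g modifier the match starts again from the beginning of the string every time. I think that should read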
$conto++ if $value =~/(($key\/$key\/S)\s{0,2}(\.\*)\s{0,2}(con\/con\/E)\s{0,2}(\.\*)\s{0,2}($value\/$value\/S))/is){
2) The while, open & close statements can be improved; see below.
3) You are inconsistent with your die and close statements, perhaps because you haven't got round to tidying them up yet.
4) The con\/con\/E part of the Regex probably needs to be in a variable so you can loop through the other possibilities e.g. "dalla/da/E"
5) As your Regex is ignoring case, an alternation such as [Nn] is redundant.
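
To illustrate point 4, here is a minimal sketch of keeping the preposition in a variable and looping over the possibilities. Only "con/con/E" and "dalla/da/E" come from this thread; the sample line, the word pair and the names @preps and %count are made up for the example. It also uses m{...} with \Q...\E instead of escaping every slash by hand, which is my own choice rather than anything from the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical list of prepositions in word/lemma/tag form.
my @preps = ('con/con/E', 'dalla/da/E');

# Made-up sample line in the same word/lemma/tag format.
my $text = 'pane/pane/S con/con/E burro/burro/S';

# Made-up word pair, standing in for one entry of %hash.
my ($key, $value) = ('pane', 'burro');

my %count;
for my $prep (@preps) {
    # m{...} avoids escaping the slashes; \Q...\E quotes any regex
    # metacharacters in the interpolated strings.
    if ($text =~ m{\Q$key/$key/S\E.*\Q$prep\E.*\Q$value/$value/S\E}is) {
        $count{$prep}++;
    }
}

# Prints: con/con/E => 1
print "$_ => $count{$_}\n" for sort keys %count;
```

With real data you would probably want the count keyed on both the pair and the preposition, but that depends on what you need to report.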

The following code seems to work and incorporates some of the above points. I've also simplified the Regex as the example data works with this Regex.
#!/usr/bin/perl
use strict;
use warnings;

open( INPUT, "<Wiki_Pulito/Prova/Pattern2.txt") or die "Can't open Pattern2.txt";
open( LISTAPAROLE,"<File_Input/Coppie_Parole.txt") or die "Can't open Coppie_Parole.txt";

my %hash;
while (<INPUT>) {
    chomp;
    my ($word1, $word2) = split /:/, $_;
    $hash{$word1} = $word2;
}
close INPUT; # Load the part of the text file to be analyzed

open( CONTEGGIO, ">Wiki_Pulito/Prova/Conteggio.txt") or die "Can't open Conteggio.txt"; # Open the output file

my $conto=0;
my %arrayris;
while (my $text = <LISTAPAROLE>){
    for my $key (keys %hash){
        my $value = $hash{$key};
        if ($text =~/$key\/$key\/.*con\/con\/E.*$value\/$value\/S/is){
            $conto++;
        }
        my $arrkey=$key."-".$value;
        $arrayris{$arrkey}=$conto;
    }
}
while ( my ($k,$v) = each %arrayris ) {
    print CONTEGGIO "($k) => $v\n";
}
close LISTAPAROLE;
close CONTEGGIO;

Re^2: Problem in counting the occurrences of a string in a text file
by u671296 (Sexton) on Dec 29, 2008 at 16:42 UTC
    OK, a typo in the previous reply.

    $conto++ if $value =~/(($key\/$key\/S)\s{0,2}(\.\*)\s{0,2}(con\/con\/E)\s{0,2}(\.\*)\s{0,2}($value\/$value\/S))/is){

    should read

    $conto++ if $text =~/(($key\/$key\/S)\s{0,2}(\.\*)\s{0,2}(con\/con\/E)\s{0,2}(\.\*)\s{0,2}($value\/$value\/S))/is;
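
If the original while loop was meant to count every occurrence on a line rather than one per line (that is only my guess at the intent), the /g modifier is what makes such a loop terminate. A minimal sketch with a made-up sample line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up line containing the same pattern twice.
my $text = 'pane/pane/S con/con/E burro/burro/S e/e/C pane/pane/S con/con/E burro/burro/S';

my $conto = 0;

# In scalar context, /g resumes after the previous match on each
# iteration and returns false once there are no more matches,
# so the loop ends instead of matching the first hit forever.
$conto++ while $text =~ m{pane/pane/S\s+con/con/E\s+burro/burro/S}gi;

print "$conto\n";   # prints 2
```

Without /g the same loop would keep matching the first occurrence and never finish.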

Re^2: Problem in counting the occurrences of a string in a text file
by findtheriver (Initiate) on Dec 29, 2008 at 21:20 UTC
    Thanks for the comment.
    But if I try using your code, I get the same result as with mine. Every pair has the same number of occurrences, and that just isn't possible. I'm sorry I can't make it clearer; I just don't know how to explain it.

      I think your problem is that you only have one counter variable, which is incremented on every match, whichever pair matched.

      Consider something like this:

      #my $conto = 0;   #### REMOVED; not needed
      my %arrayris;
      while (my $text = <LISTAPAROLE>){
          for my $key (keys %hash){
              my $value = $hash{$key};
              if ($text =~/$key\/$key\/.*con\/con\/E.*$value\/$value\/S/is){
                  ### increase for each key/value pair individually
                  $arrayris{ join '-', $key, $value }++;
              }
          }
      }
      while ( my ($k,$v) = each %arrayris ) {
          print CONTEGGIO "($k) => $v\n";
      }
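
To try the per-pair counter without the input files, here is a self-contained sketch of the same idea. The word pairs and the tagged lines are made up, and the files are replaced by in-memory data, so treat it as an illustration rather than a drop-in replacement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up word pairs standing in for the pairs file (word1:word2).
my %hash = (pane => 'burro', caffe => 'latte');

# Made-up tagged lines standing in for the text file.
my @lines = (
    'pane/pane/S con/con/E burro/burro/S',
    'caffe/caffe/S con/con/E latte/latte/S',
    'il/il/R pane/pane/S con/con/E il/il/R burro/burro/S',
);

my %arrayris;
for my $text (@lines) {
    for my $key (keys %hash) {
        my $value = $hash{$key};
        # One counter per pair: only the pair that matched is incremented.
        if ($text =~ m{\Q$key/$key/\E.*con/con/E.*\Q$value/$value/S\E}is) {
            $arrayris{"$key-$value"}++;
        }
    }
}

# Prints:
# (caffe-latte) => 1
# (pane-burro) => 2
print "($_) => $arrayris{$_}\n" for sort keys %arrayris;
```

Note that pairs which never match do not appear in the output at all; initialise them to 0 first if you want them listed.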