in reply to Re: objects and duplicates
in thread objects and duplicates

Wfsp and stiller, thank you both. It worked. I now use
for $record (@records) {
    $duplicates{$_}++ for $record->src;
}

to store each sentence in a hash, with the number of times it appears. This is great.

I still need to change the $record->duplicate of each object to the number of times the sentence appears. Do I have to write a new loop for this, or can we do it at the same time we count the duplicate sentences?

I was thinking of something like this:

for $record (@records) {
    $duplicates{$_}++ for $record->src;
    $record->duplicate++;
}

Thank you

Re^3: objects and duplicates
by pc88mxer (Vicar) on Apr 27, 2008 at 18:50 UTC
    Does $record->src return a string or a list? If it returns a string you'll want:
    my %count;
    for my $record (@records) {
        if ($count{$record->src}++) {
            $record->duplicate(1);   # or however you set the duplicate flag
        }
    }
    Note: if a string is duplicated, the first Entry object with that src value will not have its duplicate flag set, but all later Entry objects with the same src value will.
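
To make the note above concrete, here is a minimal sketch. The Entry class in it is a hypothetical stand-in (the real class is not shown in the thread), assumed to have a src reader and a duplicate accessor that takes an optional value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real Entry class, which the thread does not show.
package Entry;
sub new       { my ($class, %args) = @_; return bless { src => $args{src}, duplicate => 0 }, $class }
sub src       { $_[0]{src} }
sub duplicate { my $self = shift; $self->{duplicate} = shift if @_; return $self->{duplicate} }

package main;

my @records = map { Entry->new(src => $_) } ('a cat', 'a dog', 'a cat', 'a cat');

my %count;
for my $record (@records) {
    # Post-increment: the test sees the count *before* this record is added,
    # so it is false for the first occurrence and true for every later one.
    $record->duplicate(1) if $count{ $record->src }++;
}

my @flags = map { $_->duplicate } @records;
print "@flags\n";   # prints "0 0 1 1"
```

The first 'a cat' keeps a flag of 0; only the second and third are marked.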

      Hello,

      $record->src does return a string. Your code works perfectly fine, thank you.

      I didn't express myself clearly: I said "flag" for $record->duplicate, but in fact it should hold the number of times the sentence is duplicated, so it's more a count than a flag.

      I still think it's possible to achieve this, but how?

      Thank you

        I didn't express myself clearly: I said "flag" for $record->duplicate, but in fact it should hold the number of times the sentence is duplicated, so it's more a count than a flag.
        In that case your object approach isn't modeling the problem correctly - storing the number of duplicates in the Entry object doesn't make sense. Your original approach is better, but I'd do it this way:
        my %count;
        for (@items) { $count{$_}++ }
        my @dups = grep { $count{$_} > 1 } @items;
        or
        my (%count, @dups);
        for (@items) { $count{$_}++ == 1 && push(@dups, $_) }
        In each case $count{$sentence} contains the number of occurrences of $sentence. Will this work for you?
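
The two variants above build the same counts but not the same @dups list. A small sketch with made-up sample data (the variable names @dups_all, %seen and @dups_once are only for illustration) shows the difference:

```perl
use strict;
use warnings;

my @items = ('a cat', 'a dog', 'a cat', 'a cat');

# Variant 1: count everything first, then keep every item whose count exceeds 1.
# A sentence that occurs three times therefore shows up three times.
my %count;
$count{$_}++ for @items;
my @dups_all = grep { $count{$_} > 1 } @items;

# Variant 2: single pass; push only when the count goes from 1 to 2,
# so each duplicated sentence is listed exactly once.
my (%seen, @dups_once);
for (@items) {
    push @dups_once, $_ if $seen{$_}++ == 1;
}

print scalar(@dups_all), "\n";    # prints "3"
print "@dups_once\n";             # prints "a cat"
```

Which one fits depends on whether you want every duplicated occurrence or one entry per duplicated sentence.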
Re^3: objects and duplicates
by stiller (Friar) on Apr 27, 2008 at 18:56 UTC
    Just jot down the pseudocode, e.g.:
    • read the file, each line into a hash, incrementing the number of occurrences of that sentence.
    • create an object from each sentence, for which I already know the number of occurrences...
    • and so on
    hth
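
The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the lines come from an in-memory list rather than a real file, and the Entry class is a hypothetical stand-in whose constructor accepts src and duplicate. The final count for a sentence is only known once every line has been seen, which is why counting and object creation are two separate passes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real Entry class, which the thread does not show.
package Entry;
sub new       { my ($class, %args) = @_; return bless {%args}, $class }
sub src       { $_[0]{src} }
sub duplicate { $_[0]{duplicate} }

package main;

# Stand-in for lines read from the file.
my @lines = ("a cat\n", "a dog\n", "a cat\n");
chomp @lines;

# Pass 1: count how often each sentence occurs.
my %count;
$count{$_}++ for @lines;

# Pass 2: create one object per sentence, with the count already known.
my @records = map { Entry->new(src => $_, duplicate => $count{$_}) } @lines;

printf "%d x %s\n", $_->duplicate, $_->src for @records;
```

With the sample lines, both 'a cat' records end up with a count of 2 and the 'a dog' record with a count of 1.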