If I understand your goal and data, here's a method that uses two hashes: one to keep track of how many different "Obj1" things there are, and the other to keep track of which "Obj1" things each "Obj2" thing occurs with. It handles all three categories of "Obj2" things (common to both Obj1 things, unique to each), but it requires that the input be pre-sorted by Obj1, because the "found in all" test matches the Obj1 names in sorted order:
use strict;

my %obj1found;    # how many times each Obj1 thing was seen
my %obj2found;    # for each Obj2 thing, the Obj1 things it occurs with (each padded with spaces)

while (<DATA>) {
    if ( / HIT \s+ (\S+ \s+ [\d.]+) \s+ (\S+ \s+ [\d.]+) /x ) {
        my ( $obj1, $obj2 ) = ( $1, $2 );
        $obj1found{ $obj1 }++;
        # append " $obj1 " only if it is not already recorded for this Obj2
        $obj2found{ $obj2 } .= " $obj1 "
            unless ( $obj2found{ $obj2 } =~ / \Q$obj1\E / );
    }
}

my $match_all = join( '  ', sort keys %obj1found );  # note: two spaces between elements

print join( "\t\n", "\nList of Obj2 things found in all Obj1's:",
            grep { $obj2found{$_} =~ /\Q$match_all\E/ } sort keys %obj2found ), "\n";

for my $obj1 ( sort keys %obj1found ) {
    print join( "\t\n", "\nList of Obj2 things found only in $obj1:",
                grep { $obj2found{$_} =~ /^ \Q$obj1\E $/ } sort keys %obj2found ), "\n";
}

__DATA__
HIT object1 563.43.78 object3 123.89.7777
HIT object1 563.43.78 object10 123.89.7777
HIT object1 563.43.78 object2 453.78.122
HIT object1 563.43.78 object5 457.8888.1
HIT object1 563.43.78 object4 123.89.7777
HIT object1 563.43.78 object6 566.2222.11
HIT object2 563.43.78 object3 123.89.7777
HIT object2 563.43.78 object7 456.222.1111
HIT object2 563.43.78 object8 990.7777.66
HIT object2 563.43.78 object5 457.8888.1
HIT object2 563.43.78 object13 123.89.7777
HIT object2 563.43.78 object9 1223.333.111

This approach would generalize to any number of distinct "Obj1" things. If you have more than two, you might want to look at groupings other than "found in all Obj1 things" and "found only in a single Obj1 thing" -- that's "left as an exercise..."
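
One possible way to approach that exercise, offered only as a sketch: instead of building a space-delimited string per Obj2 thing, keep a hash of hashes and group each Obj2 thing by the exact set of Obj1 things it appears with. The sub name `group_by_obj1_set` and the `%seen` / `%groups` hashes are hypothetical names made up for this illustration, not part of the code above; the sample lines are a subset of the data above. Because the group key is built from sorted hash keys, this variant should not depend on the order of the input.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original post):
# returns a hash ref mapping "sorted list of Obj1 things" => [ Obj2 things ].
sub group_by_obj1_set {
    my (@lines) = @_;

    my %seen;    # $seen{$obj2}{$obj1} = 1 for every pair found in the input
    for my $line (@lines) {
        if ( $line =~ / HIT \s+ (\S+ \s+ [\d.]+) \s+ (\S+ \s+ [\d.]+) /x ) {
            my ( $obj1, $obj2 ) = ( $1, $2 );
            $seen{$obj2}{$obj1} = 1;
        }
    }

    my %groups;  # key: Obj1 things joined with ", "; value: array ref of Obj2 things
    for my $obj2 ( sort keys %seen ) {
        my $key = join ', ', sort keys %{ $seen{$obj2} };
        push @{ $groups{$key} }, $obj2;
    }
    return \%groups;
}

# A few lines taken from the data above
my @sample = (
    'HIT object1 563.43.78 object3 123.89.7777',
    'HIT object1 563.43.78 object10 123.89.7777',
    'HIT object1 563.43.78 object5 457.8888.1',
    'HIT object2 563.43.78 object3 123.89.7777',
    'HIT object2 563.43.78 object7 456.222.1111',
    'HIT object2 563.43.78 object5 457.8888.1',
);

my $groups = group_by_obj1_set(@sample);
for my $key ( sort keys %$groups ) {
    print "\nList of Obj2 things found in exactly [$key]:\n";
    print "\t$_\n" for @{ $groups->{$key} };
}
```

With three or more Obj1 things, every combination that actually occurs gets its own list, so "found in all" and "found only in one" become just two of the possible groups.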


In reply to Re: How to check for duplicate entries by graff
in thread How to check for duplicate entries by Angharad
