You haven't posted any code, so we can't critique it. However, it sounds like you're having pre-coding issues - namely, design.

The basic algorithm you're looking for is:

  1. Get a list of the things you want to compare.
  2. Iterate through that list, one at a time.
  3. Within that loop, iterate through the list again. Make sure to skip the one you picked in the outer loop.
  4. Do your compare.

    my @list_of_stuff = get_my_list();

    foreach my $i (0 .. $#list_of_stuff) {
        foreach my $j (0 .. $#list_of_stuff) {
            next if $i == $j;    # don't compare an item with itself
            do_compare($list_of_stuff[$i], $list_of_stuff[$j]);
        }
    }

This algorithm will be very slow, especially if you're comparing more than 15-20 things. Remember, you're doing N * (N - 1) comparisons. So:

    Things   Comparisons
         2             2
         3             6
         4            12
         5            20
        10            90
        15           210
        20           380
        25           600
        50          2450
        75          5550
       100          9900
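If your comparison happens to be symmetric (comparing A with B tells you the same thing as comparing B with A), you can start the inner loop at $i + 1 and do half the work, N * (N - 1) / 2. Whether that holds for your data is an assumption on my part. A minimal sketch that only counts the calls; count_pairs is a made-up helper, not anything from your code:

```perl
use strict;
use warnings;

# Hypothetical helper: count how many times the comparison would run.
#   $symmetric false => every ordered pair,        N * (N - 1)
#   $symmetric true  => each unordered pair once,  N * (N - 1) / 2
sub count_pairs {
    my ($n, $symmetric) = @_;
    my $count = 0;
    foreach my $i (0 .. $n - 1) {
        # For symmetric comparisons, only look at items after $i.
        my $start = $symmetric ? $i + 1 : 0;
        foreach my $j ($start .. $n - 1) {
            next if $i == $j;
            $count++;
        }
    }
    return $count;
}

printf "%3d things: %5d ordered, %5d unordered\n",
    $_, count_pairs($_, 0), count_pairs($_, 1)
    for 2, 10, 100;
```

Halving helps, but the growth is still quadratic, so it only postpones the problem.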

It might be useful to do this kind of comparison on subsets of your data, then look at comparing typical items from each subset. So, if you could break 100 items down into 10 subsets of 10 (10 * 90 = 900 comparisons within the subsets), then compare the typical item from each subset with each other (another 90), you reduce 9900 comparisons to 990. That's a 90% savings in time - both for the computer and for you as the user. (Remember, you are the one who has to deal with these comparisons.)
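Here is a rough sketch of that subset idea. It assumes two things I can't know about your data: that the list is already ordered so consecutive items belong together, and that the first item of each subset is "typical". compare_in_subsets and all_pairs are names I made up for illustration:

```perl
use strict;
use warnings;

# Assumption: consecutive items belong in the same subset, and the
# first item of each subset stands in as its "typical" member.
sub compare_in_subsets {
    my ($items, $subset_size, $compare) = @_;

    # Split a copy of the list into consecutive subsets.
    my @work = @$items;
    my @subsets;
    push @subsets, [ splice(@work, 0, $subset_size) ] while @work;

    # Compare every ordered pair, but only inside each subset.
    all_pairs($_, $compare) for @subsets;

    # Then compare one representative per subset with the others.
    my @typical = map { $_->[0] } @subsets;
    all_pairs(\@typical, $compare);
    return;
}

# The same double loop as the basic algorithm, on an array reference.
sub all_pairs {
    my ($list, $compare) = @_;
    foreach my $i (0 .. $#$list) {
        foreach my $j (0 .. $#$list) {
            next if $i == $j;
            $compare->($list->[$i], $list->[$j]);
        }
    }
    return;
}

# Count the calls instead of doing a real comparison.
my $count = 0;
compare_in_subsets([ 1 .. 100 ], 10, sub { $count++ });
print "$count comparisons\n";    # 10 * 90 + 90 = 990
```

How you pick the subsets and the typical item is the real design work; the loops are the easy part.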

Of course, using another program to wade through the comparisons and discard the uninteresting ones can also be handy. I've done that many times. Where I work right now, we have a process that generates a set of logs. I have several scripts that do analysis on those logs and double-check the process's work. I even have a script that analyzes the results of the log analyzers. :-)

As to your second question - depending on the size of the things you're working with, you might not have enough memory to read everything in. Often, keeping those things on disk and reading them in when you want to deal with them is the appropriate thing to do. You might have to read things in over and over, trading time for memory, but that's OK.
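A minimal sketch of the keep-it-on-disk approach, assuming each thing lives in its own file and that a single file fits in memory. slurp and compare_files are hypothetical helpers, and $compare is whatever comparison you settle on:

```perl
use strict;
use warnings;

# Hypothetical helper: read one whole file into a string, on demand.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # no record separator, so <$fh> returns the whole file
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Only the file names stay in memory the whole time; at most two
# files' contents are loaded at any moment.
sub compare_files {
    my ($paths, $compare) = @_;
    foreach my $i (0 .. $#$paths) {
        my $left = slurp($paths->[$i]);    # read once per outer pass
        foreach my $j (0 .. $#$paths) {
            next if $i == $j;
            # The right-hand file is re-read on every inner pass.
            $compare->($left, slurp($paths->[$j]));
        }
    }
    return;
}
```

You would call it with something like compare_files(\@file_names, \&do_compare), where @file_names is your own list of paths.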

------
We are the carpenters and bricklayers of the Information Age.

The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.


In reply to Re: looking at everything by dragonchild
in thread looking at everything by usless_blonde
