"17 comes before 12 or just a typo? "

typo! they are always ordered.

"If true, the first thing I would do is write a short program to do a single pass over your 2e9 silly format files and output a single file formatted like so:"

yes it is already in some "nicely" formated style but for the purposes of visualization i used histograms (i thought it vould be easier to understand the problem).

"And got completely lost in the number of columns and ranges of values for each column..."

so each histogram can have up to 300 columns. let say each column label is a number, then 300 does not imply that columns are labeled from 1 to 300 but the label range is 1-8000. from those 8000 labels each histogram has at most 300 different labels. (# of different ways you can pick 300 out of 8000 at most) the size of the column does not have a maximum value.

i hope i clarified the problem a bit.

cheers


In reply to Re^2: Similarity searching by baxy77bax
in thread Similarity searching by baxy77bax

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.