I hadn't been able to figure out why the shuffling helped you, either. Before Eily's post, I was leaning toward suggesting that you compare your monotonically increasing ramp of weights with a monotonically decreasing ramp, to see whether collisions were affected. I would then have recommended trying lists of weights that were half ramp and half random. Actually, three sets of half-ramps: #1: first half increasing; #2: middle half increasing (a quarter random on each side); #3: final half increasing. Compare all of those to one of your fully shuffled weight sets. If you saw that the down-ramp and the half-ramps all had worse collisions than your shuffle, then I would suggest trying a mutator that gets rid of rampy segments: if it found a sequence of weights that was strictly increasing (or generally increasing, with occasional excursions), it would pick new random primes for that section or just re-shuffle it.
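A minimal sketch of that experiment, in Python for brevity (the thread itself is about Perl). Everything named here is an assumption made for illustration: the signature is taken to be a plain weighted sum of per-position values, the weights are the first n primes, `half_ramp` is one reading of "half ramp + half random", and `reshuffle_ramps` only handles strictly increasing runs, not the "generally increasing" case. The toy item set stands in for the real 736000 combinations.

```python
import random
from itertools import combinations_with_replacement

def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def half_ramp(ramp, start, rng):
    """Keep half of the ramp in place, beginning at index start; shuffle the rest around it."""
    half = len(ramp) // 2
    kept = ramp[start:start + half]
    rest = ramp[:start] + ramp[start + half:]
    rng.shuffle(rest)
    return rest[:start] + kept + rest[start:]

def weight_variants(n, seed=1):
    """Build the six weight lists to compare; all are permutations of the same primes."""
    rng = random.Random(seed)
    up = first_primes(n)
    shuffled = up[:]
    rng.shuffle(shuffled)
    return {
        "ramp_up": up,
        "ramp_down": up[::-1],
        "half_first": half_ramp(up, 0, rng),
        "half_middle": half_ramp(up, n // 4, rng),
        "half_last": half_ramp(up, n - n // 2, rng),
        "shuffled": shuffled,
    }

def collisions(weights, items):
    """Number of items whose weighted-sum signature duplicates an earlier one."""
    sigs = {sum(w * v for w, v in zip(weights, item)) for item in items}
    return len(items) - len(sigs)

def reshuffle_ramps(weights, min_run, rng):
    """Hypothetical mutator: shuffle every strictly increasing run of at least min_run weights."""
    out = list(weights)
    start = 0
    for i in range(1, len(out) + 1):
        # A run ends at the end of the list or where the next weight is not larger.
        if i == len(out) or out[i] <= out[i - 1]:
            if i - start >= min_run:
                segment = out[start:i]
                rng.shuffle(segment)
                out[start:i] = segment
            start = i
    return out

if __name__ == "__main__":
    # Toy ordered item set: sorted tuples of 8 small values (an assumption, not the real data).
    items = list(combinations_with_replacement(range(6), 8))
    for name, weights in weight_variants(8).items():
        print(name, collisions(weights, items))
```

The toy items are sorted tuples on purpose: over a fully symmetric item set, every permutation of the same weights would produce the same set of signatures, so weight order could not matter at all.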

Also, I have found that histograms can hide some critical information. With your ordered set of 736000 combinations from the first example, you might want to compare the time series of the generated signatures, probably for multiple weight lists (ramp up, ramp down, the three half-ramps, and one or two completely random lists), to see if any patterns jump out and give you a hint for what your mutator could do.
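One way to look at the signatures as a time series rather than a histogram, sketched in Python under the same assumption that a signature is a weighted sum of per-position values. The function names are hypothetical. The idea is to keep the signatures in generation order, then look at step-to-step differences and at how far back each collision's previous occurrence lies, both of which a histogram throws away.

```python
def signature_series(weights, items):
    """Signatures in the order the items were generated (not sorted or binned)."""
    return [sum(w * v for w, v in zip(weights, item)) for item in items]

def deltas(series):
    """Difference between each signature and the one generated just before it."""
    return [b - a for a, b in zip(series, series[1:])]

def repeat_gaps(series):
    """For each repeated signature, the distance (in items) back to its previous occurrence."""
    last_seen = {}
    gaps = []
    for index, sig in enumerate(series):
        if sig in last_seen:
            gaps.append(index - last_seen[sig])
        last_seen[sig] = index
    return gaps
```

Plotting `deltas` or `repeat_gaps` for each weight list side by side might show, for example, whether collisions cluster between neighbouring items or recur at a regular period.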

Finally, as a last thought: as Eily said, sums of primes don't approach uniqueness. However, products of primes do: a product of primes is unique, since every integer has exactly one prime factorization. I spent some time while I couldn't sleep last night trying to come up with a reasonable way to use that, since you'd obviously quickly overrun your 64-bit integer with a product of K=1000 primes. :-) I think I came up with something (after looking up a few primey facts today) that would be a product of two primes.
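A small Python illustration of both points: two different sets of primes can share a sum but never a product, and a running product of even the smallest primes stops fitting in an unsigned 64-bit integer very quickly. The helper names are hypothetical, and this does not attempt to reproduce the two-prime scheme mentioned above, which the post does not spell out.

```python
from math import prod

def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def count_primes_fitting(bits=64):
    """How many of the smallest primes can be multiplied together and still fit in `bits` bits."""
    limit = 1 << bits
    product, count = 1, 0
    # `bits` primes are more than enough: each prime at least doubles the product.
    for p in first_primes(bits):
        if product * p >= limit:
            break
        product *= p
        count += 1
    return count

if __name__ == "__main__":
    a, b = (5, 13), (7, 11)
    print(sum(a), sum(b))    # the sums collide
    print(prod(a), prod(b))  # the products differ
    print(count_primes_fitting(64))
```

Note that a product is order-independent, so on its own it identifies which primes were used, not the positions they appeared in.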


In reply to Re^3: An optimal solution to finding weights for a weighted sum signature. by pryrt
in thread An optimal solution to finding weights for a weighted sum signature. by BrowserUk
