I have information about two sets of domains: those known to be of a certain category, and those suspected to be of that category. I compare each suspected domain's information against that of each known domain. I'm checking registrant and administrative contact details: company name, contact name, address, phone number, and email.
If XYZ.com's registrant's address is the same as ABC.com's registrant's address, then $match{'ABC.com'}{'XYZ.com'}{address} |= 1. If XYZ.com's admin's address is the same as ABC.com's admin's address, then $match{'ABC.com'}{'XYZ.com'}{address} |= 2. The same goes for the other fields. Thus, I have a hash that looks like:
%match = (
    'ABC.com' => {
        'XYZ.com' => {
            company => 1, # reg. only
            contact => 0, # no match
            address => 3, # reg. and admin.
            phone   => 2, # admin. only
            email   => 2, # admin. only
        },
        ...,
    },
    ...
);
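A minimal sketch of how flags like these could be built and read back, assuming each domain's details are stored as a hash with registrant and admin records. The record layout, the sample values, and the helper names compare_records and describe are illustrative assumptions, not the actual code behind the hash above.

```perl
use strict;
use warnings;

# Bit flags as described above: 1 = registrant matched, 2 = admin matched.
use constant { REG => 1, ADMIN => 2 };

my @fields = qw(company contact address phone email);

# Hypothetical sample records; real data would come from WHOIS lookups.
my %known = (
    registrant => { address => '1 Main St', phone => '555-0100' },
    admin      => { address => '1 Main St', phone => '555-0199' },
);
my %suspect = (
    registrant => { address => '1 Main St', phone => '555-0123' },
    admin      => { address => '1 Main St', phone => '555-0199' },
);

# Build the per-field bitmask for one known/suspect pair.
sub compare_records {
    my ($known, $suspect) = @_;
    my %flags;
    for my $field (@fields) {
        $flags{$field} = 0;
        for my $role ([registrant => REG], [admin => ADMIN]) {
            my ($name, $bit) = @$role;
            my $k = $known->{$name}{$field};
            my $s = $suspect->{$name}{$field};
            # Missing values never count as a match.
            $flags{$field} |= $bit
                if defined $k && defined $s && $k eq $s;
        }
    }
    return \%flags;
}

# Turn a bitmask back into the labels used in the comments above.
sub describe {
    my ($flag) = @_;
    return 'no match'        if $flag == 0;
    return 'reg. and admin.' if $flag == (REG | ADMIN);
    return $flag & REG ? 'reg. only' : 'admin. only';
}

my %match;
$match{'ABC.com'}{'XYZ.com'} = compare_records(\%known, \%suspect);

for my $field (@fields) {
    my $flag = $match{'ABC.com'}{'XYZ.com'}{$field};
    printf "%-8s => %d  # %s\n", $field, $flag, describe($flag);
}
```

With the sample records, address ends up as 3 (both roles match) and phone as 2 (admin only); the fields missing from the sample data stay at 0.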
I want to come up with a logical metric that lets me sort ABC.com's suspected linked domains from most likely connection to least likely. I don't want to sort merely by the number of non-zero fields, I don't want to multiply the values, and I'm not sure adding them makes sense either. I'm not sure weighting comes into play; the fields have all been error checked, so there's no (truly) bogus data in them. (Some bogus data is there, but it's uniform, which is a good thing. Never mind.)
Can someone enlighten me?