japhy has asked for the wisdom of the Perl Monks concerning the following question:
If XYZ.com's registrant's address is the same as ABC.com's registrant's address, then $match{'ABC.com'}{'XYZ.com'}{address} |= 1. If XYZ.com's admin's address is the same as ABC.com's admin's address, then $match{'ABC.com'}{'XYZ.com'}{address} |= 2. So on and so forth. Thus, I have a hash that looks like:

    %match = (
      'ABC.com' => {
        'XYZ.com' => {
          company => 1,  # reg. only
          contact => 0,  # no match
          address => 3,  # reg. and admin.
          phone   => 2,  # admin. only
          email   => 2,  # admin. only
        },
        ...,
      },
      ...
    );

I want to come up with a logical metric that will allow me to sort through ABC.com's suspected linked domains in order of most likely connection to least likely. I don't want to sort merely by the number of non-zero shared fields, but I don't want to multiply, and I'm not sure adding makes sense either. I'm not sure weighting comes into play; the fields have all been error checked, so there's no (truly) bogus data in them. (Some bogus data is there, but it's uniform, which is a good thing. Never mind.)
Can someone enlighten me?
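To make the question concrete, here is a minimal sketch of how such a hash could be ranked. It is not an answer taken from the thread, and the metric is only an assumption for illustration: each linked domain gets a two-part score (number of fields with any match, then total number of registrant/admin matches across all fields), and the list is sorted on those two parts in turn. The names `bits_set`, `score`, and `ranked` are hypothetical, and the 'DEF.com' entry is invented sample data so that there is something to sort.

```perl
use strict;
use warnings;

# Bit flags as described in the post: 1 = registrant match, 2 = admin match.
# 'XYZ.com' is copied from the post; 'DEF.com' is made-up sample data.
my %match = (
    'ABC.com' => {
        'XYZ.com' => { company => 1, contact => 0, address => 3, phone => 2, email => 2 },
        'DEF.com' => { company => 0, contact => 0, address => 1, phone => 0, email => 1 },
    },
);

# Count the set bits in a small numeric flag value.
sub bits_set {
    my ($flags) = @_;
    my $n = 0;
    for (; $flags; $flags >>= 1) { $n += $flags & 1 }
    return $n;
}

# Hypothetical score: [fields with any match, total reg./admin. matches].
sub score {
    my ($fields) = @_;
    my ($matched, $total) = (0, 0);
    for my $flags (values %$fields) {
        $matched++ if $flags;
        $total += bits_set($flags);
    }
    return [$matched, $total];
}

# Linked domains for $domain, most likely connection first.
# Ties on both score parts fall back to alphabetical order.
sub ranked {
    my ($domain) = @_;
    my $linked = $match{$domain};
    my %s = map { $_ => score($linked->{$_}) } keys %$linked;
    return sort {
        $s{$b}[0] <=> $s{$a}[0] or $s{$b}[1] <=> $s{$a}[1] or $a cmp $b
    } keys %s;
}

print join(", ", ranked('ABC.com')), "\n";
```

With the sample data above, 'XYZ.com' scores [4, 5] and 'DEF.com' scores [2, 2], so 'XYZ.com' is listed first. Keeping the two parts separate rather than adding them into one number avoids deciding up front how much a double (reg. and admin.) match is worth relative to a match in another field; whether that ordering reflects real-world likelihood is exactly the open question.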
Replies are listed 'Best First'.
Re: Metric for confidence of complex match
by samtregar (Abbot) on Oct 27, 2005 at 21:37 UTC
Re: Metric for confidence of complex match
by BrowserUk (Patriarch) on Oct 27, 2005 at 23:54 UTC
by BrowserUk (Patriarch) on Oct 28, 2005 at 05:34 UTC
Re: Metric for confidence of complex match
by GrandFather (Saint) on Oct 27, 2005 at 21:20 UTC
Re: Metric for confidence of complex match
by Util (Priest) on Oct 27, 2005 at 22:17 UTC
Re: Metric for confidence of complex match
by GrandFather (Saint) on Oct 27, 2005 at 21:50 UTC
Re: Metric for confidence of complex match
by sauoq (Abbot) on Oct 27, 2005 at 22:42 UTC
by japhy (Canon) on Oct 27, 2005 at 23:03 UTC