Properties:

    id|A/V-pairs
    -----------------
    0|colour:red
    1|colour:green
    2|material:metal
    3|colour:blue
    4|material:wood
    5|surface:rough

And then our thing db:
things:

    bitmap|thing
    012345|name
    -----------------
    101011|red-metal-wood-rough-Thing
    001101|metal-blue-rough-Thing
    01    |green-Thing
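One way to produce such bitmaps is Perl's built-in vec, which sets single bits in a string used as a bit vector. The following is only a minimal sketch: the %bit_for hash is a hand-copied version of the property table above, and encode_thing is a hypothetical helper name, not something from this thread.

```perl
use strict;
use warnings;

# Property table from above: attribute/value pair -> bit position.
# (Hand-copied for illustration; in practice this would come from the db.)
my %bit_for = (
    'colour:red'     => 0,
    'colour:green'   => 1,
    'material:metal' => 2,
    'colour:blue'    => 3,
    'material:wood'  => 4,
    'surface:rough'  => 5,
);

# Hypothetical helper: turn a list of A/V pairs into a packed bit vector.
sub encode_thing {
    my @pairs  = @_;
    my $vector = '';
    for my $pair (@pairs) {
        die "unknown property '$pair'" unless exists $bit_for{$pair};
        vec($vector, $bit_for{$pair}, 1) = 1;    # set the bit at that position
    }
    return $vector;
}

my $thing = encode_thing(qw(colour:red material:metal material:wood surface:rough));

# unpack 'b6' shows the first six bits, bit 0 first, like the bitmap column.
print unpack('b6', $thing), "\n";
```

The packed form is what gets stored and compared; the 0/1 string is only for display. Note that vec and the 'b' pack format agree on bit order (lowest bit of the first byte is position 0), which is why the printed string lines up with the ids in the property table.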
Questions:

    sub getBits {
        # lookup: colour:red    -> is id/bitposition:0
        # lookup: material:wood -> is id/bitposition:4
    }

    # $bits is 10001, as a packed bit vector (e.g. pack 'b*', '10001'),
    # not the plain string of 0/1 characters
    my $bits = getBits('red-wood');

    # http://docstore.mik.ua/orelly/perl/prog/ch03_182.htm :
    # "efficiently counts the number of set bits in a bit vector"
    my $nBits = unpack '%32b*', $bits;

    for my $straw ( @haystack ){ # loop over all records and compare
        # compute a delta: number of query bits this record shares
        my $similarity = unpack '%32b*', $straw & $bits;
        # delta in relation to nbits benchmark ("distance")
        printf "Percentage similarity %f\n", $similarity / $nBits * 100;
    }
    # then, sort by similarity
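To make the idea concrete, here is a runnable sketch of the query side using the three example things from above. It is an illustration under assumptions, not a finished design: the body of getBits, the %bit_for lookup keyed by bare values ('red', 'wood'), the %haystack hash and the rank helper are all made up for this example.

```perl
use strict;
use warnings;

# Assumed lookup: value name -> bit position (ids from the property table).
my %bit_for = (red => 0, green => 1, metal => 2, blue => 3, wood => 4, rough => 5);

# Assumed body for getBits: 'red-wood' -> packed vector with bits 0 and 4 set.
sub getBits {
    my ($query) = @_;
    my $vector = '';
    for my $value (split /-/, $query) {
        die "unknown property value '$value'" unless exists $bit_for{$value};
        vec($vector, $bit_for{$value}, 1) = 1;
    }
    return $vector;
}

# The thing db from above, stored as packed bit vectors.
my %haystack = (
    'red-metal-wood-rough-Thing' => pack('b*', '101011'),
    'metal-blue-rough-Thing'     => pack('b*', '001101'),
    'green-Thing'                => pack('b*', '010000'),
);

# Hypothetical helper: score every thing against the needle and return
# [name, percentage] pairs, best match first (ties broken by name).
sub rank {
    my ($needle, $things) = @_;
    my $nBits = unpack '%32b*', $needle;    # number of bits set in the query
    my %score;
    for my $name (keys %$things) {
        # string bitwise AND keeps only the bits present in both vectors
        my $shared = unpack '%32b*', $things->{$name} & $needle;
        $score{$name} = $nBits ? 100 * $shared / $nBits : 0;
    }
    return map { [ $_, $score{$_} ] }
           sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
}

printf "%-28s %6.1f%%\n", @$_ for rank(getBits('red-wood'), \%haystack);
```

With this data only red-metal-wood-rough-Thing has both the red and the wood bit, so it scores 100% and the other two score 0%. The measure is one-sided: it says how much of the query a thing covers and ignores any extra properties the thing has. It also still scans every record, so it does not answer the look-up question by itself.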
In reply to Re^4: What is the best way to store and look-up a "similarity vector"?
by isync
in thread What is the best way to store and look-up a "similarity vector" (correlating, similar, high-dimensional vectors)?
by isync