in reply to document clustering via link contexts
The approach that comes to mind is some sort of LSI / vector space search on the words in the text surrounding each link, then relating the URLs by how similar those contexts are. Maybe this perl.com article and the references it gives will be of help.
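To make the idea a bit more concrete, here is a rough sketch of the plain vector space half of it (no LSI step, which would add an SVD over the term matrix). It is in Python rather than Perl, and everything in it is an assumption for illustration: the `contexts` mapping, the example.com URLs, the sample text, and the particular TF-IDF weighting are all made up, not taken from the article.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(contexts):
    """Map each URL to a sparse {term: weight} vector built from its link context.

    `contexts` is a hypothetical dict of URL -> surrounding text.
    """
    counts = {url: Counter(tokenize(text)) for url, text in contexts.items()}
    n_docs = len(counts)
    # Document frequency: in how many contexts each term appears.
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    vectors = {}
    for url, term_counts in counts.items():
        vectors[url] = {
            # Smoothed IDF (one common variant, chosen here as an assumption).
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in term_counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(url, vectors):
    """Return (similarity, other_url) pairs for `url`, most similar first."""
    return sorted(
        ((cosine(vectors[url], vec), other)
         for other, vec in vectors.items() if other != url),
        reverse=True,
    )


# Made-up link contexts, purely for illustration.
contexts = {
    "http://example.com/a": "perl regular expressions tutorial and regex examples",
    "http://example.com/b": "regex tutorial with perl pattern matching examples",
    "http://example.com/c": "chocolate cake recipe with butter and sugar",
}
vectors = tfidf_vectors(contexts)
ranked = most_similar("http://example.com/a", vectors)
```

Here `ranked` lists the other URLs by how close their contexts are to the first one, so the two regex links end up related and the cake recipe falls to the bottom. Clustering would then be a matter of grouping URLs whose pairwise similarity is above some threshold you pick.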