Re: document clustering via link contexts

The approach that comes to mind is some sort of LSI / vector space search over the words in the text surrounding each link, relating the URLs to one another by how similar those contexts are. This perl.com article and the references it gives may be of help.
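
To make the idea a bit more concrete, here is a rough sketch of the plain vector space half of it (no LSI, which would add an SVD step on top to reduce the term space). It is written in Python just to keep it short; the same thing is straightforward in Perl with hashes. The URLs, the context strings, and the helper names (term_vector, cosine, most_similar) are all made up for illustration, and a real version would want stop-word removal and some term weighting such as TF-IDF.

```python
import math
import re
from collections import Counter

# Hypothetical input: each URL mapped to the text found around links to it.
contexts = {
    "http://example.com/a": "perl regex tutorial for parsing text",
    "http://example.com/b": "tutorial on parsing text with perl regex",
    "http://example.com/c": "cheap flights and hotel deals",
}

def term_vector(text):
    """Bag-of-words term counts for one link context."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# One vector per URL, built from its surrounding text.
vectors = {url: term_vector(text) for url, text in contexts.items()}

def most_similar(url):
    """Return (score, other_url) for the URL whose context is closest."""
    scores = [
        (cosine(vectors[url], vec), other)
        for other, vec in vectors.items()
        if other != url
    ]
    return max(scores)

if __name__ == "__main__":
    for url in contexts:
        score, other = most_similar(url)
        print(f"{url} -> {other} ({score:.2f})")
```

From there, clustering is a matter of grouping URLs whose pairwise similarity is above some threshold, or feeding the similarity matrix to whatever clustering routine you prefer.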
