Hi Monks,
I am new to Perl. I am doing a school project in which I'm creating a simple search engine. I give it a query (a string of words), it searches a list of files, and it displays the name of the best-matching file. The basic plan is:
1) Preprocess the files
2) Cluster the documents
3) Create the term-document matrix
4) Search
I was able to write the preprocessing and clustering modules, but I am confused about the term-document matrix. Should I create a separate array for each term, or should I use a 2-D array? And how do I search for terms in the array? (The document that contains the most query terms should be displayed.)
And is there any better way to search than using a term-document matrix?
P.S. This is a pretty small project, so I don't need highly efficient search techniques; any easy ones would do.
Thank you
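
One possible way to approach the matrix and the search, as a minimal sketch rather than a definitive answer: store the term-document matrix as a hash of hashes instead of one array per term or a 2-D array, so that looking up a term is a single hash access. The sample documents, the `%tdm` name, and the `best_match` helper are all made up for illustration, and the sketch assumes the preprocessing step has already reduced each file to whitespace-separated words.

```perl
use strict;
use warnings;

# Hypothetical sample data: file name => preprocessed text.
my %docs = (
    'a.txt' => 'perl search engine project',
    'b.txt' => 'document clustering in perl',
    'c.txt' => 'cooking recipes',
);

# Term-document matrix as a hash of hashes:
# $tdm{term}{file} = number of times the term occurs in that file.
my %tdm;
for my $doc (keys %docs) {
    $tdm{ lc($_) }{$doc}++ for split ' ', $docs{$doc};
}

# Return the file that contains the most distinct query terms
# (ties broken alphabetically by file name), or undef if no term matches.
sub best_match {
    my ($query) = @_;
    my (%score, %seen);
    for my $term (grep { !$seen{$_}++ } map { lc($_) } split ' ', $query) {
        next unless exists $tdm{$term};          # term occurs in no file
        $score{$_}++ for keys %{ $tdm{$term} };  # one point per matching file
    }
    my ($best) = sort { $score{$b} <=> $score{$a} or $a cmp $b } keys %score;
    return $best;
}

print best_match('perl document clustering') // 'no match', "\n";   # prints b.txt
```

Only the terms that actually occur are stored, so the structure stays small even when most terms appear in just a few files. The counts kept in `%tdm` are not used by this scoring, but they would allow weighting by term frequency later if plain term matching turns out to be too crude.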
In reply to Term document matrix for search engine by Anonymous Monk