Some algorithms use a list of stopwords: words so frequent that they would poison any index you build for cataloguing or searching the web page. Typical examples are "the", "a", "who", ... Which I find very unfair if you are a fan of The Who!
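To make the complaint concrete, here is a minimal sketch of stopword filtering. The `STOPWORDS` set and the `index_terms` helper are made up for illustration and are not taken from any particular search engine or module; real stopword lists are much longer.

```python
# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "who", "of", "and"}

def index_terms(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [word for word in text.lower().split() if word not in STOPWORDS]

print(index_terms("The history of rock and roll"))  # ['history', 'rock', 'roll']
print(index_terms("The Who"))                       # []
```

With a filter like this, a page about The Who contributes no terms at all for the band's name, so a search for it has nothing to match.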
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
In reply to Re: Parsing HTML question by CountZero, in thread Parsing HTML question by vit