March 5th, 2007

The most common form of search engine is built on what's called an inverted index. The concept is very simple: you take a document and break it up into tokens (although, as per my previous post, that's not as simple as you'd think). Then you give the document an id, go to the inverted index, and note that document id next to every token.
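
To make that concrete, here's a minimal sketch in Python. The names (`tokenize`, `add_document`, `index`) are mine, picked for illustration, and the tokenizer is deliberately naive (lowercase, split on whitespace), so treat it as an assumption rather than how a real engine would do it.

```python
from collections import defaultdict


def tokenize(text):
    # Naive tokenizer (an assumption for this sketch): lowercase the text
    # and split on whitespace. Real tokenization is harder than this.
    return text.lower().split()


def add_document(index, doc_id, text):
    # Note the document id next to every token found in the document.
    for token in tokenize(text):
        index[token].add(doc_id)


# The inverted index maps each token to the set of ids of documents containing it.
index = defaultdict(set)

docs = ["the quick brown fox", "the lazy dog"]
for doc_id, text in enumerate(docs):
    # Here the document id is simply the document's position in the list.
    add_document(index, doc_id, text)
```

Looking up a word is then just `index.get("fox", set())`, which gives back the ids of every document containing that token.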

