Search engine resources

ongoing · On Search: Basic Basics

If you're interested in search and retrieval, here are some resources.

Managing Gigabytes: the best and most comprehensive of the bunch. It should explain to you why XML is a bad choice for storing search details in a centralized database (though maybe a good choice for parallel search across the web).

Finding Out About: the natural successor to Salton, start here.

Online book: Information Retrieval: old but covers all the basics.

The authors of Managing Gigabytes built a production-quality index and retrieval tool that you can use. Other stuff that works and is usable:

Lucene: one classy piece of software

Lupy: a port of Lucene to Python, promising but incomplete.

JXTA Search: where the late Gene Kan's distributed search engine ended up.

June 21, 2003 09:40 AM

