WWW FAQ: How can I search through ALL websites?



Several people have written robots that build indexes of websites, including sites that have not arranged to be listed in the newspapers and catalogs above. (Before writing your own robot, please read the entry in the authoring section regarding robots.)

Here are a few such automatic indexes you can search:

Alta Vista
(URL is <URL:http://www.altavista.digital.com> ) is probably the most powerful web searching facility at this time, with an exhaustive database and the capability to search USENET newsgroups as well as websites. The query language is also powerful.
Yahoo
(URL is <URL:http://www.yahoo.com/> ) is probably the most complete hierarchical, topical index of websites, and also features a sophisticated search facility.
Lycos
(URL is <URL:http://fuzine.mt.cs.cmu.edu/mlm/lycos-home.html> ) is another web-indexing robot, which includes the ability to submit the URLs of your own documents by hand, ensuring that they are available for searching.
WebCrawler
(URL is <URL:http://webcrawler.com.html> ) builds an impressively complete index; on the other hand, since it indexes the content of documents, it may find many links that aren't exactly what you had in mind. However, it does a good job of sorting the documents it finds according to how closely they match your search.
World Wide Web Worm
(URL is <URL:http://www.cs.colorado.edu/home/mcbryan/WWWW.html> ) builds its index based on page titles and URL contents only. This is somewhat less inclusive, but pages it finds are more likely to be an exact match with your needs.
InfoSeek
(URL is <URL:http://www.infoseek.com/> ) is a commercial search service that also offers a free web search facility (URL is <URL:http://www2.infoseek.com> ). Queries can include exact phrases, among other operations, and the commercial service can search more than just web pages (newsgroups, for instance). The commercial service charges 10 cents per query and offers a free trial to new users. (Increasing load on the free search servers makes this sound better every day.)
OpenText
(URL is <URL:http://www.opentext.com> ) also offers a robust web searching facility.
You can read about other search robots and the principles behind them in the robots section.
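
The WebCrawler and World Wide Web Worm entries above describe two different indexing strategies: indexing the full content of each document and ranking the matches, versus indexing only page titles and URLs. The Python sketch below is a toy illustration of that trade-off, not a description of how either service is actually implemented. The page data, the function names (tokenize, search_content, search_titles_and_urls), and the scoring rule (a simple count of query-term occurrences) are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical toy corpus of (url, title, body) tuples; none of these pages are real.
PAGES = [
    ("http://example.org/cats.html", "Cat care",
     "Feeding and grooming cats. Cats need fresh water."),
    ("http://example.org/dogs.html", "Dog care",
     "Dogs and cats can share a home."),
    ("http://example.org/birds.html", "Bird watching",
     "Spotting birds in the garden."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_content(pages, query):
    """Content-based search: match on title and body text, then rank
    pages by how many times the query terms occur (assumed scoring rule)."""
    terms = tokenize(query)
    scored = []
    for url, title, body in pages:
        counts = Counter(tokenize(title + " " + body))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, url))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


def search_titles_and_urls(pages, query):
    """Title/URL-only search: a page matches only if every query term
    appears in its title or its URL. The body text is never examined."""
    terms = tokenize(query)
    hits = []
    for url, title, _body in pages:
        words = set(tokenize(title + " " + url))
        if all(term in words for term in terms):
            hits.append(url)
    return hits


if __name__ == "__main__":
    print(search_content(PAGES, "cats"))
    print(search_titles_and_urls(PAGES, "cats"))
```

In this toy corpus, a content search for "cats" also returns the dog page, because its body mentions cats in passing, while the title/URL search returns only the page that is actually about cats. That is the same inclusiveness-versus-precision trade-off the two entries describe.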