Search Engines

A search engine is an information retrieval system that allows someone to search the vast collection of resources on the Internet and the World Wide Web. All major search engines are similar in that keywords, phrases, or in some instances, questions, are entered in a search form. After clicking on the search command button, the database returns a collection of hyperlinks to resources that contain the search terms. These hyperlinks are listed in some sort of order, usually from most relevant to least relevant, or by how important the web pages are, depending on the search engine used. Search engines are composed of computer programs that create databases automatically. They should not be confused with human-built directories, such as Yahoo!, which depend on people for development and maintenance.

Search Engine Basics

Search engines have three components. The first is a computer program called a spider or robot, which gathers information on the Internet. The spider starts with an existing database of documents and follows the hyperlinks in them to gather new and updated resources to add to the list. If no hyperlink leads to a web page, the spider cannot find it on its own. Other resources that most spiders are unable to locate include files that are not written in Hypertext Markup Language (HTML) and content from specialized databases that require the user to fill out a search form. Spiders gather documents automatically, at intervals that differ from service to service.
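The link-following behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: the "web" is a hypothetical in-memory dictionary (TOY_WEB) rather than real pages, and crawl is an invented helper, not the code of any actual search engine.

```python
from collections import deque

# Hypothetical miniature "web": each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/home": ["a.example/news", "b.example/index"],
    "a.example/news": ["a.example/home"],
    "b.example/index": ["b.example/about"],
    "b.example/about": [],
    "c.example/orphan": ["a.example/home"],  # not reachable from the seed
}

def crawl(seed_urls, web):
    """Follow hyperlinks breadth-first from the seeds; return URLs in discovery order."""
    found = []
    seen = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to retrieve
        found.append(url)
        for link in web[url]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found

pages = crawl(["a.example/home"], TOY_WEB)
print(pages)
```

Starting from a single seed, the sketch reaches four pages by following links. The page c.example/orphan never appears in the result because no link on any visited page points to it.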

Second, the resources collected by the spider are loaded into a database that indexes them using a formula unique to each search engine. The index contains a copy of every web page the spider finds. People can also submit web pages to this database in case the spider fails to reach them quickly enough or no links lead to the pages. While most search engines claim to index the entire World Wide Web, none actually does. Although spiders have many different ways of collecting information from web pages, the major search engines all claim to index the entire text of each web document in their databases. This is called full-text indexing. Some search engines may not index common words such as "and," "a," "I," and "to." These are called stop words.
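As a rough sketch of full-text indexing with stop words, the example below builds a simple inverted index (a mapping from each word to the documents containing it). The inverted index is a common technique chosen here for illustration; the article does not say which structure any particular engine uses, and the documents, tokenize, and build_index are hypothetical.

```python
import re
from collections import defaultdict

# The stop words named in the text; real engines would likely use a longer list.
STOP_WORDS = {"and", "a", "i", "to"}

def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]

def build_index(documents):
    """Map every remaining word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

# Hypothetical two-document collection.
DOCS = {
    "doc1": "A guide to search engines and spiders",
    "doc2": "I like to search the web",
}

index = build_index(DOCS)
print(sorted(index["search"]))  # ['doc1', 'doc2']
print("to" in index)            # False: stop words are not indexed
```

Every word of each document other than the stop words ends up in the index, which is what makes a later keyword lookup a single dictionary access.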

The third part of the search engine is software that allows users to enter keywords in search forms using some type of search expression, with syntax that is supported by the search engine. The search results are then listed in order according to a ranking algorithm. Some search engines list results by relevancy, while others list them by how many web pages link to them, thereby showing the most important, or popular, web pages first, and others group results together by subject. Many search engines employ a combination of these.
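The two ranking approaches mentioned above can be contrasted in a small sketch. It assumes the simplest possible versions: relevancy as a raw count of query terms, and popularity as a count of inbound links. The documents, link counts, and function names are hypothetical, and real ranking algorithms are considerably more elaborate.

```python
def rank_by_relevancy(query_terms, documents):
    """Order documents by how often the query terms occur in their text."""
    def score(text):
        words = text.lower().split()
        return sum(words.count(term) for term in query_terms)

    scored = [(score(text), url) for url, text in documents.items()]
    # Highest score first; ties broken alphabetically; non-matching pages dropped.
    return [url for s, url in sorted(scored, key=lambda p: (-p[0], p[1])) if s > 0]

def rank_by_popularity(urls, inbound_links):
    """Order URLs by how many other pages link to them."""
    return sorted(urls, key=lambda u: (-inbound_links.get(u, 0), u))

# Hypothetical pages and inbound-link counts.
DOCS = {
    "x.example": "search engine search tips search",
    "y.example": "a search engine overview",
    "z.example": "cooking recipes",
}
INBOUND = {"x.example": 2, "y.example": 40}

hits = rank_by_relevancy(["search", "engine"], DOCS)
print(hits)                               # ['x.example', 'y.example']
print(rank_by_popularity(hits, INBOUND))  # ['y.example', 'x.example']
```

The same two hits come back in opposite orders: the keyword count favors the page that repeats the terms most often, while the link count favors the page that more sites point to.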

Search Features

It is important to understand the different search features available before beginning to use a search engine, as each engine has its own way of interpreting and manipulating search expressions. Because a search can retrieve many documents, it is common to have a large number of hits, but only a few that are relevant to the query submitted. This is called low precision/high recall. On the other hand, a searcher may be satisfied with having very precise search results, even if a very small set of hits is returned. This is defined as high precision/low recall. Ideally, the search engine would retrieve all of the relevant documents that are needed, and only those. This would be described as high precision/high recall. Search engines support many search features, though not all engines support each one. If they do support certain features, they may use different syntax in expressing them. Before using a search feature, the user should always check the search engine's help pages to understand how the feature is expressed, if it is supported at all. Some examples of search syntax and features used by search engines are: Boolean operators (and, or, not), implied Boolean operators (+ and -), phrase searching, natural language searching, proximity searching, truncation, and field searching.
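Precision and recall can be made concrete with a short calculation. The sketch below uses the standard definitions (precision is the share of retrieved documents that are relevant; recall is the share of relevant documents that were retrieved); the document ids and the precision_recall helper are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search result set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Suppose four documents in the whole collection are truly relevant.
RELEVANT = {"d1", "d2", "d3", "d4"}

# A broad search: ten hits, including all four relevant documents.
broad = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
print(precision_recall(broad, RELEVANT))   # (0.4, 1.0) low precision, high recall

# A narrow search: one hit, and it is relevant.
narrow = ["d1"]
print(precision_recall(narrow, RELEVANT))  # (1.0, 0.25) high precision, low recall
```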

Types of Search Engines

Search engines can be divided into three basic types: general or major search engines, meta-search engines, and specialty search engines. Each of the major search engines attempts to do the same thing, namely to index as much of the web as possible, so they handle a huge amount of data. Because of this tremendous amount of information, it is common for documents of little useful content to be picked up, which makes the quality of the ranking scheme very important. In most first-generation search engines, such as AltaVista and HotBot, results are ranked by relevancy. Relevancy is determined by algorithms that usually count how many times the keywords typed in the search form appear in the documents that exist in the database. Second-generation tools such as Vivisimo, Google, and Direct Hit use ranking algorithms that rely on techniques such as grouping and sorting results, the importance or popularity of web sites, and human judgment from prior searches.

Meta-search engines are tools that search more than one search engine or directory at once, compiling the results and consolidating them into an overall list. Examples of meta-search engines are Metacrawler, Vivisimo, and Search.com. One drawback of meta-search engines is that they do not include every available search engine, and they are unpredictable in how they handle complex searches. They can be useful for obscure searches.
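To illustrate how results from several engines might be consolidated into one list, the sketch below merges ranked lists by summing a reciprocal-rank score for each URL. This scoring rule is an assumption made for the example; the article does not describe how any actual meta-search engine combines results, and the engine names and result lists are hypothetical.

```python
def meta_search(result_lists):
    """Merge ranked result lists; each URL scores 1/rank per engine that returns it."""
    scores = {}
    for results in result_lists.values():
        for rank, url in enumerate(results, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))

# Hypothetical ranked results from two underlying engines.
RESULTS = {
    "engine_a": ["p1", "p2", "p3"],
    "engine_b": ["p2", "p4"],
}

merged = meta_search(RESULTS)
print(merged)  # ['p2', 'p1', 'p4', 'p3']
```

The page p2 rises to the top because both engines return it, and each duplicate appears only once in the merged list.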

Specialty search engines, or specialized databases, are search tools that focus on particular subjects or file types (e.g., images or music files). These tools can be time savers because their databases are much smaller and focused on a particular subject area or type of resource. For example, if a certain legal opinion is needed, a searcher would achieve greater success with FindLaw <http://www.findlaw.com> than by spending time in a major search engine such as AltaVista looking through perhaps hundreds of results.

Difficulties and Benefits of Major Search Engines

Search engines send their spiders to crawl the web only periodically, so their databases may be updated infrequently and new sites may not be added immediately. Specialty search engines may be better for very current, dynamically changing information, such as fast-breaking news stories. There is evidence that the major search engines recognize this problem and are starting to team with specialty services that provide recent news. For example, AltaVista uses the Moreover news service to provide users with news stories. Another difficulty is coverage: according to a 1999 study by Steve Lawrence and C. Lee Giles, no single search engine indexed more than about 16 percent of the web. Besides content that cannot be gathered by search engine spiders, such as dynamically generated web pages, pages that no hyperlinks lead to, and certain file types, there is also evidence that commercial sites are more often indexed than non-commercial sites. This part of the web that is hidden from the major search engines is often referred to as the invisible web.

Search engines, such as AltaVista, give users the means to search the Internet using keywords and phrases, yet also provide links to news, sports stories, stock quotes, and other information.

Another difficulty is that information found in major search engines has not been evaluated. The responsibility is placed upon the individual to evaluate what is found. These drawbacks should not detract from the benefits of these major search tools, however. Many general or major search engines, realizing the added benefit of human-managed information, include directories such as the Open Directory Project in conjunction with their computerized indexes. And some directories, such as Yahoo!, employ search engines to search the web when their directories fail to provide the resources needed by the searcher. The ability to search for obscure topics, multi-faceted subjects, and specific web pages and sites, in addition to information from specific dates, languages, news stories, images, and more, makes search engines necessary tools for the searcher to learn and use.

Popular Search Engines

Some of the most popular search engines include:

Karen Hartman

See also: Information Access; Information Overload; Information Retrieval; World Wide Web.

Bibliography

Ackermann, Ernest, and Karen Hartman. Internet and Web Essentials: What You Need to Know. Wilsonville, OR: Franklin, Beedle, and Associates, 2001.

Cohen, Laura. "Searching the Web: The Human Element Emerges." Choice Supplement 37 (2000): 17-30.

King, David. "Specialized Search Engines: Alternatives to the Big Guys." Online 24, no. 3 (2000): 67-74.

Lawrence, Steve, and C. Lee Giles. "Accessibility and Distribution of Information on the Web." Nature 400, no. 6740 (1999): 107-109.

Snow, Bonnie. "The Internet's Hidden Content and How to Find It." Online 24, no. 3 (2000): 61-66.

Internet Resources

Lawrence, Steve, and C. Lee Giles. "Accessibility and Distribution of Information on the Web." <http://www.wwwmetrics.com/>

Sullivan, Danny. "How Search Engines Work." SearchEngineWatch.com. <http://searchenginewatch.com/webmasters/work.html>

——. "Search Engine Features for Searchers." SearchEngineWatch.com. <http://searchenginewatch.com/facts/ataglance.html>

Copyrights
Search Engines from Macmillan Science Library: Computer Sciences. Copyright © 2001-2006 by Macmillan Reference USA, an imprint of the Gale Group. All rights reserved.