Saturday 7 July 2012

Principle of Search Engine

Basic Principle of Search Engine

This is also called the architecture of a search engine.

(1) Spider: a browser-like program that downloads web pages.

  Difference between a spider and a browser

- A browser displays the information presented on each page (text, graphics, etc.),
  while a spider has no visual components and works directly with the underlying HTML code of the page.

(2) Crawler: a program that automatically follows all of the links on each web page downloaded by the spider.

(3) Indexer: a program that analyzes the web pages downloaded by the spider and followed by the crawler. It examines page elements such as text, headers, structural or stylistic features, special HTML tags, etc.

(4) Database: the storage area for the data that the search engine downloads and analyzes.
 It is sometimes called the index of the search engine.

(5) Result engine: extracts search results from the database and ranks pages. It determines which pages best match a user's query and in what order the pages should be listed. This is done according to the ranking algorithms of the search engine.

(6) Web server: the search engine's web server usually hosts an HTML page with an input field where the user can enter the search query he or she is interested in. The web server is also responsible for displaying search results to the user in the form of an HTML page.
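
The way the spider, crawler, indexer, database and result engine fit together can be sketched as a toy pipeline in Python. This is only an illustration under stated assumptions, not how any real search engine is built: the `PAGES` dictionary stands in for the web (a real spider would fetch pages over HTTP), the URLs and the names `PageParser`, `spider`, `crawl_and_index` and `search` are made up for this example, and the "ranking algorithm" is a plain word count.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML source (assumption for this sketch).
PAGES = {
    "http://example.test/": (
        "<html><head><title>Home</title></head><body>"
        "<h1>Search engines</h1>"
        '<a href="http://example.test/spider">spider</a>'
        "</body></html>"
    ),
    "http://example.test/spider": (
        "<html><body><h1>Spider</h1>"
        "<p>A spider downloads web pages.</p>"
        '<a href="http://example.test/">home</a>'
        "</body></html>"
    ),
}


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PageParser(HTMLParser):
    """Works on raw HTML: collects links (crawler) and words (indexer)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(tokenize(data))


def spider(url):
    """Spider: return the HTML of a page, or None if it does not exist."""
    return PAGES.get(url)


def crawl_and_index(start_url):
    """Crawler + indexer: follow links and build the database (the index)."""
    index = defaultdict(lambda: defaultdict(int))  # word -> url -> count
    queue = deque([start_url])
    seen = set()
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        html = spider(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index[word][url] += 1
        queue.extend(parser.links)  # follow every link found on the page
    return index


def search(index, query):
    """Result engine: rank pages by how often they contain the query words."""
    scores = defaultdict(int)
    for word in tokenize(query):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))


index = crawl_and_index("http://example.test/")
print(search(index, "spider"))
```

Here the page about the spider mentions the word twice and the home page once, so the spider page is listed first. A real result engine would use far more signals than word counts, and the web server would wrap the returned list in an HTML results page.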
