Web Search Engine

The document discusses web search engines, including how they work by crawling websites, indexing pages, and searching the index in response to user queries. It also provides examples of major search engines like Google, Bing, Yahoo, and others.

Uploaded by

khanehtsamkhan
Copyright
© Attribution Non-Commercial (BY-NC)

Web search engine

A web search engine is designed to search for information on the World Wide Web and FTP servers.
The search results are generally presented in a list of results and are often called hits. The information
may consist of web pages, images, information and other types of files. Some search engines also mine
data available in databases or open directories. Unlike Web directories, which are maintained by human
editors, search engines operate algorithmically or are a mixture of algorithmic and human input.
How web search engines work:-
A search engine operates in the following order:

1. Web crawling
2. Indexing
3. Searching

Web search engines work by storing information about many web pages, which they retrieve from the
pages' HTML. These pages are retrieved by a Web crawler (sometimes also known as a spider), an
automated Web browser which follows every link on the site. Exclusions can be made by the use
of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for
example, words are extracted from the titles, headings, or special fields called meta tags). Data about
web pages are stored in an index database for use in later queries. A query can be a single word. The
purpose of an index is to allow information to be found as quickly as possible. Some search engines, such
as Google, store all or part of the source page (referred to as a cache) as well as information about the
web pages, whereas others, such as AltaVista, store every word of every page they find. The cached
page always holds the actual search text, since it is the version that was indexed, so it can be very
useful when the content of the current page has been updated and the search terms are no longer in it.
This problem might be considered a mild form of linkrot. Google's handling of it increases usability
and satisfies the principle of least astonishment, since the user normally expects the search
terms to be on the returned pages. Increased search relevance makes these cached pages very useful,
even beyond the fact that they may contain data that is no longer available elsewhere.
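The robots.txt exclusion mentioned above can be sketched with Python's standard urllib.robotparser
module. This is only an illustration of how a polite crawler might check a URL before fetching it. The
robots.txt content, the example.com URLs, and the "ExampleBot" user-agent name are assumptions made for
the example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(url, user_agent="ExampleBot"):
    """Return True if the (assumed) robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    # parse() takes the lines of a robots.txt file, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("http://example.com/index.html"))
    print(allowed("http://example.com/private/report.html"))
```

A real crawler would download each site's own robots.txt instead of using a fixed string.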
When a user enters a query into a search engine (typically by using keywords), the engine examines
its index and provides a listing of best-matching web pages according to its
criteria, usually with a short summary containing the document's title and sometimes parts of the text.
The index is built from the information stored with the data and the method by which the information is
indexed. Unfortunately, there are currently no known public search engines that allow documents to be
searched by date. Most search engines support the use of the Boolean operators AND, OR and NOT to
further specify the search query. Boolean operators are for literal searches that allow the user to refine
and extend the terms of the search. The engine looks for the words or phrases exactly as
entered. Some search engines provide an advanced feature called proximity search, which allows users
to define the distance between keywords. There is also concept-based searching, which
uses statistical analysis of pages containing the words or phrases you search for. In addition,
natural language queries allow the user to type a question in the same form one would ask it of a
human. An example of such a site is ask.com.
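The indexing and Boolean searching described above can be illustrated with a toy inverted index. This is
a minimal sketch, not how any particular search engine is implemented. The three sample documents and
the helper names (tokenize, build_index, search_and, search_or, search_not) are invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical miniature "web": document id -> page text.
DOCS = {
    1: "Web crawlers follow links between web pages",
    2: "An index lets a search engine answer queries quickly",
    3: "Search engines rank pages by relevance",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)


def search_and(a, b):
    """Documents containing both words."""
    return INDEX.get(a, set()) & INDEX.get(b, set())


def search_or(a, b):
    """Documents containing at least one of the words."""
    return INDEX.get(a, set()) | INDEX.get(b, set())


def search_not(a, b):
    """Documents containing the first word but not the second."""
    return INDEX.get(a, set()) - INDEX.get(b, set())
```

Because the words are matched literally, "engine" and "engines" are different index entries here, which
is the behaviour the text describes for Boolean searches.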
The usefulness of a search engine depends on the relevance of the result set it
gives back. While there may be millions of web pages that include a particular word or phrase, some
pages may be more relevant, popular, or authoritative than others. Most search engines employ
methods to rank the results to provide the "best" results first. How a search engine decides which pages
are the best matches, and what order the results should be shown in, varies widely from one engine to
another. The methods also change over time as Internet usage changes and new techniques evolve.
There are two main types of search engine that have evolved: one is a system of predefined and
hierarchically ordered keywords that humans have programmed extensively. The other is a system that
generates an "inverted index" by analyzing texts it locates. This second form relies much more heavily
on the computer itself to do the bulk of the work.
Most Web search engines are commercial ventures supported
by advertising revenue and, as a result, some employ the practice of allowing advertisers to pay money
to have their listings ranked higher in search results. Those search engines which do not accept money
for their search engine results make money by running search-related ads alongside the regular search
engine results. The search engines make money every time someone clicks on one of these ads.
Some examples of Search engines:-
1. Ask.com (known as Ask Jeeves in the UK)
2. Baidu (Chinese, Japanese)
3. Bing (formerly MSN Search and Live Search)
4. Duck Duck Go
5. Google
6. Kosmix
7. Sogou (Chinese)
8. Yodao (Chinese)
9. Yahoo! Search
10. Yandex (Russian)
11. Yebol

www.ask.com

Ask.com is a search engine founded in 1996 and headquartered in Oakland, California, US. Its
founders are David Warthen and Garrett Gruener.
The original search engine software was implemented by Gary Chevsky from his own design.
Chevsky, Justin Grant, and others built the early AskJeeves.com website around that core engine. Three
venture capital firms, Highland Capital Partners, Institutional Venture Partners, and The RODA Group,
were early investors.[2] Ask.com is currently owned by InterActiveCorp under the NASDAQ symbol IACI.

History:-
Ask.com was originally known as Ask Jeeves, where "Jeeves" is the name of the "gentleman's personal
gentleman", or valet, fetching answers to any question asked. The character was based on Jeeves, Bertie
Wooster's fictional valet from the works of P. G. Wodehouse.
The original idea behind Ask Jeeves was to allow users to get answers to questions posed in
everyday, natural language, as well as traditional keyword searching. The current Ask.com still supports
this, with added support for math, dictionary, and conversion questions.
In 2005, the company announced plans to phase out Jeeves. On February 27, 2006, the character
disappeared from Ask.com, and was stated to be "going into retirement." The UK & Ireland edition of the
website prominently brought the character back in 2009; American visitors can go to the
'uk.ask.com' URL to see the new Jeeves as a 'skin', or background image.
InterActiveCorp owns a variety of sites including country-specific sites for the UK, Germany, Italy, Japan,
the Netherlands, and Spain, along with Ask Kids, Teoma (now ExpertRank) and several others.
On June 5, 2007, Ask.com relaunched with a 3D look.
On May 16, 2006, Ask implemented a "Binoculars Site Preview" in its search results. On search results
pages, the "Binoculars" let searchers get a sneak peek of the page they could visit, with a mouse-over
activating a screenshot pop-up.
In December 2007, Ask released the AskEraser feature, allowing users to opt out of tracking of search
queries and IP and cookie values. The company also vowed to erase this data after 18 months if the
AskEraser option is not set. HTTP cookies must be enabled for AskEraser to function.
On July 4, 2008, InterActiveCorp announced the acquisition of Lexico Publishing Group, which
owns Dictionary.com, Thesaurus.com, and Reference.com.
On July 26, 2010, Ask.com released a closed-beta Q&A service. The service was released to the public
on July 29, 2010.

Toolbar
The Ask.com Toolbar is a free internet browser toolbar from Ask.com, available for both the Internet
Explorer and Firefox web browsers. The toolbar is frequently confused with the MyWay Searchbar,
but the Ask Toolbar and the MyWay Searchbar are separate programs.
Features include web, image, news, and dictionary searches, a wide variety of US and international
content served in widgets, weather forecasts, RSS/ATOM feeds and related services.
The Ask Toolbar can be installed from the toolbar.ask.com website, but is also bundled with certain
third-party software. The installation of the Ask Toolbar is optional and always requires end user
consent (in the form of an "Opt-Out" check box) when bundled with other third-party software. The
Ask.com toolbar has been reported by some to contain spyware or adware, and has also been reported
by some to be a virus (it is not a virus, only a potentially unwanted program).[20] The program itself,
however, could be considered adware, because it is bundled as an ad with certain software installations.
Dekker's algorithm
Dekker's algorithm is the first known correct solution to the mutual exclusion problem in concurrent
programming. The solution is attributed to Dutch mathematician Th. J. Dekker by Edsger W. Dijkstra in his
manuscript on cooperating sequential processes.[1] It allows two threads to share a single-use resource
without conflict, using only shared memory for communication.
It avoids the strict alternation of a naive turn-taking algorithm, and was one of the first mutual exclusion
algorithms to be invented.
Pseudocode

    flag[0] := false
    flag[1] := false
    turn    := 0   // or 1

    p0:                                  p1:
        flag[0] := true                      flag[1] := true
        while flag[1] = true {               while flag[0] = true {
            if turn ≠ 0 {                        if turn ≠ 1 {
                flag[0] := false                     flag[1] := false
                while turn ≠ 0 {                     while turn ≠ 1 {
                }                                    }
                flag[0] := true                      flag[1] := true
            }                                    }
        }                                    }

        // critical section                  // critical section
        ...                                  ...
        turn := 1                            turn := 0
        flag[0] := false                     flag[1] := false
        // remainder section                 // remainder section

Each process indicates its intention to enter the critical section by setting its flag, which the other
process tests in its outer while loop. If the other process has not flagged intent, the critical section
can be entered safely irrespective of the current turn. Mutual exclusion is still guaranteed, as neither
process can become critical before setting its flag (implying that at least one process will enter the
while loop). This also guarantees progress, as waiting will not occur on a process which has withdrawn
its intent to become critical. Alternatively, if the other process's flag was set, the while loop is
entered and the turn variable establishes who is permitted to become critical. A process without
priority withdraws its intention to enter the critical section until it is given priority again (the
inner while loop). A process with priority breaks from the while loop and enters its critical section.
Dekker's algorithm guarantees mutual exclusion, freedom from deadlock, and freedom from starvation.
Let us see why the last property holds. Suppose p0 is stuck inside the "while flag[1]" loop forever. There
is freedom from deadlock, so eventually p1 will proceed to its critical section and set turn = 0 (and the
value of turn will remain unchanged as long as p0 doesn't progress). Eventually p0 will break out of the
inner "while turn ≠ 0" loop (if it was ever stuck on it). After that it will set flag[0] := true and settle down
to waiting for flag[1] to become false (since turn = 0, it will never do the actions in the while loop). The
next time p1 tries to enter its critical section, it will be forced to execute the actions in its "while flag[0]"
loop. In particular, it will eventually set flag[1] = false and get stuck in the "while turn ≠ 1" loop (since
turn remains 0). The next time control passes to p0, it will exit the "while flag[1]" loop and enter its
critical section.
If the algorithm were modified by performing the actions in the "while flag[1]" loop without checking if
turn = 0, then there is a possibility of starvation. Thus all the steps in the algorithm are necessary.
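The pseudocode above can be tried out as a small Python program. This is a sketch under stated
assumptions rather than a production lock: it relies on the standard CPython interpreter, whose global
interpreter lock makes reads and writes of the shared variables appear in program order (on hardware
with weaker memory ordering, a lower-level language would need memory barriers). The names lock, unlock,
worker and run, and the shared counter, are invented for the example. The calls to time.sleep(0) only
hand control to the other thread while busy-waiting.

```python
import threading
import time

flag = [False, False]  # flag[i] is True while thread i wants to enter
turn = 0               # which thread has priority when both want to enter
counter = 0            # shared resource protected by the algorithm


def lock(me):
    """Entry protocol of Dekker's algorithm for thread `me` (0 or 1)."""
    other = 1 - me
    flag[me] = True
    while flag[other]:
        if turn != me:
            flag[me] = False      # withdraw intent while the other has priority
            while turn != me:
                time.sleep(0)     # busy-wait, yielding to the other thread
            flag[me] = True
        else:
            time.sleep(0)         # we have priority: wait for the other to withdraw


def unlock(me):
    """Exit protocol: hand priority to the other thread, then drop the flag."""
    global turn
    turn = 1 - me
    flag[me] = False


def worker(me, rounds):
    global counter
    for _ in range(rounds):
        lock(me)
        value = counter           # critical section: read-modify-write
        counter = value + 1
        unlock(me)


def run(rounds=200):
    """Run two workers and return the final value of the shared counter."""
    global counter, turn
    counter, turn = 0, 0
    flag[0] = flag[1] = False
    threads = [threading.Thread(target=worker, args=(i, rounds)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

If mutual exclusion holds, run(200) returns 400, because no increment of the counter is lost.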

Google Search
Google Search or Google Web Search is a web search engine owned by Google Inc. and is the most-
used search engine on the Web. Google receives several hundred million queries each day through its
various services. The main purpose of Google Search is to hunt for text in webpages, as opposed to other
data, such as with Google Image Search. Google search was originally developed by Larry
Page and Sergey Brin in 1997.
Google Search provides at least 22 special features beyond the original word-search capability. These
include synonyms, weather forecasts, time zones, stock quotes, maps, earthquake data, movie
showtimes, airports, home listings, and sports scores. (see below: Special features). There are special
features for numbers, including ranges (70..73), prices, temperatures, money/unit conversions ("10.5 cm
in inches"), calculations ( 3*4+sqrt(6)-pi/2 ), package tracking, patents, area codes, and language
translation of displayed pages.
The order of search results (ghits for Google hits) on Google's search-results pages is based, in part, on
a priority rank called a "PageRank". Google Search provides many options for customized search (see
below: Search options), using Boolean operators such as: exclusion ("-xx"), inclusion ("+xx"), alternatives
("xx OR yy"), and wildcard ("x * x").

PageRank

Google's rise to success was in large part due to a patented algorithm called PageRank that helps rank
web pages that match a given search string. [9] Previous keyword-based methods of ranking search
results, used by many search engines that were once more popular than Google, would rank pages by
how often the search terms occurred in the page, or how strongly associated the search terms were
within each resulting page. The PageRank algorithm instead analyses human-generated links, assuming
that web pages linked from many important pages are themselves likely to be important. The algorithm
computes a recursive score for pages, based on the weighted sum of the PageRanks of the pages linking
to them. PageRank is thought to correlate well with human concepts of importance. In addition to
PageRank, Google over the years has added many other secret criteria for determining the ranking of
pages on result lists, reported to be over 200 different indicators. [10][11] The details are kept secret due to
spammers and in order to maintain an advantage over Google's competitors.
PageRank is a link analysis algorithm, named after Larry Page, used by the Google Internet search
engine that assigns a numerical weighting to each element of a hyperlinked set of documents, such as
the World Wide Web, with the purpose of "measuring" its relative importance within the set.
The algorithm may be applied to any collection of entities with reciprocal quotations and references. The
numerical weight that it assigns to any given element E is referred to as the PageRank of E and denoted
by PR(E).
Description of PageRank

"PageRank reflects our view of the importance of web pages by considering more than
500 million variables and 2 billion terms. Pages that we believe are important pages
receive a higher PageRank and are more likely to appear at the top of the search
results.

PageRank also considers the importance of each page that casts a vote, as votes from
some pages are considered to have greater value, thus giving the linked page greater
value. We have always taken a pragmatic approach to help improve search quality and
create useful products, and our technology uses the collective intelligence of the web to
determine a page's importance."

Algorithm
PageRank is a probability distribution used to represent the likelihood that a person randomly clicking on
links will arrive at any particular page. PageRank can be calculated for collections of documents of any
size. It is assumed in several research papers that the distribution is evenly divided among all documents
in the collection at the beginning of the computational process. The PageRank computations require
several passes, called "iterations", through the collection to adjust approximate PageRank values to
more closely reflect the theoretical true value.
A probability is expressed as a numeric value between 0 and 1. A 0.5 probability is commonly expressed
as a "50% chance" of something happening. Hence, a PageRank of 0.5 means there is a 50% chance that
a person clicking on a random link will be directed to the document with the 0.5 PageRank.
Simplified algorithm

Assume a small universe of four web pages: A, B, C and D. The initial approximation of PageRank would
be evenly divided between these four documents. Hence, each document would begin with an estimated
PageRank of 0.25.
In the original form of PageRank initial values were simply 1. This meant that the sum of all pages was
the total number of pages on the web. Later versions of PageRank (see the formulas below) would
assume a probability distribution between 0 and 1. Here a simple probability distribution will be used-
hence the initial value of 0.25.
If pages B, C, and D each link only to A, they would each confer 0.25 PageRank to A. All
PageRank PR( ) in this simplistic system would thus gather to A because all links would be pointing to A:

$$PR(A) = PR(B) + PR(C) + PR(D) = 0.75$$

Suppose instead that page B has a link to page C as well as to page A, while page D has links to all
three pages. The value of the link-votes is divided among all the outbound links on a page. Thus,
page B gives a vote worth 0.125 to page A and a vote worth 0.125 to page C. Only one third of D's
PageRank is counted for A's PageRank (approximately 0.083):

$$PR(A) = \frac{PR(B)}{2} + \frac{PR(C)}{1} + \frac{PR(D)}{3} \approx 0.458$$

In other words, the PageRank conferred by an outbound link is equal to the document's own
PageRank score divided by the normalized number of outbound links L( ) (it is assumed that
links to specific URLs only count once per document):

$$PR(A) = \frac{PR(B)}{L(B)} + \frac{PR(C)}{L(C)} + \frac{PR(D)}{L(D)}$$

In the general case, the PageRank value for any page u can be expressed as:

$$PR(u) = \sum_{v \in B_u} \frac{PR(v)}{L(v)}$$

i.e. the PageRank value for a page u is dependent on the PageRank values for each
page v out of the set B_u (this set contains all pages linking to page u), divided by the
number L(v) of links from page v.
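The four-page example can be checked with a few lines of Python. This is a minimal sketch of a single
pass of the simplified formula only (no damping factor and no special handling of pages without
outbound links), not Google's actual implementation. The function name pagerank_step and the LINKS and
INITIAL tables are made up for the illustration; the link structure is the one described in the text
(B links to A and C, C links to A, D links to A, B and C).

```python
def pagerank_step(ranks, out_links):
    """Apply PR(u) = sum over v in B_u of PR(v) / L(v) once to every page."""
    new_ranks = {page: 0.0 for page in ranks}
    for page, targets in out_links.items():
        if not targets:
            continue  # a page with no outbound links confers nothing in this sketch
        share = ranks[page] / len(targets)  # PR(v) / L(v)
        for target in targets:
            new_ranks[target] += share
    return new_ranks


# Link structure from the example: page -> pages it links to.
LINKS = {
    "A": [],
    "B": ["A", "C"],
    "C": ["A"],
    "D": ["A", "B", "C"],
}

# Initial approximation: PageRank evenly divided between the four pages.
INITIAL = {page: 0.25 for page in LINKS}

if __name__ == "__main__":
    for page, rank in sorted(pagerank_step(INITIAL, LINKS).items()):
        print(page, round(rank, 3))
```

After one pass, page A receives 0.125 from B, 0.25 from C and about 0.083 from D, which matches the
arithmetic in the text.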

Search results
The exact percentage of the total of web pages that Google indexes is not known, as it is very hard to
calculate. Google not only indexes and caches web pages but also takes "snapshots" of other file
types, which include PDF, Word documents, Excel spreadsheets, Flash SWF, plain text files, and so on.[12]
Except in the case of text and SWF files, the cached version is a conversion to (X)HTML, allowing
those without the corresponding viewer application to read the file.
Users can customize the search engine by setting a default language, using the "SafeSearch" filtering
technology, and setting the number of results shown on each page. Google has been criticized for placing
long-term cookies on users' machines to store these preferences, a tactic which also enables it to
track a user's search terms and retain the data for more than a year. For any query, up to the first 1000
results can be shown, with a maximum of 100 displayed per page. The ability to specify the number of
results is available only if "Instant Search" is not enabled. If "Instant Search" is enabled, only 10 results
are displayed, regardless of this setting.

Google optimization

Since Google is the most popular search engine, many webmasters have become eager to influence
their website's Google rankings. An industry of consultants has arisen to help websites increase their
rankings on Google and on other search engines. This field, called search engine optimization, attempts
to discern patterns in search engine listings, and then develop a methodology for improving rankings to
draw more searchers to clients' sites.
Search engine optimization encompasses both "on page" factors (like body copy, title
elements, H1 heading elements and image alt attribute values) and "off page" factors
(like anchor text and PageRank). The general idea is to affect Google's relevance algorithm by
incorporating the keywords being targeted in various places "on page", in particular the title element and
the body copy (note: the higher up in the page, presumably the better its keyword prominence and thus
the ranking). Too many occurrences of the keyword, however, cause the page to look suspect to
Google's spam checking algorithms.
Google has published guidelines for website owners who would like
to raise their rankings when using legitimate optimization consultants.
Functionality

Google search consists of a series of localized websites. The largest of those, the google.com site, is the
most-visited website in the world.[15] Some of its features include a definition link for most searches
including dictionary words, the number of results returned for a search, links to other searches (e.g. for
words that Google believes to be misspelled, it provides a link to the search results using its proposed
spelling), and many more.
Search syntax
Google's search engine normally accepts queries as simple text, and breaks up the user's text into a
sequence of search terms, which will usually be words that are to occur in the results. One can also
use Boolean operators and other syntax, such as quotation marks (") for a phrase, a prefix such as "+"
or "-" for qualified terms, or one of several advanced operators, such as "site:". The webpages of
"Google Search Basics" describe each of these additional queries and options (see below: Search options).
Google's Advanced Search web form gives several additional fields which may be used to qualify
searches by such criteria as date of first retrieval. All advanced queries transform to regular queries,
usually with additional qualified terms.
Query expansion
Google applies query expansion to the submitted search query, transforming it into the query that will
actually be used to retrieve results. As with page ranking, the exact details of the algorithm Google uses
are deliberately obscure, but certainly the following transformations are among those that occur:

• Term reordering: in information retrieval this is a standard technique to reduce the work involved
in retrieving results. This transformation is invisible to the user, since the results ordering uses the
original query order to determine relevance.
• Stemming is used to increase search quality by keeping small syntactic variants of search terms.[16]
• There is a limited facility to fix possible misspellings in queries.
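Term reordering, the first transformation listed above, can be sketched as follows. Since the details of
Google's algorithm are not public, this only shows one common textbook approach: process the rarest
term first so that the set of candidate documents shrinks as early as possible. The DOC_FREQ table of
document frequencies and the function name reorder_terms are hypothetical.

```python
# Hypothetical document frequencies: term -> number of indexed pages containing it.
DOC_FREQ = {"the": 9000, "search": 1200, "engine": 800, "dekker": 3}


def reorder_terms(terms, doc_freq=DOC_FREQ):
    """Return the query terms ordered from rarest to most common.

    Terms missing from the table are treated as occurring in no documents,
    so they are looked up first and can end the search early.
    """
    return sorted(terms, key=lambda term: doc_freq.get(term, 0))


if __name__ == "__main__":
    print(reorder_terms(["the", "search", "dekker", "engine"]))
```

The original term order would still be kept separately for relevance ranking, as the text notes.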

Search options
The webpages maintained by the Google Help Center have text describing more than 15 various search
options. The Google operators:

• OR – Search for either one, such as "price high OR low" searches for "price" with "high" or "low".
• "-" – Search while excluding a word, such as "apple -tree" searches where word "tree" is not used.
• "+" – Force inclusion of a word, such as "Name +of +the Game" to require the
words "of" & "the" to appear on a matching page.
• "*" – Wildcard operator to match any words between other specific words.

Some of the query options are as follows:

• define: – The query prefix "define:" will provide a definition[21] of the words listed after it.
• stocks: – After "stocks:" the query terms are treated as stock ticker symbols[21] for lookup.
• site: – Restrict the results to those websites in the given domain,[21] such as
site:www.acmeacme.com. The option "site:com" will search all domain URLs named with ".com" (no
space after "site:").
• allintitle: – Only the page titles are searched[21] (not the remaining text on each webpage).
• intitle: – Prefix to search in a webpage title,[21] such as "intitle:google search" will list pages with
word "google" in title, and word "search" anywhere (no space after "intitle:").
• allinurl: – Only the page URL address lines are searched[21] (not the text inside each webpage).
• inurl: – Prefix for each word to be found in the URL;[21] other words are matched anywhere, such
as "inurl:acme search" matches "acme" in a URL, but matches "search" anywhere (no space after
"inurl:").

The page-display options (or query types) are:

• cache: – Highlights the search-words within the cached document, such as
"cache:www.google.com xxx" shows cached content with word "xxx" highlighted.
• link: – The prefix "link:" will list webpages that have links to the specified webpage, such as
"link:www.google.com" lists webpages linking to the Google homepage.
• related: – The prefix "related:" will list webpages that are "similar" to a specified web page.
• info: – The prefix "info:" will display some background information about one specified webpage,
such as info:www.google.com. Typically, the info is the first text (160 bytes, about 23 words)
contained in the page, displayed in the style of a results entry (for just the 1 page as matching the
search).
• filetype: – Results will only show files of the desired type (e.g. filetype:pdf will return PDF files).

Note that Google searches the HTML coding inside a webpage, not the screen appearance: the words
displayed on a screen might not be listed in the same order in the HTML coding.
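To make the operator syntax above concrete, the following sketch splits a query string into plain terms,
quoted phrases, "+" and "-" terms, and "prefix:" options. It is an illustration of the syntax as
described in this section, not Google's parser. The function name parse_query and the result layout are
invented, and the OR and "*" operators are deliberately left unhandled to keep the example short.

```python
import re

# Prefixes listed in this section.
PREFIXES = {
    "define", "stocks", "site", "allintitle", "intitle", "allinurl",
    "inurl", "cache", "link", "related", "info", "filetype",
}


def parse_query(query):
    """Classify each token of a query string according to the operators above."""
    parsed = {"terms": [], "phrases": [], "required": [], "excluded": [], "options": {}}
    # A token is either a double-quoted phrase or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        prefix, colon, value = token.partition(":")
        if token.startswith('"'):
            parsed["phrases"].append(token.strip('"'))
        elif token.startswith("-") and len(token) > 1:
            parsed["excluded"].append(token[1:])
        elif token.startswith("+") and len(token) > 1:
            parsed["required"].append(token[1:])
        elif colon and prefix in PREFIXES and value:
            # No space is allowed after the colon, so the value is in the same token.
            parsed["options"][prefix] = value
        else:
            parsed["terms"].append(token)
    return parsed


if __name__ == "__main__":
    print(parse_query('apple -tree site:www.acmeacme.com "name of the game" +the'))
```

For the sample query, "apple" is a plain term, "tree" is excluded, "the" is required, the quoted text is
a phrase, and the site option is set to www.acmeacme.com.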

Bing (search engine)

Bing (formerly Live Search, Windows Live Search, and MSN Search) is the current web search
engine (advertised as a "decision engine")[2] from Microsoft. Bing was unveiled by Microsoft CEO Steve
Ballmer on May 28, 2009 at the All Things Digital conference in San Diego. It went fully online on June 3,
2009,[3] with a preview version released on June 1, 2009.
Notable changes include the listing of search suggestions as queries are entered and a list of related
searches (called "Explorer pane") based on[4] semantic technology from Powerset, which Microsoft
purchased in 2008.[5] As of June 2010, Bing is the third largest search engine on the web by query
volume, at 3.24%, after its competitor Google at 84.80% and Yahoo at 6.19%, according to Net
Applications.[6]
On July 29, 2009, Microsoft and Yahoo! announced a deal in which Bing would power Yahoo! Search.

MSN/Windows Live Search engine:-

MSN Search was a search engine by Microsoft that comprised a search engine, index, and web
crawler. MSN Search first launched in the third quarter of 1998 and used search results from Inktomi. In
early 1999, MSN Search launched a version which displayed listings from Looksmart blended with results
from Inktomi, except for a short time in 1999 when results from AltaVista were used instead. Microsoft
later upgraded MSN Search to provide its own self-built search engine results, the index of which
was updated weekly and sometimes daily. The upgrade started as a beta program in November 2004,
and came out of beta in February 2005. Image search was powered by a third party, Picsearch. The
service also started providing its search results to other search engine portals in an effort to better
compete in the market.
Windows Live Search

The first public beta of Windows Live Search was unveiled on March 8, 2006, with the final release on
September 11, 2006 replacing MSN Search. The new search engine used search tabs that include Web,
news, images, music, desktop, local, and Microsoft Encarta.
In the roll-over from MSN Search to Windows Live Search, Microsoft stopped using Picsearch as its
image search provider and started performing its own image search, fueled by its own internal
image search algorithms.[8]
Live Search

On March 21, 2007, Microsoft announced that it would separate its search developments from
the Windows Live services family, rebranding the service as Live Search. Live Search was integrated into
the Live Search and Ad Platform headed by Satya Nadella, part of Microsoft's Platform and Systems
division. As part of this change, Live Search was merged with Microsoft adCenter.[9]
A series of reorganisations and consolidations of Microsoft's search offerings were made under the Live
Search branding. On May 23, 2008, Microsoft announced the discontinuation of Live Search
Books and Live Search Academic and integrated all academic and book search results into regular
search; as a result, the Live Search Books Publisher Program was also closed. Soon
after, Windows Live Expo was discontinued on July 31, 2008. Live Search Macros, a service for users to
create their own custom search engines or use macros created by other users, was also discontinued
shortly after. On May 15, 2009, Live Product Upload, a service which allowed merchants to upload
product information onto Live Search Products, was discontinued. The final reorganisation came as Live
Search QnA was rebranded as MSN QnA on February 18, 2009; however, it was subsequently
discontinued on May 21, 2009.
Microsoft recognised that there would be a brand issue as long as the word "Live" remained in the name.
In an effort to create a new identity for Microsoft's search services, Live Search was officially replaced by
Bing on June 3, 2009.
Live Search
Since 2006, Microsoft had conducted a number of tie-ins and promotions for Microsoft's
search offerings. These include:
• Amazon's A9 search service and the experimental Ms. Dewey interactive search site syndicated
all search results from Microsoft's then search engine, Live Search. This tie-in started on May 1, 2006.
• Search and Give – a promotional website launched on January 17, 2007 where all searches done
from a special portal site would lead to a donation to the UNHCR's organization for refugee children,
ninemillion.org. Reuters AlertNet reported in 2007 that the amount to be donated would be $0.01 per
search, with a minimum of $100,000 and a maximum of $250,000 (equivalent to 25 million
searches).[30] According to the website, the service was decommissioned on June 1, 2009, having
donated over $500,000 to charity and schools.[31]
• Club Bing – a promotional website where users can win prizes by playing word games that
generate search queries on Microsoft's then search service Live Search. This website began in April
2007 as Live Search Club.
• Big Snap Search – a promotional website similar to Live Search Club. This website began in
February 2008, but was discontinued shortly after.[32]
• Live Search SearchPerks! – a promotional website which allowed users to redeem tickets for prizes
while using Microsoft's search engine. This website began on October 1, 2008 and was
decommissioned on April 15, 2009.

Yahoo search engine:-

Yahoo! Search is a web search engine owned by Yahoo! Inc. As of December 2009, it was the second
largest search engine on the web by query volume, at 6.29%, after its competitor Google at 85.35% and
before Bing at 3.27%, according to Net Applications.
Originally, Yahoo! Search started as a web directory of other websites, organized in a hierarchy, as
opposed to a searchable index of pages. In the late 1990s, Yahoo! evolved into a full-fledged portal with a
search interface and, by 2007, a limited version of selection-based search.
Yahoo! Search, originally referred to as the Yahoo!-provided Search interface, would send queries to a
searchable index of pages supplemented with its directory of sites. The results were presented to the
user under the Yahoo! brand. Originally, none of the actual web crawling and storage/retrieval of data
was done by Yahoo! itself. In 2001 the searchable index was powered by Inktomi and later was powered
by Google until 2004, when Yahoo! Search became independent.
On July 29, 2009, Microsoft and Yahoo! announced a deal in which Bing would power Yahoo! Search. All
Yahoo! Search global customers and partners are expected to be transitioned by early 2012.

Search technology acquisition


In 2002, Yahoo! bought Inktomi, a "behind the scenes" or OEM search engine provider whose results are
shown on other companies' websites and which had powered Yahoo! in its earlier days. In 2003, it
purchased Overture Services, Inc., which owned the AlltheWeb and AltaVista search engines. Initially,
even though Yahoo! owned multiple search engines, it didn't use them on the main yahoo.com
website, but kept using Google's search engine for its results.
Starting in 2003, Yahoo! Search became its own web crawler-based search engine, with a reinvented
crawler called Yahoo! Slurp. Yahoo! Search combined the capabilities of all the search engine companies
they had acquired, with its existing research, and put them into a single search engine. The new search
engine results were included in all of Yahoo!'s sites that had a web search function. Yahoo! also started
to sell the search engine results to other companies, to show on their own web sites. Their relationship
with Google was terminated at that time, with the former partners becoming each other's main
competitors.
In October 2007, Yahoo! Search was updated with a more modern appearance in line with the
redesigned Yahoo! home page. In addition, Search Assist was added, which provides real-time query
suggestions and related concepts as the query is typed.
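
Real-time query suggestion of the kind described for Search Assist can be illustrated with a prefix lookup over a sorted list of known queries. This is only a minimal sketch: the query list, the function name `suggest` and the `limit` parameter are assumptions made for illustration, and it does not describe how Yahoo! actually implemented the feature.

```python
from bisect import bisect_left


def suggest(prefix, sorted_queries, limit=5):
    """Return up to `limit` stored queries that start with `prefix`.

    `sorted_queries` must be a sorted list of lowercase strings.
    """
    prefix = prefix.lower()
    # Binary search finds the first entry that is >= the prefix.
    start = bisect_left(sorted_queries, prefix)
    matches = []
    for query in sorted_queries[start:]:
        if not query.startswith(prefix):
            break  # sorted order means no later entry can match
        matches.append(query)
        if len(matches) == limit:
            break
    return matches


# Hypothetical query log used only for this example.
QUERIES = sorted([
    "web crawler",
    "web directory",
    "web search engine",
    "weather",
    "yahoo search",
])

print(suggest("web", QUERIES))
```

Because the list is sorted, each keystroke costs one binary search plus a short scan, which is why a lookup like this can be repeated as the user types.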
In July 2008, Yahoo! Search announced the introduction of a new service called "Build Your Own Search
Service," or BOSS. This service opens the doors for developers to use Yahoo!'s system for indexing
information and images and create their own custom search engine.[5]
In July 2009, Yahoo! signed a deal with Microsoft, the result of which was that Yahoo! Search would be
powered by Bing. This is now in effect.

Yahoo! Search blog & announcements


The team at Yahoo! Search frequently blogged about search announcements, features, updates and
enhancements. The Yahoo! Search Blog, as stated, provided "A look inside the world of search from the
people at Yahoo!".[6] This included index updates named Weather Updates and their Yahoo! Search
Assist feature.

Guruji.com

Guruji.com is an Indian Internet search engine that is focused on providing better search results to
Indian consumers, by leveraging proprietary algorithms and data in the Indian context.

Search products

In addition to its tool for searching webpages, Guruji Search also provides services for searching
images, services within a specific city, music, cricket, movie timing and finance (world markets and
such). These are linked from the main search page.

Origin of name

The name Guruji.com is derived from the Sanskrit word for teacher, Guru. The search engine aims at
giving its "students" (its users) what they require. Guruji.com returns pages from India as search
results, positioning itself as a search engine by Indians, for Indians. Guruji.com was launched to the
world on October 16, 2006.
AltaVista

AltaVista is a web search engine owned by Yahoo!. AltaVista was once one of the most popular search
engines, but its popularity declined with the rise of Google.

Origins
AltaVista was created by researchers at Digital Equipment Corporation's Western Research Laboratory
who were trying to provide services to make finding files on the public network easier. Although there is
some dispute about who was responsible for the original idea,[2] two key participants were Louis Monier,
who wrote the crawler, and Michael Burrows, who wrote the indexer. The name AltaVista was chosen in
relation to the surroundings of their company at Palo Alto. AltaVista was publicly launched as an internet
search engine on 15 December 1995 at altavista.digital.com.
At launch, the service had two innovations that set it ahead of the other search engines: a fast,
multi-threaded crawler (Scooter), which could cover many more Web pages than were believed to exist at
the time, and an efficient search back-end running on advanced hardware.
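
The idea of a multi-threaded crawler can be sketched with a thread pool that fetches every page in the current frontier concurrently. This is an illustration only and not Scooter's design: the in-memory `FAKE_WEB` link graph stands in for real HTTP fetches, and the names `fetch_links` and `crawl` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical link graph standing in for the Web: page -> outgoing links.
FAKE_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}


def fetch_links(url):
    """Pretend to download a page and return the links found on it."""
    return FAKE_WEB.get(url, [])


def crawl(seed, workers=4):
    """Breadth-first crawl from `seed`, fetching each frontier in parallel."""
    seen = {seed}
    frontier = [seed]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # Worker threads fetch all frontier pages at the same time.
            results = pool.map(fetch_links, frontier)
            next_frontier = []
            # Only the main thread touches `seen`, so no lock is needed.
            for links in results:
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return seen


print(sorted(crawl("a")))
```

With real network fetches, most of the time is spent waiting on remote servers, which is where running many fetches at once pays off.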

LYCOS

Lycos is a search engine and web portal established in 1994. Lycos also encompasses a network of
email, web hosting, social networking, and entertainment websites.
Lycos began as a research project by Michael Loren Mauldin of Carnegie Mellon University in
1994. Lycos Inc. was formed with approximately US$2 million in venture capital funding from CMGI. Bob
Davis became the CEO and first employee of the new company in 1995. After unsuccessfully attempting
to turn the business into a software company selling an enterprise version of the search software, Davis
concentrated on building the company into an advertising-supported web portal. Lycos enjoyed several
years of growth during the 1990s and became the most visited online destination in the world in 1999,
with a global presence in more than 40 countries.
ALL THE WEB

AlltheWeb is an Internet search engine that made its debut in mid-1999. It grew out of FTP Search, Tor
Egge's doctorate thesis at the Norwegian University of Science and Technology, which he started in
1994 and which in turn resulted in the formation of Fast Search and Transfer, established on July 16,
1997.[1] It was used primarily as a showpiece site for FAST's enterprise search engine. Although at one
time rivaling Google in size and technology,[2] AlltheWeb never became as popular.
When AlltheWeb started in 1999, Fast Search and Transfer aimed to provide their database to other
search engines, copying the successful case of Inktomi. Indeed, in January 2000, Lycos used their results
in the Lycos PRO search. By that time, the AlltheWeb database had grown from 80 million URIs to 200
million. Their aim was to index all the publicly accessible web. Their crawler indexed over 2 billion pages
by June 2002[2] and started a fresh round of the search engine size war. Before their purchase by Yahoo!,
the database contained about 3.3 billion URIs.

BAIDU

Baidu, Inc. (Chinese: 百度; pinyin: Bǎidù; NASDAQ: BIDU), simply known as Baidu and incorporated on
January 18, 2000, is a Chinese search engine for websites, audio files, and images. Baidu offers 57
search and community services including Baidu Baike, an online collaboratively built encyclopedia, and a
searchable keyword-based discussion forum.[4] Baidu was established in 2000 by co-founders Robin
Li and Eric Xu. Both co-founders are Chinese nationals who studied and worked overseas
before returning to China. Baidu.com Inc. is registered in the Cayman Islands.[5] In April 2010, Baidu
ranked 7th overall in Alexa's internet rankings.[6] In December 2007, Baidu became the first Chinese
company to be included in the NASDAQ-100 index.
Name

“Many people have asked about the meaning of our name. 'Baidu' was inspired by
a poem written more than 800 years ago during the Song Dynasty. The poem
compares the search for a retreating beauty amid chaotic glamour with the search
for one's dream while confronted by life's many obstacles. '...hundreds and
thousands of times, for her I searched in chaos, suddenly, I turned by chance, to
where the lights were waning, and there she stood.' Baidu, whose literal meaning
is hundreds of times, represents persistent search for the ideal.”

Domain name hack


On 12 January 2010, Baidu.com's DNS records in the United States were altered such that visitors to
baidu.com were redirected to a website purporting to be the Iranian Cyber Army, thought to be behind
the attack on Twitter during the 2009 Iranian election protests, making the actual site unusable for four
hours.[21] Internet users were met with a page saying "This site has been attacked by Iranian Cyber
Army".[22] Chinese hackers later responded by attacking Iranian websites and leaving messages.[23] Baidu
later launched legal action against Register.com for gross negligence after it was revealed that
Register.com's technical support staff changed the email address for Baidu.com at the request of an
unnamed individual, even though that individual had failed security verification procedures.

DAUM

Daum (Korean: 다음; KRX: 035720) is a popular web portal in South Korea, like Naver and Nate. Daum
offers many Internet services to web users, including a popular free web-based e-mail, messaging
service, forums, shopping and news. The word "daum" means "next" in Korean.
The popularity of Daum is a reflection of the high level of internet use in South Korea. The country has
the highest level of broadband users, and one of the most widespread levels of computer and Internet
access.
The popularity of Daum stems from the range of services it offers, but also from the fact that it was the
first Korean web portal of significant size. Its popularity started when it merged with the then most
popular e-mail service, daum.net or hanmail.net. After the merger, Daum started the forum
service Daum Cafe, which secured its firm position in the market. The term cafe, and even internet
cafe (which differs from its meaning in Western usage), is now used as a synonym for
"Internet forum" in Korean.
On August 2, 2004, Daum announced the purchase of Lycos for $95.4 million, and closed the transaction
on October 6.[1]
Daum had about 874 employees as of March 2009.
DOGPILE

Dogpile is a metasearch engine that fetches results from Google, Yahoo!, Bing, Ask.com, About.com and
several other popular search engines, including those from audio and video content providers. It is a
registered trademark of InfoSpace, Inc.

History
Dogpile began operation in November 1996. The site was created and developed by Aaron Flin and later
sold to Go2net (which was in turn acquired by InfoSpace).
The Dogpile search engine earned the J.D. Power and Associates award for best Residential Online
Search Engine Service in both 2006 and 2007. Dogpile started a campaign in 2008 to use proceeds from
site traffic to raise US$1 million for animals in need. In July 2010, Dogpile was ranked the 770th most
popular website in the U.S., and 2548th most popular in the world by Alexa. Quantcast estimated 2.0
million unique U.S. visitors a month, and Compete estimated 1,953,280.
Metasearch
Dogpile is a metasearch site — it searches multiple engines, filters for duplicates and then presents the
results to the user. Dogpile uses multiple popular search engines, as well as sponsored links.
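
The metasearch steps described above (query several engines, filter duplicates, present one list) can be sketched as a merge over ranked result lists. This is a minimal illustration under stated assumptions, not Dogpile's actual method: the interleaving order, the URL normalization rules and the names `normalize` and `merge_results` are all hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to a comparison key (a deliberate simplification).

    Ignores the scheme, a leading "www." and a trailing slash.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(result_lists):
    """Interleave ranked lists from several engines, dropping duplicates."""
    merged = []
    seen = set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        # Take the rank-th hit from each engine in turn.
        for results in result_lists:
            if rank < len(results):
                key = normalize(results[rank])
                if key not in seen:
                    seen.add(key)
                    merged.append(results[rank])
    return merged


# Made-up result lists standing in for two underlying engines.
engine_a = ["http://example.com/", "http://example.org/a"]
engine_b = ["http://www.example.com", "http://example.net/b"]

print(merge_results([engine_a, engine_b]))
```

Interleaving by rank keeps each engine's top hits near the top of the combined list; a real metasearch site would likely also weight engines and mix in sponsored links.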
DUCK DUCK GO

Duck Duck Go is a search engine based in Valley Forge, Pennsylvania that uses information from crowd-
sourced sites (like Wikipedia) with the aim of augmenting traditional results and improving relevance.
The search engine's philosophy emphasizes privacy, and it does not record user information.
History
Duck Duck Go was founded by Gabriel Weinberg, a serial entrepreneur whose last venture (The Names
Database) was acquired by United Online (NASDAQ:UNTD) in 2006 for $10M.[2] Weinberg has a B.S. in
Physics and an M.S. in Technology and Policy from MIT.[3] The project was initially self-funded by
Weinberg and is intended to be advertising supported.[4] The search engine is written in Perl, uses an
Nginx web server and runs primarily on FreeBSD.[1]
Duck Duck Go is built primarily upon search APIs from major vendors (such as Yahoo! Search BOSS).
Because of this, TechCrunch characterized the service as a "hybrid" search engine.
EXALEAD

Exalead (pronounced /ɛɡˈzæliːd/ in English) is a software company that provides search platforms
and search-based applications (SBA) [1][2] for consumer and business users. The company is
headquartered in Paris, France, and is a subsidiary of Dassault Systèmes [3] (French pronunciation:
[daˈso], Euronext: DSY).

Public Search Engine & Exalabs


The CloudView product is also the platform for the public Web search engine, which was designed to
apply semantic processing and faceted navigation to Web data volumes and usage. Exalead also
operates an online R&D laboratory, Exalabs, which uses the Web as a medium for developing applied
technologies for business. Exalabs projects include:

 Miiget and Constellations, relationship mapping search engines (semantic mining and
visual modes of presenting search results)

 Wikifier, a module that incorporates Wikipedia information into web page content (data
contextualization for unstructured content)

 Voxalead, an engine for searching within the content of videos (speech-to-text
transcription technologies)

 Chromatik, a color-based image search engine (semantic multimedia search technologies)

 Tweepz, a search engine for finding people who are using Twitter (social search)

 Sourcier, a map-based service providing access to public data on subterranean water
quality (map-centered information access)

Many Exalabs projects are developed in conjunction with Exalead's partners in the Quaero [4][5] project.
History
Exalead was founded in 2000 by François Bourdoncle and Patrice Bertin (both of whom were involved in
the development of the AltaVista search engine), and began commercializing its products in 2005.
Exalead employs approximately 140 people in 6 countries.
Rest of the A-Z search engines in brief:-
A9.com: It is a search engine from Amazon.com. Its search results are derived from Live Search and Amazon.com.
Apart from having the features that many popular search engines have, A9.com includes results from the books on
Amazon.com and has a unique ability of including searches inside books.

Accoona: This is an Internet company that is based in New Jersey whose main product is a search engine. The unique
feature of their search engine is its capability of using artificial intelligence for refining its searches. The search engine
also offers business profile searches.

Alleba: Designed in 2001 by Andrew dela Serna, a Filipino web designer, Alleba is a Philippine search engine that has
helped many Philippine websites gain exposure on the Internet.

AltaVista: This is the name of a search engine company and also of its search engine product. In December 1995, the
company launched the AltaVista search engine, which offers excellent search facilities as well as free translation services.

Amatomu: It is a blog search engine that mainly retrieves blogs generated in South Africa.

Ansearch.com.au: This search engine is the main product of Ansearch, an Internet company in Australia. It was
previously known as Mysearch.com.au. It focuses on the quality of content, profiling whole websites rather than merely
listing web pages, and it uses popularity-driven ranking.

Araby.com: It is an Arabic search engine that is owned by Maktoob Inc, an Arab Internet services company.

Ask.com: The original idea behind this search engine was to provide the users with answers to their day-to-day
questions. This search engine includes the searches from sites like Ask for Kids, Excite and MyWay.com. Formerly
known by the name Ask Jeeves, this search engine is one of the very popular search engines today.

AskMeNow: It is an American public corporation that offers mobile search and mobile advertising facilities to its users.
Its efficient mobile search facilities provide relevant answers to user queries posted in natural language.

Baidu: It is one of the popular Chinese search engines offering searches on websites, audio files and images. It is
packaged with an encyclopedia and a discussion-based forum. It attracts millions of visitors every year.

Blinkx: Blinkx is an Internet search engine that is used for searching audio and video content. It uses speech
recognition techniques to make audio and video media on the Internet searchable.

Bloglines: It is a web-based news aggregator founded by Mark Fletcher, former CEO of ONElist. It was sold to
Ask.com in 2005.

BlogScope: It is a search engine meant for searching blogs and had indexed about 120 million articles by
2007.

Bioinformatic Harvester: It is a bioinformatic metasearch engine that employs page-ranking strategies similar to
Google.

Bixee.com: It is a job search engine in India that was started in 2005.

Brainboost: It is a metasearch engine that uses machine learning and natural language processing techniques
and offers answers to questions put in natural language.

BTJunkie: It is a BitTorrent search engine that searches torrent files.

Business.com: It is an Internet search engine and web directory that targets the executive and corporate management
audiences. It focuses on the business-to-business market and employs pay-per-click advertising.

CareerBuilder.com: This is the largest online job site in the United States that offers career search options.

ChaCha: It is a mobile answers service that uses human guidance in answering questions. It is based in Indiana and
was created by Scott A. Jones, an inventor and entrepreneur and Brad Bostic, the chairman of Bostech Corporation.

ChunkIt!: It is a personal search engine that is available in the form of a downloadable add-on for browsers like Internet
Explorer and Firefox.

Clusty: It is a metasearch engine that was developed by Vivisimo.


Cuil: This search engine organizes the search results by the content of the pages and displays long entries with
images for every result. This search engine, which was developed by some former employees of Google, claims
to implement a privacy policy wherein the users’ search activities are not stored.

DataparkSearch: This search engine organizes searches within a website, in groups of websites or a network. It is
free software written in the C language.

Daylife: It is a news aggregator. In simpler terms, it can be described as a news-browsing engine that provides access to
articles and images from a wide range of sources.

Dogpile: It is Infospace’s flagship metasearch site that serves as a metasearch engine fetching results from a number
of search engines like Google, Yahoo!, Ask.com and others.

EBI’s Search Engine: It searches over the biological databases and aims at providing free bioinformatics services and
training.

Egothor: It is an open source search engine that recognizes many of the common file formats and has a high capacity
crawler. Being written in Java, it enables cross-platform compatibility.

Eluta.ca: It is a job search engine for the job opportunities in Canada. It was established with an intent to make job
announcements that can be searched by every jobseeker in the country.

Entrez: It is a federated search engine that searches over a wide range of databases on topics such as health, genetic
inheritance, biomedicine and population study.

Eurekster: This company, based in New Zealand, builds social search engines called ‘swickis’.

Exalead: It provides thumbnail previews of the target pages and facilitates the refinement of search results. It is a
French search engine.

Ex.plode.us: It was initially designed as a social network and was re-launched in 2007 as a people search engine.

FindSounds: It is a website offering searches of more than 1,000,000 sounds and caters to a very large number of
users around the world.

FlixFlux: It is a torrent search site for films. Its distinctive features include marking sites that are
not relevant to a particular search and providing users with downloadable film trailers.

Funnelback: It is an enterprise search engine that is popularly used by the universities and government organizations
in Australia.

GenieKnows: It is a privately owned vertical search engine company that operates on online advertising and business-
to-business transactions.

Geoportal: It is a popular web mapping service that is available in the French language.

goo: Looked after by NTT, a telecom company in Japan, goo is an Internet search engine and a web portal that indexes
websites, which are in the Japanese language.

Google: Owned by Google, Inc., it is the most popular search engine of the present times. It receives several hundred
million search queries every day. Google Code Search is a free beta product from Google that enables users to
search open source code over the Internet. Google Maps is an extremely popular web mapping service provided by
Google. It offers a business locator and a route planner for several nations of the world. Google News is a news
aggregator that Google provides. Today, Google News is available in 19 languages and the development continues.

Gonzui: It is a source code search engine software that helps in searching source code written in different languages
like C, C++, Java, Python and others.

GoPubMed: It is a knowledge-based search engine based on searching biomedical texts and proves helpful for
biologists and medical professionals.

Grokker: It is a visual search engine that was developed by Groxis, a company based in San Francisco that deals with
technology.

Grub: It is an open source distributed search crawler platform.


Guruji.com: Founded by two graduates from the Indian Institute of Technology, Delhi, Guruji.com is an Indian Internet
search engine. It provides search results in the context of Indians.

Hakia: Riza Berkan, a nuclear scientist and Pentti Kouri, a venture capitalist from New York, developed this Internet
search engine.

Healia: It is a vertical search engine and also an online health community.

Home.co.uk: It operates in the United Kingdom and allows the users to search for properties and analyze house
prices.

HotBot: Previously a search engine that briefly offered free web page hosting, HotBot is now a front end for
third-party search engines.

HotPads: It is a rental housing and real estate search engine. It offers graphical maps to the users to ease their process
of searching for property.

IceRocket: This Internet search engine is specialized in searching blogs.

IFACnet: It is an enterprise search engine that was launched in 2006 to give accountants around the
world access to articles, practice guidance and management tools. It was created by IFAC, the
International Federation of Accountants, as a global, multilingual search engine for accountants.

Indeed.com: It is a job search engine that offers vertical job search, thus giving a new approach to the searching for job
opportunities.

Info.com: It is one of the leading Internet search engines and also one of the pay-per-click directories.

IsoHunt: Founded in 2003 by Gary Fung, a Canadian, IsoHunt is an extensive BitTorrent index and a very popular
BitTorrent search engine.

Ixquick: It is a metasearch engine that has provided more than 120 million searches since 2004. David Bodnick
founded it in 1998. An interesting feature of this Internet search engine is that it uses a ‘star system’ to rank its results,
wherein it assigns a star to each result retrieved by it.

Jexamples: It is a source code search engine for the Java language that indexes source code from several Java open
source projects.

Kartoo: It is a metasearch engine with an excellent graphical user interface. It has the unique ability of refining searches
through a single mouse click.

Kayak.com: It is a travel search engine website of the USA that helps the users search for hotels, airlines and cruises.

Koders: Koders is a search engine for open source code that indexes numerous open source code repositories.

Krugle: It helps programmers and software developers to search for open source code repositories and facilitates quick
sharing of code.

Leit.is: It is an Icelandic Internet search engine that can search terms on specific websites, titles of websites and for
websites linking to a particular website. This Internet search engine is available in both Icelandic and English languages.

LexisNexis: It is actually an archive that can be searched for newspapers, magazines and legal documents. It is the
largest collection of public records, unpublished opinions and business information. Lawyers, students and journalists
form a major sector of the LexisNexis users.

Lexxe: This Internet search engine uses natural language processing to answer user queries. It was established by Dr.
Hong Liang Qiao, an Australian technology expert.

Live Search: It allows the viewing of additional search results on the same web page and adjusting the amount of
information accompanying each search result. It facilitates the saving of searches. Live Search is the fourth most
popular search engine.

Live Search Maps: It is a web mapping service that comes packaged with the application services suite of Microsoft
Windows Live.

Live Search QnA: It is a question and answer service operating on lines similar to Google Answers.

Lucene: It is an open source information retrieval library that is widely known for its use as a search engine.

Lycos iQ: It is a human search site and is driven by a community. Functionally it is similar to Google Answers.

Magellan: It is a search engine that was brought out by Excite, a popular Internet portal and a recognized brand on the
Internet.

MagPortal.com: It is a search engine and directory that can be used to find online articles.

Mahalo.com: It is a web directory and can be described as a human search engine, as it involves human effort in the
form of editors writing search result pages.

Mamma.com: Herman Tumurcuoglu launched Mamma.com in 1996. In 2007, the company was renamed Copernic Inc.
Mamma.com was one of the earliest metasearch engines.

MapQuest: It is a map publisher and an online web mapping service.

MetaCrawler: It is a metasearch engine that combines results from most of the top search engines of the world. It also
offers searches on images, videos and news.

MetaLib: It is a federated search system that can simultaneously search a wide range of resources.

Miner.hu: It is a vertical search engine that searches for blogs and videos. It mainly handles the Hungarian content on
the Internet.

Mininova: It is a very large torrent listing site that serves as a directory as well as a search engine for torrent files.

mnoGoSearch: It is an open source search engine with the capacity of searching multilingual websites and its results
can be sorted by relevance. It is written in the C language.

Myriad Search: This metasearch engine founded by Aaron Wall combines search results from popular search engines
like Yahoo and Google.

Nadji.si: Interseek created this search engine and web portal. It is the most visited website in Slovenia.

Namazu: It is a free full-text search engine software written in Perl that can work as a web search engine.

Naver: Naver is a search portal popular in South Korea. It was launched in 1999 and was the first portal to have its own
search engine. Reportedly, it ranks fifth in the list of the popular search engines in the world.

Newslookup.com: It is a news search engine, which is the first one allowing the users to search for news on the basis
of the source region of the news organization. It includes the news from most of the prominent media sites and offers
unbiased results to user queries.

NextBio: It is an interactive search engine that enables searches related to bioscience. It helps the researchers retrieve
information related to their inventions and discoveries.

Nicado: Nicado Limited developed this search engine that is based on searching old email and old telephone numbers.
Registered users are allowed to search the Nicado database by means of a telephone number or an email address.

Northern Light Group: It specializes in research portals, text analysis and enterprise search.

Nutch: It is an open source Internet search engine based on Lucene Java.

Omgili: It is a vertical search engine that bases its results on user-generated content like forums and discussion groups.
It crawls over millions of online discussions.

Onkosh: It is a web portal for the Arabic web and facilitates web search, image search, file search as well as the search
of blogs and forums. Onkosh also offers directory searches and displays Arabic news.

OpenFTS: It is a free software database search engine that is based on the PostgreSQL database.

Open Text 4: In 1994, the Open Text Corporation started hosting Open Text 4, a search engine, which competes with
AltaVista.

PeekYou: Founded by Michael Hussey in April 2006, PeekYou is one of the very popular people search engines that
has indexed over 150 million people from the United States and Canada.

Pharos: This is a European multimedia search engine that is being developed by Fast Search and Transfer, a
Norwegian company that was founded in 1997.

Picsearch: This is actually a Swedish company that provides image, audio and video search over large websites. It also
develops a video-sharing platform.

Pixsta: It is an image search engine designed by Dr. Daniel Heesch. It is not based merely on tags. It analyses images
to identify their attributes like shape, color and texture.

Podscope: It was the first consumer search engine, which created an index of spoken words.

Properazzi.com: It is an online real estate search engine that displays the properties to be rented and those for sale. It
crawls the property listings from about 58 countries from Europe.

PubGene: It is a public search engine that provides information related to health and medicine.

Rambler: It is one of the biggest web portals of Russia and a Russian search engine. Apart from web search it also
offers email facilities, media and e-commerce applications.

Recruit.net: It is a job search engine that is available in English, Japanese and Chinese versions, making it an
international search engine.

Rediff.com: It is a web portal offering news, information, entertainment and shopping. It ranks fifth in the list of
popular Indian portals.

Rollyo: It is a Yahoo-powered search engine that was founded by Dave Pell and designed by Angus Durocher and Dan
Cederholm. It allows its registered users to create search engines that retrieve results from a set of websites defined by
the user.

SAPO: Its full name is Servidor de Apontadores Portugueses, and it is a brand and subsidiary company of the Portugal
Telecom group. It is an Internet service provider that has operated a search engine since 1995.

Sciencenet: It is an experimental search engine that is based on peer-to-peer technology. It was developed by Michael
Christen at the Karlsruhe Institute of Technology.

Search.ch: Rudolf Raber and Bernhard Seefeld established search.ch as a regional search engine. Today, it is the
second most popular search engine in Switzerland. In addition to search options, it also provides the users with
phonebook facilities.

Searchmedica: It is a series of medical search engines that is built by doctors for their own community.

SeeqPod: It indexes audio and video media that are playable over the Internet.

Sesam: It is a Scandinavian Internet search engine that is available in Norwegian as well as Swedish versions. In 2007,
it was among the largest websites of Norway.

SimplyHired.com: It is one of the major job search engines of the USA.

ShopWiki: Kevin P. Ryan, the former CEO of DoubleClick founded this Internet shopping search engine in 2005. It was
launched in 2006.

SideStep: It is a metasearch engine that combines the results from over 200 travel websites. It also searches hotels
and airlines worldwide.

Sogou: Launched in 2004, this Chinese search engine can search text, images and maps. It competes with Baidu.

Sohu: It is a China-based search engine company that offers a search engine along with other services like online
gaming.

Sphere: It is a blog search engine that organizes bloggers by topic and uses semantic matching to retrieve relevant
information.

Sphinx: It is a free software search engine that mainly indexes databases such as MySQL and PostgreSQL. It has high
searching and indexing speeds and supports distributed searching.

Spock: It is a website search engine that indexes people. 30% of its searches are people-related.

Spokeo: It is a social-network-based people search service. It is capable of searching a group of people at a time. It
should not be mistaken for a social networking site.

SWISH-E: The name stands for Simple Web Indexing System for Humans - Enhanced. It indexes document collections.

Technorati: This Internet search engine is used for searching blogs. It indexes as many as 112.8 million blogs and
about 250 million tagged writings.

TEK: The name is short for ‘Time Equals Knowledge’. It is an email-based search engine that requires the user to send
a query by email to its server, which then performs the search on the user's query. It was developed by the
Massachusetts Institute of Technology.

TheFind.com: It is known as a discovery shopping search engine and was launched in 2006. It uses an algorithm
based on relevance and popularity and focuses on lifestyle products.

Topix: The founders of the Open Directory Project created this news aggregator. It also has a community news-editing
platform and a forum allowing users to comment on news articles.

Torrentz: It is a metasearch engine that combines the searches from several different torrent sites.

Trexy: It is a metasearch engine that allows the users to trace back their search activities by implementing the concept
of searchable search trails.

Turbo10.com: It is an Internet metasearch engine that can search 10 databases simultaneously and accesses
information from about 800 online databases.

Uclue: It is a research service that allows the users to post queries in the form of questions. Most of the members that
form the Uclue team are former researchers of Google Answers.

Velocity: This is a part of the search suite developed by Vivisimo, an enterprise search software company in Pittsburgh,
USA.

Vtap.com: This is a metasearch engine that has been developed by Veveo, Inc, a startup company in Massachusetts.

Walla!: It is an Israeli Internet news portal as well as a search engine and an email service provider. It provides news
from Israel and from the rest of the world and is one of the most popular websites in Israel.

Wazap!: It is a vertical search engine that indexes gaming sites. It uses an internal algorithm to rank the indexed pages
according to their relevance. Apart from being a search engine, it is also a social networking site and a video game
database.

WebCrawler: It is a metasearch engine that combines the search results of some of the very popular Internet search
engines like Google, Yahoo, Ask.com and others. It was the first search engine that could provide a full text search.

WebMD: It is a medical information service that provides health information and offers medical newsletters and answers
to medicine- and disease-related questions.

Wikia Search: The striking feature of this web search engine is that it is an open source search engine that lets
users rate, modify and enhance the search results.

Wink: Wink is a community-based social search engine that offers the means to search for people across different social
networks. It differs from other search engines in the algorithms it uses to retrieve search results: it bases the
search results on user inputs.

YaCy: YaCy is a free and distributed search engine that is based on peer-to-peer technology.

Yahoo! Search: Starting as a web directory, it has evolved into a fully formed search interface. Standing in competition
with Google, Live Search and Ask.com, Yahoo has emerged as the second most popular search engine. Yahoo! News
is an Internet-based news aggregator that offers news of all the sectors including business, entertainment, science,
health and popular news. Yahoo! Maps is one of the very popular web mapping portals by Yahoo!. Yahoo! Answers is a
community-driven website that allows the users to post questions and find answers to them. Yahoo! HotJobs is an
online search engine providing tools and advice for job seekers and employers.

Yandex: It is a Russian search engine and the largest web portal of Russia. In terms of its number of users and the
breadth and depth of its search domain, it is the largest Russian search engine. Rambler is in close competition with
Yandex.

YouTube: It is a video sharing website that allows uploading and downloading of video files. As such, it attracts millions
of users in search of videos and is regarded as a treasure trove of old and new videos. YouTube search
facilities return millions of videos and user channels.

YouTorrent: It is a BitTorrent search engine that can combine the searches from different torrent search engines.

Zabasearch.com: It searches for information related to the residents of the United States such as their names,
addresses and phone numbers.

Zettair: It is a compact search engine that mainly aims at searching text and indexes HTML collections.

Zoominfo: It is a vertical search engine that can create summaries of the people and organizations it crawls. It
offers results for people-related searches.
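
Several entries in the list above (SideStep, Torrentz, Trexy, Turbo10.com, Vtap.com, WebCrawler, YouTorrent) are described as metasearch engines that combine results from other sources. The sketch below shows one simple way such combining could be done: each result earns a score of 1 / (k + rank) in every list where it appears, and the scores are summed. This reciprocal-rank scoring, the function name merge_results, the constant k = 60 and the example.com-style result lists are all assumptions made for illustration; none of the listed engines is known to work this way.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Merge ranked URL lists from several engines into one ranking.

    Each URL scores 1 / (k + rank) in every list it appears in, and the
    scores are summed, so a URL ranked well by many engines rises to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        seen = set()
        for rank, url in enumerate(results, start=1):
            if url in seen:  # count a URL only once per engine
                continue
            seen.add(url)
            scores[url] += 1.0 / (k + rank)
    # Highest combined score first; ties are broken alphabetically
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical ranked results from three underlying engines
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["b.example", "a.example", "d.example"]
engine_c = ["b.example", "c.example"]

merged = merge_results([engine_a, engine_b, engine_c])
print(merged)
```

A result returned by all three engines (b.example here) ends up ahead of one returned by only a single engine, which is the basic benefit a metasearch engine aims for.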

Name of search engine   Algorithm      Year   Developer

Google                  PageRank       1997   Larry Page, Sergey Brin
Yahoo                   Ranking        1994   Jerry Yang, David Filo
Hakia                   Semantic       2004   Riza Berkan
Sensebot                Semantic       2005   Microsoft
Powerset                Semantic
Deepdyve                Semantic
Cognition               Semantic
Exalead                 Semantic       2000
Metacrawler             Christofides   1994   Erik Selberg, Prof. Oren Etzioni
BitTorrent              Choking        2005   BitTorrent
Ask                     Dekker         1996   Garrett Gruener, David Warthen
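
The table lists PageRank as the algorithm associated with Google. As a rough illustration of the basic idea, the sketch below computes ranks by repeated iteration over a tiny link graph: each page passes its rank in equal shares to the pages it links to. The damping factor of 0.85, the fixed 50 iterations, the function name pagerank and the four-page graph are assumptions chosen for the example; this is a textbook-style sketch, not Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a graph given as {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of its links
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank equally among the pages it links to
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: A links to B and C, B to C, C to A, D to C
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this small graph page C collects links from three other pages and ends up with the highest rank, while page D, which nothing links to, keeps only the baseline share.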

WEB DEVELOPMENT & INTERNET (LAB)


Experiment number:-…………………………………………………………………………….

Name of the experiment:-……………………………………………………………………….

Date:-…..……………………………………………………………………………………………

Submitted by:-

EHTESAMUDDIN KHAN

ROLL NO:- 05

COMPUTER ENGG. (FINAL YEAR)

Computer Engineering Section, University Polytechnic


Faculty of Engineering & Technology
Jamia Millia Islamia
