Monday, October 5, 2009

PageRank: The Multibillion Idea

PageRank is the idea developed by Larry Page (hence the name) together with Sergey Brin, the founders of Google. They came up with it while working on a project at Stanford. So what was the idea that earned them so much money?
PageRank measures the quality of a page. Based on the "random surfer model", PageRank is a probability distribution that represents the likelihood that a person randomly clicking on links will arrive at a particular page.
Before PageRank was introduced, search engines answered queries using the vector space model, which was quite good at content matching. But there was a felt need to return good pages, not just pages whose content matched the query. So PageRank brought link analysis into the picture. We all know what a hyperlink is: it connects one page to another.
A good way to understand PageRank is to think of the web as a directed graph. The pages are the nodes, and a hyperlink from one page to another is a directed edge from the linking page to the page being linked.
For now, we can say that the more hyperlinks from other pages point to a page, the better it is. The links pointing to a page are called its backlinks. Accepting this, we can say that hyperlinks are like votes in a democracy: if you vote for someone (by linking your page to theirs), you increase their page's quality. But unlike in a democracy, you can vote more than once, on the condition that the more votes you cast, the less each vote is worth.
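To make the graph and voting picture concrete, here is a minimal sketch in Python. The four pages A to D and their links are made up purely for illustration, and the helper name backlinks is hypothetical. It shows the web as a dictionary of outgoing links, collects the backlinks of each page, and computes how much a single vote from each page is worth.

```python
# Hypothetical four-page web (an assumption for illustration only):
# each key is a page, each value is the list of pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def backlinks(links):
    """Return, for every page, the list of pages that link to it."""
    incoming = {page: [] for page in links}
    for source, targets in links.items():
        for target in targets:
            incoming.setdefault(target, []).append(source)
    return incoming

incoming = backlinks(links)

# A page splits its vote evenly among the pages it links to,
# so the more links it casts, the less each one is worth.
vote_value = {page: 1.0 / len(targets) for page, targets in links.items()}
```

Here page C has three backlinks (from A, B and D), while D has none. Page A links to two pages, so each of its votes counts for half as much as the single vote cast by B.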
So we say
PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
where
T1 ... Tn are the pages that link to page A,
PR(T1) is the PageRank of page T1,
C(T1) is the number of outgoing links on page T1,
d is the damping factor, which scales down the total vote so that it does not have too much influence. The damping factor is usually taken to be 0.85.
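Since the PageRank of a page depends on the PageRank of the pages linking to it, one simple way to get actual numbers is to start from a guess and apply the formula over and over until the values settle. The sketch below does that. The function name pagerank, the starting value of 1.0, the fixed 50 iterations and the four-page example web are all assumptions made for illustration, not how any real search engine is implemented. It also assumes every page has at least one outgoing link.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T)) over backlinks T.

    links maps each page to the list of pages it links to.
    Assumes every page has at least one outgoing link.
    """
    # Starting guess (an assumption): every page begins with rank 1.0.
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # Sum the votes from every page T that links to this page,
            # each vote being PR(T) divided by T's outgoing link count C(T).
            votes = sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * votes
        pr = new_pr
    return pr

# Hypothetical four-page web, made up for illustration.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(links)
for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(rank, 3))
```

In this example page C, which has the most backlinks, ends up ranked highest, and page D, which nobody links to, is left with only the (1-d) term.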
