|
|
| Ranking |
|
| Listing: |
- Authoritative Sources
in a Hyperlinked Environment
HITS is a link-structure analysis algorithm
which ranks pages by "authorities" (pages
which have many incoming links and provide
the best source of information on a given
topic) and "hubs" (pages which have many outgoing
links and provide useful lists of possibly
relevant pages). Ranking is performed at query
time. [PDF format]
- The
PageRank Citation Ranking: Bringing Order
to the Web
The first Stanford paper about PageRank. It is
a static ranking, performed at indexing time,
which interprets a link from page A to page
B as a vote, by page A, for page B. The Web is
seen as a directed graph, and votes propagate
recursively from node to node. Used by Google.
- Adaptive
On-Line Page Importance Computation
A good explanation of the convergence of
various algorithms. This paper also describes
an adaptive, on-line algorithm for computing
page importance. It can be used for focused
crawling as well as for search engine ranking.
- The
Clever Project - The CLEVER search engine
incorporates several algorithms that make
use of hyperlink structure for discovering
information on the Web. It is an extension
of the HITS method.
- DiscoWeb:
Discovering Web Communities Via Link Analysis
This paper describes a prototype system, later
known as the Teoma Search Engine. It performs
a link analysis, loosely based on Kleinberg's
method, computed at query time.
- The
EigenTrust Algorithm for Reputation Management
in P2P Networks
An eigenvector-based algorithm for calculating
reputation in P2P networks and isolating malicious
peers. It is related to the PageRank
algorithm.
- Exploiting
the Block Structure of the Web for Computing
PageRank
A hierarchical approach for computing PageRank.
The local PageRanks of the pages on each host
are computed independently and then used to
compute the global PageRank of the Web graph.
- Extrapolation
Methods for Accelerating PageRank Computations
A paper about the computation of PageRank
using the standard Power Method and the new
Quadratic Extrapolation which computes the
principal eigenvector of the Markov matrix
representing the Web link graph, with a
speedup of about 50-300%.
- Finding
Authorities and Hubs From Link Structures
on the World Wide Web
A survey on PageRank, HITS and SALSA. It also
describes two Bayesian statistical algorithms
for ranking of hyperlinked documents and the
concepts of monotonicity and locality, as
well as various concepts of distance and similarity
between ranking algorithms.
- Improved
Algorithms for Topic Distillation in Hyperlinked
Environments
Addresses the problem of finding quality
documents related to the topic of a typical
user query. It uses a variation of HITS. [PS format]
- Improvement
of HITS-based Algorithms on Web Documents
It proposes a new weighted HITS-based method
that assigns appropriate weights to in-links
of root documents and combines content analysis
with HITS-based algorithms.
- Improvement
to Clever Algorithm
An improvement to Kleinberg's algorithm. [PDF
format]
- The
Intelligent Surfer: Probabilistic Combination
of Link and Content Information in PageRank
This method uses query-dependent importance
scores and a probabilistic approach to improve
upon PageRank. It pre-computes importance
scores offline for every possible text query.
[PDF format]
- Larry
Page Describes PageRank
Slides that introduce citation importance
ranking, by Larry Page, Google's co-founder.
- Link
Analysis, Eigenvectors, and Stability
Do HITS and PageRank (and some variations)
give stable rankings under small perturbations
to the linkage patterns? [PS format]
- Link
Analysis: Hubs and Authorities on the World
Wide Web
A survey on Kleinberg's HITS. [PS format]
- Link
Analysis in Web Information Retrieval
A survey of query-independent and query-dependent
connectivity-based ranking. [PS format]
- The
Missing Link - A Probabilistic Model of Document
Content and Hypertext Connectivity
This paper describes a joint probabilistic
model for modeling the contents and inter-connectivity
of document collections such as sets of web
pages or research paper archives. [PDF format]
- PageRank:
A Circuital Analysis
It shows some theoretical results for understanding
the distribution of the score in the Web according
to PageRank. Seven golden rules for building
good pages are presented. [PDF format]
- PageRank
as a Random Walk
A general framework for measuring the quality
of an index, with background on PageRank
and random walks. Imagine a Web
surfer who wanders the Web. At each step,
he/she either jumps to a page on the Web chosen
uniformly at random, or follows a link chosen
from those on the current page.
- PageRank
Calculation Techniques
Describes efficient techniques for computing
PageRank.
- PageRank
Calculation with Lossy Encoding
Lossy encoding for large scale PageRank calculation.
- PageRank
Computation Methods
A poster paper by the Stanford DB group which
describes iterative methods for calculating
PageRank. [PDF format]
- PageRank Explained
Some useful tips and tricks about Google's
PageRank and other content-based rankings. [PDF
format]
- PageRank,
HITS and a Unified Framework for Link Analysis
It generalizes and combines PageRank and HITS
into a unified framework. [PS format]
- PageRank
Uncovered
A complete and updated analysis of PageRank
with several calculation examples. [PDF format]
- PageRank
U.S. Patent 6,285,999
Lawrence Page's PageRank Patent.
- PageRank
Used to Characterize Web Structure
PageRank values on the Web follow a power
law. A high in-degree of a node does not
imply a high PageRank, and vice versa. [PDF
format]
- Probabilistic
Combination of Content and Links
It introduces a probabilistic model that integrates
link topology (used to identify important
pages), anchor text (used to augment the text
of cited pages), and activation (spread to
linked pages). Experiments are on MSN Directory.
[PDF format]
- SALSA:
The Stochastic Approach for Link-Structure
Analysis
A focused search algorithm (SALSA) based on
Markov chains. It starts with a query on a
broad topic, discards useless links, and then
weights the remaining terms. A stochastic
crawl is used to discover the authorities
on this topic. [PS format]
- Scaling
Personalized Web Search
Link popularity algorithms biased according
to a user-specified set of interesting
pages.
- The
Second Eigenvalue of the Google Matrix
A mathematical paper about the convergence
of methods used for solving the PageRank Matrix.
- A
Survey of Eigenvector Methods of Web Information
Retrieval
A survey presenting HITS, PageRank and SALSA.
Emphasis is on eigenvector and eigenvalue
analysis.
- Survey
on Google's PageRank
Information on the algorithm, how to increase
PageRank, what diminishes it and how to distribute
PageRank within a website.
- Topic-Sensitive
PageRank
Integrates ODP data into the PageRank calculation
to perform query-time probabilistic ranking.
- Towards
Exploiting Link Evolution
It describes how to compute PageRank
incrementally when the Web graph's link topology changes.
[PS format]
- Web
Page Scoring Systems for Horizontal and Vertical
Search
"Random Surfer" model extension. At each step
of traversal of the Web graph, the surfer
can jump to a random node or follow a hyperlink
or follow a back-link (a hyperlink in the
inverse direction) or stay in the same node.
- Web-Trec
9 and Link Popularity
About the use of link popularity on the Web
Track 9 datasets. [PDF format]
- Web-Trec
8 and PageRank
About the use of PageRank on the Web Track 8
"large" and "small" datasets. [PDF format]
- What
Can You Do with a Web in Your Pocket?
One of the first descriptions of Stanford's
WebBase and Google's PageRank. [PS format]
- What
is this Page Known for? Computing Web Page
Reputations
A generalization of PageRank and of Hubs and
Authorities based on the topic of Web pages.
It defines a model where a surfer can move forward
(following an outgoing link) and backward
(following an incoming link in the inverse
direction). [PS format]
- The
World’s Largest Matrix Computation
A concise description of PageRank seen from
a mathematical point of view.
|
|
|
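The two ideas that recur throughout the listing above, PageRank as a random surfer who either follows a link or jumps to a page chosen uniformly at random, and the hub/authority scores of HITS, can be illustrated with a small sketch. The code below is not taken from any of the listed papers: the function names, the damping value of 0.85, the iteration counts, the uniform handling of pages without out-links, and the toy graph are all assumptions chosen for illustration.

```python
def all_pages(links):
    """Collect every page that appears as a source or a target."""
    return sorted(set(links) | {q for targets in links.values() for q in targets})


def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank on a dict {page: [linked pages]}.

    With probability `damping` the surfer follows a link on the current
    page; otherwise the surfer jumps to a page chosen uniformly at random.
    """
    pages = all_pages(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Assumption: rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        for p in pages:
            targets = links.get(p, [])
            for q in targets:
                # Each link passes on an equal share of the source's rank.
                new_rank[q] += damping * rank[p] / len(targets)
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


def hits(links, iterations=50):
    """Iterative hub and authority scores on the same graph format."""
    pages = all_pages(links)
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for p in pages:
            for q in links.get(p, []):
                auth[q] += hub[p]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth


# Hypothetical four-page graph, used only to exercise the functions.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
hubs, authorities = hits(toy_links)
print(max(ranks, key=ranks.get), max(authorities, key=authorities.get))
```

In this toy graph page D has no incoming links, so its PageRank comes only from the random jump. A real implementation would run these iterations on sparse matrices rather than Python dictionaries.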