Ranking Model Adaptation For
Domain-Specific Search
ABSTRACT
With the explosive emergence of vertical
search domains, applying a broad-based ranking model directly to different domains
is no longer desirable due to domain differences, while building a unique
ranking model for each domain is both laborious in terms of labeling data and
time-consuming in terms of training models. In this paper, we address these
difficulties by proposing a regularization-based algorithm called ranking
adaptation SVM (RA-SVM), through which we can adapt an existing ranking model
to a new domain, so that the amount of labeled data and the training cost are
reduced while the performance is still guaranteed. Our algorithm only requires
the predictions from the existing ranking models, rather than their internal
representations or the data from auxiliary domains. In addition, we assume that
documents similar in the domain-specific feature space should have consistent
rankings, and add some constraints to control the margin and slack variables of
RA-SVM adaptively. Finally, a ranking adaptability measurement is proposed
to quantitatively estimate whether an existing ranking model can be adapted to a new
domain. Experiments performed over LETOR and two large-scale datasets crawled
from a commercial search engine demonstrate the applicability of the proposed
ranking adaptation algorithms and the ranking adaptability measurement.
EXISTING SYSTEM
The existing broad-based ranking model
provides a lot of common information for ranking documents, so only a few training
samples need to be labeled in the new domain. From a probabilistic
perspective, the broad-based ranking model provides prior knowledge, so that
only a small number of labeled samples is sufficient for the target-domain
ranking model to achieve the same confidence. Hence, to reduce the cost for new
verticals, how to adapt the auxiliary ranking models to the new target domain,
and how to make full use of its domain-specific features, becomes a pivotal
problem for building effective domain-specific ranking models.
PROPOSED SYSTEM
The proposed system focuses on three problems: first, whether we can adapt
ranking models learned for the existing broad-based search or for some verticals
to a new domain, so that the amount of labeled data in the target domain is
reduced while the performance requirement is still guaranteed; second, how to adapt the
ranking model effectively and efficiently; and third, how to utilize domain-specific
features to further boost the model adaptation. The first problem is solved by
the proposed ranking adaptability measure, which quantitatively
estimates whether an existing ranking model can be adapted to the new domain,
and predicts the potential performance of the adaptation. We address the
second problem within the regularization framework, and a ranking adaptation SVM
(RA-SVM) algorithm is proposed. Our algorithm is a black-box ranking model adaptation,
which needs only the predictions from the existing ranking model, rather than
the internal representation of the model itself or the data from the auxiliary
domains. With the black-box adaptation property, we achieve not only
flexibility but also efficiency. To resolve the third problem, we assume
that documents similar in their domain-specific feature space should have
consistent rankings.
Advantages:
1. Model adaptation.
2. Reducing the labeling cost.
3. Reducing the computational cost.
MODULE DESCRIPTION:
Number of Modules
After careful analysis, the system has been
identified to have the following modules:
1. Ranking Adaptation Module.
2. Explore Ranking Adaptability Module.
3. Ranking Adaptation with Domain-Specific Search Module.
4. Ranking Support Vector Machine Module.
1. Ranking Adaptation Module:
Ranking adaptation is closely related to
classifier adaptation, which has shown its effectiveness for many learning
problems. Ranking adaptation is comparatively more challenging. Unlike classifier
adaptation, which mainly deals with binary targets, ranking adaptation aims
to adapt a model that is used to predict the rankings for a collection of
documents. In ranking, the relevance levels of different domains are sometimes
different and need to be aligned. This module addresses whether we can adapt ranking
models learned for the existing broad-based search or for some verticals to a new
domain, so that the amount of labeled data in the target domain is reduced while the
performance requirement is still guaranteed; how to adapt the ranking model
effectively and efficiently; and how to utilize domain-specific features to further
boost the model adaptation.
2. Explore Ranking Adaptability Module:
Ranking adaptability is measured
by investigating the correlation between two ranking lists of a labeled query in
the target domain, i.e., the one predicted by fa and the ground-truth one
labeled by human judges. Intuitively, if the two ranking lists have a high
positive correlation, the auxiliary ranking model fa agrees with the distribution
of the corresponding labeled data, so we can believe that it possesses
high ranking adaptability towards the target domain, and vice versa. This is because
the labeled queries are randomly sampled from the target domain for
the model adaptation, and can therefore reflect the distribution of the data in the
target domain.
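The correlation described above can be sketched in Java. This is only an
illustration: the source does not say which correlation statistic is used, so
Kendall's tau is assumed here as one common choice, and the class and method
names (RankingAdaptabilitySketch, kendallTau, meanAdaptability) are
hypothetical. The inputs are the scores predicted by fa and the relevance
labels given by human judges for the documents of each labeled query.

```java
final class RankingAdaptabilitySketch {

    /**
     * Kendall's tau-a (an assumed choice of statistic) between the scores of the
     * auxiliary model fa and the human relevance labels for one query. Pairs that
     * are tied in either list count as neither concordant nor discordant.
     */
    static double kendallTau(double[] predictedScores, int[] relevanceLabels) {
        int n = predictedScores.length;
        if (n != relevanceLabels.length) {
            throw new IllegalArgumentException("scores and labels must have the same length");
        }
        if (n < 2) {
            return 0.0; // no document pairs to compare
        }
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                int scoreOrder = Integer.signum(Double.compare(predictedScores[i], predictedScores[j]));
                int labelOrder = Integer.signum(Integer.compare(relevanceLabels[i], relevanceLabels[j]));
                int agreement = scoreOrder * labelOrder;
                if (agreement > 0) {
                    concordant++;
                } else if (agreement < 0) {
                    discordant++;
                }
            }
        }
        long totalPairs = (long) n * (n - 1) / 2;
        return (double) (concordant - discordant) / totalPairs;
    }

    /** Averages the per-query correlation over all labeled target-domain queries. */
    static double meanAdaptability(double[][] scoresPerQuery, int[][] labelsPerQuery) {
        if (scoresPerQuery.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int q = 0; q < scoresPerQuery.length; q++) {
            sum += kendallTau(scoresPerQuery[q], labelsPerQuery[q]);
        }
        return sum / scoresPerQuery.length;
    }

    public static void main(String[] args) {
        // Made-up scores and labels for one query, for illustration only.
        double[] faScores = {2.5, 1.0, 0.3, 1.7};
        int[] labels = {2, 1, 0, 1};
        System.out.println("correlation = " + kendallTau(faScores, labels));
    }
}
```

A value close to 1 suggests that fa already orders the target-domain documents
much as the human judges do, while a value close to -1 suggests the opposite.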
3. Ranking Adaptation with Domain-Specific Search Module:
Data from different domains are also characterized
by some domain-specific features, e.g., when we adapt the ranking model learned
from the Web page search domain to the image search domain, the image content
can provide additional information to facilitate the text-based ranking model
adaptation. In this module, we discuss how to utilize these domain-specific features,
which are usually difficult to translate to textual representations directly,
to further boost the performance of the proposed RA-SVM. The basic idea of our
method is to assume that documents with similar domain-specific features should
be assigned similar ranking predictions.
We call this assumption the consistency assumption, which implies that
a robust textual ranking function should produce relevance predictions that are
consistent with the domain-specific features.
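One way to make the consistency assumption concrete is a similarity-weighted
disagreement score, sketched below in Java. This is an illustration only and
not the way RA-SVM itself uses the assumption (the abstract states that RA-SVM
controls its margin and slack variables). The Gaussian similarity, the
bandwidth parameter, and the names ConsistencySketch, similarity, and
inconsistency are all assumptions made for this example.

```java
final class ConsistencySketch {

    /** Gaussian (RBF) similarity between two domain-specific feature vectors. */
    static double similarity(double[] a, double[] b, double bandwidth) {
        double squaredDistance = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            squaredDistance += diff * diff;
        }
        return Math.exp(-squaredDistance / (2.0 * bandwidth * bandwidth));
    }

    /**
     * Sum over document pairs of similarity(i, j) * (score_i - score_j)^2.
     * Lower values mean that documents with similar domain-specific features
     * received closer ranking scores.
     */
    static double inconsistency(double[][] domainFeatures, double[] scores, double bandwidth) {
        double total = 0.0;
        for (int i = 0; i < scores.length; i++) {
            for (int j = i + 1; j < scores.length; j++) {
                double gap = scores[i] - scores[j];
                total += similarity(domainFeatures[i], domainFeatures[j], bandwidth) * gap * gap;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up image features: the first two documents look alike, the third does not.
        double[][] imageFeatures = {{0.9, 0.1}, {0.8, 0.2}, {0.1, 0.9}};
        double[] textScores = {1.5, 0.2, 0.1};
        System.out.println("inconsistency = " + inconsistency(imageFeatures, textScores, 0.5));
    }
}
```

In this made-up example the first two documents have similar image features
but very different text-based scores, so they dominate the total; a ranking
function that respects the consistency assumption would keep this value small.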
4. Ranking Support Vector Machines Module:
Ranking Support Vector Machines (Ranking
SVM), one of the most effective learning-to-rank algorithms, is
employed here as the basis of our proposed algorithm. The proposed RA-SVM does not need
the labeled training samples from the auxiliary domain, but only its ranking
model fa. Such a method is more advantageous than data-based adaptation,
because the training data from the auxiliary domain may be missing or unavailable
because of copyright protection or privacy issues, whereas the ranking model is
comparatively easier to obtain and access.
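The black-box idea can be illustrated with a heavily simplified Java sketch.
It is not the authors' RA-SVM solver: it assumes a linear correction trained
by subgradient descent on a pairwise hinge loss, with the final score taken as
delta * fa(x) + w . x. The parameters delta, lambda, and learningRate, and the
names BlackBoxAdaptationSketch, train, and score, are hypothetical. For
brevity, all documents are assumed to belong to a single query. Only the
predictions of fa are passed in, never its internals or its training data.

```java
final class BlackBoxAdaptationSketch {

    /** Adapted score: weighted auxiliary prediction plus a learned linear correction. */
    static double score(double[] w, double[] x, double auxScore, double delta) {
        double s = delta * auxScore;
        for (int k = 0; k < w.length; k++) {
            s += w[k] * x[k];
        }
        return s;
    }

    /**
     * Learns w so that documents with higher labels get higher adapted scores.
     * auxScores holds fa's predictions for the same documents (black-box input).
     */
    static double[] train(double[][] features, double[] auxScores, int[] labels,
                          double delta, double lambda, double learningRate, int epochs) {
        int dim = features[0].length;
        double[] w = new double[dim];
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int i = 0; i < labels.length; i++) {
                for (int j = 0; j < labels.length; j++) {
                    if (labels[i] <= labels[j]) {
                        continue; // keep only pairs where document i should outrank document j
                    }
                    double margin = score(w, features[i], auxScores[i], delta)
                            - score(w, features[j], auxScores[j], delta);
                    for (int k = 0; k < dim; k++) {
                        double gradient = lambda * w[k]; // L2 regularizer on the correction
                        if (margin < 1.0) {
                            // Hinge loss is active: push the pair further apart.
                            gradient -= features[i][k] - features[j][k];
                        }
                        w[k] -= learningRate * gradient;
                    }
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // Made-up target-domain data: fa wrongly prefers the second document.
        double[][] features = {{1.0}, {0.0}};
        double[] faScores = {0.0, 2.0};
        int[] labels = {1, 0};
        double[] w = train(features, faScores, labels, 0.5, 0.01, 0.1, 100);
        System.out.println("relevant doc score   = " + score(w, features[0], faScores[0], 0.5));
        System.out.println("irrelevant doc score = " + score(w, features[1], faScores[1], 0.5));
    }
}
```

In this sketch a larger delta keeps the adapted model closer to fa, which is
one simple way to express the trade-off between trusting the auxiliary model
and fitting the few labeled target-domain samples.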
SOFTWARE REQUIREMENTS:
Operating System : Windows
Technology : Java and J2EE
Web Technologies : HTML, JavaScript, CSS
IDE : MyEclipse
Web Server : Tomcat
Tool kit : Android Phone
Database : MySQL
Java Version : J2SDK1.5
HARDWARE REQUIREMENTS:
Hardware : Pentium
Speed : 1.1 GHz
RAM : 1GB
Hard Disk : 20 GB
Floppy Drive : 1.44 MB
Keyboard : Standard Windows Keyboard
Mouse : Two or Three Button Mouse
Monitor : SVGA