Yahoo! has announced a Learning to Rank challenge as part of the Learning to Rank Workshop at ICML 2010.
They are releasing (to participants) two large real-world datasets. The first dataset has:
29,921 queries
744,692 URLs
519 features
The second dataset consists of:
6,330 queries
172,870 URLs
596 features
In total there are 700 different features.
There are two tracks:
1) Standard LTR track
2) Transfer-learning track
The queries, URLs and feature descriptions are not disclosed; only the feature values are released.
Learning to Rank tasks usually require large datasets: the more data there is, the better the chances of making a good ranking decision. That is one of the main reasons for Google's success, as it gets a large number of hits. Accounting for all the features could be a challenge. Submissions are already open and more than 150 have been made. Multiple submissions are allowed.
The format of each file is the same as the one used by SVMlight.
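For readers who have not seen that format, here is a minimal sketch of how one line could be parsed in Python. It assumes the usual SVMlight ranking layout of a relevance label, a `qid:` token, and then `feature:value` pairs. The function name `parse_svmlight_line` and the sample line are made up for illustration, and the actual challenge files may differ in detail.

```python
from typing import Dict, Optional, Tuple


def parse_svmlight_line(line: str) -> Tuple[float, Optional[str], Dict[int, float]]:
    """Parse one SVMlight-style line into (label, qid, features).

    Assumed layout: <label> qid:<id> <feature>:<value> ... # optional comment
    """
    # Drop an optional trailing comment introduced by '#'.
    body = line.split("#", 1)[0].strip()
    tokens = body.split()

    # The first token is the relevance label.
    label = float(tokens[0])

    qid: Optional[str] = None
    features: Dict[int, float] = {}
    for token in tokens[1:]:
        key, value = token.split(":", 1)
        if key == "qid":
            qid = value  # query identifier, kept as a string
        else:
            features[int(key)] = float(value)  # sparse: only listed features

    return label, qid, features


# Made-up sample line, not taken from the challenge data.
example = "2 qid:1 10:0.89 11:0.75 45:0.13"
label, qid, features = parse_svmlight_line(example)
```

Features that are not listed on a line are treated as absent (usually read as zero), which helps keep files with several hundred features compact. Blank lines would need to be skipped by the caller before parsing.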
Relevant papers can be found at:
1) ICML has provided a link to 100 papers, although I am not able to access them yet
2) CIKM 2008 papers and lectures
3) AIRWeb papers