Hybrid Tag Recommendation in Collaborative Tagging Systems
This web site is an extension of work presented in a series of publications:
Learning in Efficient Tag Recommendation.
Abstract: The objective of a tag recommendation system is to propose a set of tags for a resource to ease the tagging process done manually by a user. Tag recommendation is an interesting and well defined research problem. However, while solving it, it is easy to forget about its practical implications. We discuss the practical aspects of tag recommendation and propose a system that successfully addresses the problem of learning in tag recommendation, without sacrificing efficiency. Learning is realized in two aspects: adaptation to newly added posts and parameter tuning. The content of each added post is used to update the resource and user profiles as well as associations between tags. Parameter tuning allows the system to automatically adjust the way tag sources (e.g., content related tags or user profile tags) are combined to match the characteristics of a specific collaborative tagging system. The evaluation on data from three collaborative tagging systems confirmed the importance of both learning methods. Finally, an architecture based on text indexing makes the system efficient enough to serve in real time collaborative tagging systems with number of posts counted in millions, given limited computing resources.
The paper can be found in the proceedings of the 4th ACM Conference on Recommender Systems (RecSys '10) or here.
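The adaptation step described in the abstract (each added post updates the resource profile, the user profile and the associations between tags) can be illustrated with a minimal sketch. This is not the system's code: the `Profiles` class, its attribute names and the plain in-memory counters are assumptions made for illustration, whereas the abstract says the actual system is built on text indexing.

```python
from collections import Counter, defaultdict
from itertools import combinations


class Profiles:
    """Hypothetical in-memory stand-in for profiles and tag associations."""

    def __init__(self):
        self.resource_tags = defaultdict(Counter)  # resource id -> tag counts
        self.user_tags = defaultdict(Counter)      # user id -> tag counts
        self.co_occurrence = defaultdict(Counter)  # tag -> tags seen with it

    def add_post(self, user, resource, tags):
        """Fold one new post into all three structures incrementally."""
        unique = sorted(set(tags))
        self.resource_tags[resource].update(unique)
        self.user_tags[user].update(unique)
        # Count every pair of tags that appears together in this post.
        for a, b in combinations(unique, 2):
            self.co_occurrence[a][b] += 1
            self.co_occurrence[b][a] += 1


profiles = Profiles()
profiles.add_post("alice", "http://example.org/a", ["python", "web"])
profiles.add_post("bob", "http://example.org/a", ["python", "tutorial"])
```

Because each post only increments a few counters, the cost of an update does not depend on how many posts were seen before, which is the property that makes this kind of learning compatible with real-time use.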
Tag Sources for Recommendation in Collaborative Tagging Systems.
Abstract: Collaborative tagging systems are social data repositories, in which users manage resources using descriptive keywords (tags). An important element of collaborative tagging systems is the tag recommender, which proposes a set of tags to each newly posted resource. In this paper we discuss the potential role of three tag sources: resource content as well as resource and user profiles in the tag recommendation system. Our system compiles a set of resource specific tags, which includes tags related to the title and tags previously used to describe the same resource (resource profile). These tags are checked against user profile tags - a rich, but imprecise source of information about user interests. The result is a set of tags related both to the resource and user. Depending on the character of processed posts this set can be an extension of the common tag recommendation sources, namely resource title and resource profile. The system was submitted to ECML PKDD Discovery Challenge 2009 for "content-based" and "graph-based" recommendation tasks, in which it took the first and third place respectively.
The paper can be found in the proceedings of ECML/PKDD Discovery Challenge 2009 workshop or here.
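The two-stage idea in the abstract above (compile resource-specific tags from the title and the resource profile, then check them against the user profile) might look roughly like the following sketch. The function name `recommend`, the tag-to-score dictionaries, the use of `max` to merge scores and the `boost` factor are all assumptions chosen for illustration; the scoring actually used in the paper is not stated on this page.

```python
def recommend(title_tags, resource_profile, user_profile, k=5, boost=2.0):
    """Rank resource-specific tags, favouring those the user has used before.

    Each argument maps tag -> score; `boost` is a hypothetical factor.
    """
    # Stage 1: compile resource-specific tags (title + resource profile).
    resource_specific = dict(title_tags)
    for tag, score in resource_profile.items():
        resource_specific[tag] = max(score, resource_specific.get(tag, 0.0))

    # Stage 2: check them against the user profile. User-profile tags that
    # are unrelated to the resource never enter the candidate set.
    ranked = {
        tag: score * (boost if tag in user_profile else 1.0)
        for tag, score in resource_specific.items()
    }
    return sorted(ranked, key=lambda t: (-ranked[t], t))[:k]


title_tags = {"python": 0.6, "guide": 0.5}
resource_profile = {"python": 0.8, "programming": 0.7}
user_profile = {"programming": 0.9, "cooking": 0.4}
top = recommend(title_tags, resource_profile, user_profile, k=2)
```

In this toy input, "cooking" is in the user profile but is never recommended, because the user profile only re-ranks tags that are already tied to the resource.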
Sample experiment (60 MB) - To show how efficient and simple to use the system is, I combined it with a sample dataset to create a ready-to-use package. Download the file and unpack it anywhere on your hard drive. The package contains two scripts. The first script runs the full process: it indexes the posts, trains the system parameters and recommends tags. The second script produces the evaluation results. The requirements are a Java Runtime Environment, 1 GB of memory (RAM) and around 0.5 GB of hard drive space. Please refer to the readme file included in the package for more details.
Results - The average precision, recall and F1 score for the three datasets used to evaluate the tag recommendation system. The files contain scores calculated for the top k recommended tags (k from 1 to 10). Each step of the recommendation process has a corresponding result file.
Source code - The first complete version of the system source code is now available. It is distributed under the GNU General Public License. I'm constantly working on the clarity of the code and comments, so new versions should be posted soon. Please contact me if you have any questions - the address is available on the main page.
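For readers who want to reproduce this kind of score, here is a minimal sketch of precision, recall and F1 for the top k recommended tags of a single post. The function name and the choice to divide precision by the number of tags actually returned (rather than by k) are assumptions; the evaluation script in the package may differ, and the result files report averages over posts rather than per-post values.

```python
def precision_recall_f1_at_k(recommended, actual, k):
    """Score the first k recommended tags against the tags the user chose."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(actual))
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


recommended = ["python", "web", "tutorial", "java"]  # ranked system output
actual = {"python", "tutorial", "code"}              # tags the user applied

# One row per k, mirroring the per-k layout described for the result files.
scores = {k: precision_recall_f1_at_k(recommended, actual, k) for k in range(1, 5)}
```

With this toy input, k = 2 gives one hit out of two recommended tags and out of three actual tags, so precision is 0.5, recall is 1/3 and F1 is 0.4.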
Old version - This is the website of the system that was submitted to the ECML/PKDD Discovery Challenge 2009.
Last update: 2010.10.06