Monday 20 August 2012

A social inverted index for social-tagging-based information retrieval

an article by Kang-Pyo Lee, Hong-Gee Kim and Hyoung-Joo Kim (Seoul National University, South Korea) published in Journal of Information Science Volume 38 Number 4 (August 2012)

Abstract

Keywords have played an important role not only for searchers who formulate a query, but also for search engines that index documents and evaluate the query.

Tags chosen by users to annotate web resources have recently been gaining significance for improving information retrieval (IR) tasks, in that they can act as meaningful keywords bridging the gap between humans and machines.

One critical aspect of tagging (besides the tag and the resource) is the user (or tagger); there exists a ternary relationship among the tag, resource, and user. The traditional inverted index, however, does not consider the user aspect, and is based on the binary relationship between term and document.

In this paper we propose a social inverted index – a novel inverted index extended for social-tagging-based IR – that maintains a separate user sublist for each resource in a resource-posting list to contain each user’s various features as weights.

The social inverted index is different from the normal inverted index in that it regards each user as a unique person, rather than simply counting the number of users, and highlights the value of a user who has participated in tagging. This extended structure facilitates the use of dynamic resource weights, which are expected to be more meaningful than simple user-frequency-based weights.
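
As a rough illustration of the structure described above, the following Python sketch keeps a user sublist for each resource in a tag's posting list. All names (SocialInvertedIndex, add_tagging, user_frequency, resource_weight, search) and the single per-user weight are hypothetical and not taken from the paper; the abstract does not say which user features are stored or how they are combined, so summing the weights is only an assumption made for this example.

```python
from collections import defaultdict


class SocialInvertedIndex:
    """Toy index: tag -> resource -> {user: weight} (hypothetical layout)."""

    def __init__(self):
        # postings[tag][resource] is the user sublist for that resource.
        self.postings = defaultdict(lambda: defaultdict(dict))

    def add_tagging(self, user, tag, resource, user_weight=1.0):
        # Record one (user, tag, resource) triple. user_weight stands in for
        # whatever user feature is stored (an assumption of this sketch).
        self.postings[tag][resource][user] = user_weight

    def user_frequency(self, tag, resource):
        # What a normal inverted index could store: a plain count of users.
        return len(self.postings.get(tag, {}).get(resource, {}))

    def resource_weight(self, tag, resource):
        # Dynamic weight: here simply the sum of per-user weights (assumed).
        return sum(self.postings.get(tag, {}).get(resource, {}).values())

    def search(self, tag):
        # Rank resources for a single tag by dynamic weight, highest first.
        resources = self.postings.get(tag, {})
        return sorted(
            resources,
            key=lambda r: self.resource_weight(tag, r),
            reverse=True,
        )


index = SocialInvertedIndex()
index.add_tagging("alice", "python", "doc1", 0.9)
index.add_tagging("bob", "python", "doc2", 0.2)
index.add_tagging("carol", "python", "doc2", 0.3)

ranking = index.search("python")
```

In this toy data, doc2 has more taggers than doc1, yet doc1 ranks first because its single tagger carries a higher weight. That is the kind of difference between user-frequency-based and user-aware weights that the abstract points to.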

It also allows a flexible response to the conditional queries that are increasingly required in tag-based IR. Our experiments have shown that this user-considering indexing performs better in IR tasks than a normal inverted index with no user sublists.
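
One way to picture a conditional query is a search that only counts taggings by users who satisfy some condition. The sketch below is a guess at what such a query could look like over per-resource user sublists; the function name conditional_search, the nested-dictionary layout, and the sample data are all hypothetical and are not the paper's implementation.

```python
def conditional_search(index, tag, user_condition):
    """Rank resources for `tag`, counting only users accepted by user_condition."""
    scores = {}
    for resource, user_sublist in index.get(tag, {}).items():
        # Sum the weights of the users in the sublist that pass the condition.
        score = sum(w for user, w in user_sublist.items() if user_condition(user))
        if score > 0:
            scores[resource] = score
    # Highest score first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: tag -> resource -> {user: weight}
index = {
    "python": {
        "doc1": {"alice": 1.0},
        "doc2": {"bob": 1.0, "carol": 1.0},
    }
}
experts = {"alice"}

all_users = conditional_search(index, "python", lambda u: True)
experts_only = conditional_search(index, "python", lambda u: u in experts)
```

Because the user sublist is kept alongside each resource, the condition can be applied at query time. An index that stores only a user count per resource could not answer the second query without going back to the raw tagging data.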

The time and space overhead required for index construction and maintenance was also acceptable.

