Towards Social-based User Modeling and Personalization

Michal Barla

Dissertation thesis project supervised by prof. Mária Bieliková

Motivation and Goals

There is no doubt that the Internet and its most popular service, the World Wide Web, have changed our everyday lives. The Web in particular has become a ubiquitous source of information, a communication tool, a shopping mall and much more. As its size and importance grow, so does the need to use it more effectively.

If we want to provide a user with a personalized experience on our web site, we need to base the personalization on some type of user profile, an instance of a user model representing the user’s characteristics and context. Research in the area of user modeling is well established: the first user modeling workshop took place in Germany in 1986 and evolved into a series of highly regarded annual international conferences focused mainly on user modeling for adaptive web-based systems. Despite this long tradition, the research area is still active, with many open challenges that arise along with the development of the Web itself.

Recently, the Web and the research around it have become fascinated by the social phenomenon. Not only are people more present on the Web, having gained tools such as wikis, blogs, social tagging systems and social applications that allow them to participate actively in the creation of web content, but we have also discovered the power of the masses, the wisdom of crowds, which can be used to organize web content and to provide personalized recommendations based on social relationships and group membership. The key are communities, either those with a real-life background or purely virtual, ad-hoc communities of people sharing a common property at a given time. While usage-driven adaptation, mainly in the form of statistics-based recommendations, is becoming a very popular means of navigation, especially on news portals (e.g., a simple recommendation of the most visited content within a time window), other forms of social recommendation, which would take into account various aspects of a personalized user community, are still rather rare. The challenge is how to represent, build and use a user model for dynamically changing, often user-generated content.

In our work, we present a contribution to the field of user modeling for adaptive web-based systems, focusing on the shift from static, closed-corpus application domains to dynamic, vast and thus open-corpus application domains such as the Web. We also aim to enrich the user model, as well as the user modeling process, with social information.

This can be broken down into two major goals:

  • to propose novel approaches to collecting data about the user that are independent of the chosen domain, thoroughly separate data collection from the subsequent inference of higher-level characteristics, and are at the same time unobtrusive from the user’s perspective.
  • to propose novel approaches to user model representation and inference that respect the openness and dynamics of the information space. We aim to investigate the potential of navigational patterns in dynamic information spaces, since the means of navigation remain rather constant even when the content changes dynamically. We explore representations of domain and user models that allow these models to be acquired automatically and unobtrusively in a vast and dynamic environment and that provide a suitable basis for subsequent personalization of information space exploration.


In this thesis, we proposed methods that bring novel approaches to the user modeling process. We focused on open-corpus user modeling, an approach that handles vast and dynamic information spaces such as the Web. Our main goal was to ease the incorporation of the user modeling process into legacy web-based systems by extending the standard presentation layer with a user modeling sub-layer. Next, we looked for solutions that would bring a personalized experience to ordinary everyday browsing on the Web through ubiquitous user and community modeling.

More specifically, we proposed and evaluated

  • a method for comprehensive logging of user activity on the Web with preserved semantics. We defined principles for turning the presentation layer of a web-based system into an adaptive one, with a clearly separated and flexible logging subsystem.
  • a method for capturing logs of “wild” Web surfing based on a specialized proxy server. Our method produces logs enhanced with implicit feedback indicators, suitable for processing into a “lightweight” term-based user model.
  • a method for open-corpus user model inference based on rules expressing navigational patterns. The method shows how a user model can be built and maintained regardless of the chosen application domain, considering only the user’s navigation.
  • a method for term-based open-corpus user modeling, which can be applied to capture the user’s interests across third-party web sites and web-based systems. The method was presented at the international User Modeling, Adaptation and Personalization conference, UMAP 2010 (Kramár et al., 2010), where we showed how a term-based user model with implicit interest indicators can serve as a basis for detecting virtual communities. Using a personalized search engine as a use case, we showed the potential of such virtual communities to deliver efficient personalization while surfing the “wild” Web.
  • a method for finding relations between terms, which supports our term-based user modeling approach and is based on the collective wisdom encoded in folksonomies. Our method applies a set-theory approach to a folksonomy to reveal hierarchical relationships between terms.
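
To illustrate the last item, the following is a minimal sketch of how a set-based comparison over a folksonomy might reveal hierarchical relations between terms. It is not the method from the thesis: the inclusion-ratio criterion, the threshold value, the function names and the sample annotations are all assumptions made for illustration. Each tag is represented by the set of resources it annotates, and a tag is treated as broader than another if it covers (almost) all of the other tag’s resources and more.

```python
from collections import defaultdict


def tag_resource_sets(annotations):
    """Map each tag to the set of resources it was assigned to."""
    sets = defaultdict(set)
    for _user, tag, resource in annotations:
        sets[tag].add(resource)
    return dict(sets)


def broader_narrower_pairs(annotations, threshold=0.8):
    """Return sorted (broader, narrower) tag pairs.

    Assumed criterion (illustrative only): a tag is broader than another
    if it annotates strictly more resources and covers at least
    `threshold` of the other tag's resources.
    """
    sets = tag_resource_sets(annotations)
    pairs = []
    for broad, broad_res in sets.items():
        for narrow, narrow_res in sets.items():
            if broad == narrow or len(broad_res) <= len(narrow_res):
                continue
            overlap = len(broad_res & narrow_res) / len(narrow_res)
            if overlap >= threshold:
                pairs.append((broad, narrow))
    return sorted(pairs)


# Hypothetical folksonomy: (user, tag, resource) triples.
annotations = [
    ("u1", "programming", "r1"),
    ("u1", "python", "r1"),
    ("u2", "programming", "r2"),
    ("u2", "python", "r2"),
    ("u3", "programming", "r3"),
    ("u3", "java", "r3"),
    ("u4", "cooking", "r4"),
]

pairs = broader_narrower_pairs(annotations)
```

With this sample data, “programming” comes out as broader than both “python” and “java”, while “cooking” remains unrelated to the other terms.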


The proposed methods contribute to all parts of the user modeling process. Logging of user activity with preserved semantics, combined with rule-based user model inference, allows for a straightforward transformation of the user’s clickstream within a web-based system into a set of user model updates. This chain of methods can be used effectively for both user model maintenance and initialization, which reduces the cold-start problem.
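
The transformation of a clickstream into user model updates can be sketched roughly as follows. This is a simplified illustration under our own assumptions, not the rule language used in the thesis: the event fields, the action names, the three rules and the numeric deltas are all hypothetical, and each rule here inspects a single logged event rather than a longer navigational pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One logged user action with preserved semantics (hypothetical fields)."""
    user: str
    action: str     # e.g. "show_detail" or "bookmark"
    concept: str    # domain concept the action referred to
    seconds: float  # time spent before the next action


# Each rule: (name, condition on an event, change of the interest score).
RULES = [
    ("long detail view",
     lambda e: e.action == "show_detail" and e.seconds >= 30, 0.2),
    ("quick return",
     lambda e: e.action == "show_detail" and e.seconds < 5, -0.1),
    ("bookmark",
     lambda e: e.action == "bookmark", 0.5),
]


def apply_rules(events, rules=RULES):
    """Turn a clickstream into interest scores keyed by (user, concept)."""
    model = {}
    for event in events:
        for _name, condition, delta in rules:
            if condition(event):
                key = (event.user, event.concept)
                model[key] = round(model.get(key, 0.0) + delta, 3)
    return model


# Hypothetical clickstream of a single user.
clickstream = [
    Event("u1", "show_detail", "jazz", 45.0),
    Event("u1", "show_detail", "opera", 2.0),
    Event("u1", "bookmark", "jazz", 0.0),
]

user_model = apply_rules(clickstream)
```

Because the rules refer only to action types and timing, not to the concepts themselves, the same rule set could in principle be reused across domains, which is the property the rule-based approach relies on.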

We believe that social-based personalization, where we rely more on the behavior of visitors and less on a complete understanding of the content, is the key to finally delivering a ubiquitous personalized experience on every web site. However, we need to “understand” the content to the extent that allows us to cluster the visitors accessing it into virtual communities, so that they can mutually benefit from their community wisdom. This is exactly what we aimed at by proposing the methods grouped around term-based user modeling and plugged into our specialized proxy server. They contribute to ubiquitous user modeling and open up various possibilities for community-based personalization.
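
As a rough illustration of how term-based user models might be grouped into virtual communities, the sketch below compares models by cosine similarity and collects the users whose models are sufficiently similar to a given user’s. The similarity measure, the threshold, the function names and the sample models are assumptions made for this example; the thesis does not prescribe them here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two term-weight dictionaries."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def virtual_community(user, models, threshold=0.5):
    """Return the other users whose term-based models resemble `user`'s."""
    return sorted(
        other
        for other in models
        if other != user and cosine(models[user], models[other]) >= threshold
    )


# Hypothetical term-based user models (term -> interest weight).
models = {
    "alice": {"jaguar": 1.0, "car": 2.0, "engine": 1.0},
    "bob": {"car": 1.0, "engine": 2.0},
    "carol": {"jaguar": 2.0, "zoo": 1.0, "cat": 1.0},
}

alice_community = virtual_community("alice", models)
```

In this sample, “alice” and “bob” end up in the same community because of their shared interest in cars, whereas “carol” shares only the ambiguous term “jaguar” with “alice” and stays outside it. A personalized search engine could use such a community to prefer car-related results when “alice” searches for “jaguar”.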

The importance of new methods for open-corpus personalization becomes even more visible when we realize that traditional closed-corpus web-based systems are being replaced by systems driven by Web 2.0 principles. The most important of these principles, from our point of view, is the support for user-generated content, which removes the difference between the roles of a Web producer and a Web consumer. The majority of web-based systems are thus becoming open-corpus and cannot directly apply traditional methods and techniques for closed-corpus user modeling.

Selected publications

Barla, M., Tvarožek, M., Bieliková, M.
Rule-Based User Characteristics Acquisition from Logs With Semantics for Personalized Web-based Systems. In Computing and Informatics, Vol. 28, No. 4, 2009, pp. 399–427.
Barla, M., Bieliková, M.
On Deriving Tagsonomies: Keyword Relations Coming from Crowd. In Nguyen, N.T., Kowalczyk, R. (Eds.): Computational Collective Intelligence: Semantic Web, Social Networks and Multiagent Systems, ICCCI 2009, LNCS 5796, Springer, 2009, pp. 309–320.
Kramár, T., Barla, M., Bieliková, M.
Disambiguating Search by Leveraging a Social Context Based on the Stream of User’s Activity. In De Bra, P., Kobsa, A., Chin, D. (Eds.): User Modeling, Adaptation, and Personalization, UMAP 2010, LNCS 6075, Springer, 2010, pp. 387–392.
Barla, M., Bieliková, M.
Ordinary Web Pages as a Source for Metadata Acquisition for Open Corpus User Modeling. In White, B., Isaías, P., Andone, D. (Eds.): WWW/Internet 2010, IADIS Press, 2010, pp. 227–233.
Tvarožek, M., Barla, M., Bieliková, M.
Personalized Presentation in Web-Based Information Systems. In van Leeuwen, J. et al. (Eds.): SOFSEM 2007, LNCS 4362, Springer, 2007, pp. 796–807.
Tvarožek, M., Barla, M., Frivolt, G., Tomša, M., Bieliková, M.
Improving Search in the Semantic Web via Integrated Personalized Faceted and Visual Navigation. In Geffert, V. et al. (Eds.): SOFSEM 2008, LNCS 4910, Springer, 2008, pp. 778–789.
