Current web portals contain a large amount of information targeted at many groups of users, yet all users are presented with the same information. Research shows that users reach web pages mostly by following hyperlinks (more than 50 % of all means of accessing web pages). We could therefore help users by personalizing the navigation of a web portal and recommending links in which they are likely to be interested.
The method of automatic interest estimation is based on tracking the actions users perform on each web page of a given web portal. From these actions we determine the user's interest in that web page. To do this, we compare the time spent on a web page and the number of scrolling events with the values recorded for other people who visited the same page. If the value for the current user is more than X % higher than the average, we consider it a sign of positive interest in the page. If it is more than X % lower than the average, we consider it a sign of negative interest. A value around the average (± X %) is a sign of neutral interest. We also track whether the user copied some text to the clipboard, which we consider a sign of positive interest.
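The thresholding rule described above can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the value of X used in the usage example are assumptions made for illustration only; the text does not state the value of X or how the individual signs are combined, so the sketch returns them separately.

```python
from statistics import mean


def classify(value, others, x_percent):
    """Compare one user's value with the average of other visitors.

    Returns +1 (positive), -1 (negative) or 0 (neutral interest).
    """
    avg = mean(others)
    upper = avg * (1 + x_percent / 100)
    lower = avg * (1 - x_percent / 100)
    if value > upper:
        return 1
    if value < lower:
        return -1
    return 0


def interest_signs(time_spent, scroll_events, copied_text,
                   others_time, others_scrolls, x_percent):
    """Collect the individual signs of interest for one page visit.

    Hypothetical helper: it only gathers the signs and does not
    merge them into a single score.
    """
    signs = {
        "time": classify(time_spent, others_time, x_percent),
        "scrolling": classify(scroll_events, others_scrolls, x_percent),
    }
    if copied_text:
        # Copying text to the clipboard counts as positive interest.
        signs["clipboard"] = 1
    return signs
```

For example, with an assumed X of 20, `interest_signs(150, 4, True, [100, 100], [4, 5, 3], 20)` marks the time spent as positive, the scrolling as neutral, and the clipboard action as positive.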
When recommending links, we use collaborative filtering to predict the user's interest in web pages he or she has not yet seen. The links with the highest predicted interest are then recommended to the user. For interest prediction we use Resnick's formula, with Pearson's correlation coefficient as the similarity measure between two users.
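The prediction step can be sketched in Python as follows. This is a textbook form of Resnick's formula with Pearson's correlation, not necessarily the exact variant used in the prototype. The nested dictionary `ratings` (user, then page, then estimated interest) and all function names are assumptions for illustration.

```python
from math import sqrt


def pearson(ra, rb):
    """Pearson correlation over the pages both users have visited."""
    common = [p for p in ra if p in rb]
    if len(common) < 2:
        return 0.0
    ma = sum(ra[p] for p in common) / len(common)
    mb = sum(rb[p] for p in common) / len(common)
    num = sum((ra[p] - ma) * (rb[p] - mb) for p in common)
    den = (sqrt(sum((ra[p] - ma) ** 2 for p in common))
           * sqrt(sum((rb[p] - mb) ** 2 for p in common)))
    return num / den if den else 0.0


def predict(ratings, user, page):
    """Resnick's formula: the user's mean interest plus the
    similarity-weighted deviations of other users from their means."""
    ra = ratings[user]
    mean_a = sum(ra.values()) / len(ra)
    num = den = 0.0
    for other, rb in ratings.items():
        if other == user or page not in rb:
            continue
        sim = pearson(ra, rb)
        mean_b = sum(rb.values()) / len(rb)
        num += sim * (rb[page] - mean_b)
        den += abs(sim)
    return mean_a + num / den if den else mean_a


def recommend(ratings, user, n):
    """Return up to n unseen pages with the highest predicted interest."""
    seen = set(ratings[user])
    unseen = {p for r in ratings.values() for p in r} - seen
    ranked = sorted(unseen, key=lambda p: predict(ratings, user, p),
                    reverse=True)
    return ranked[:n]
```

Note that users with a negative correlation still contribute: their deviations are counted with the opposite sign, which is why the denominator uses the absolute value of the similarity.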
Apart from their behavior on a single web page, users also behave in a characteristic way across the whole web portal. We try to find patterns in the ordered sequences of web pages visited in one session (or of links which the user followed). These patterns could be:
In our opinion, every user has a certain way of navigating through a particular web portal. We believe that in a closed web there will be one dominant pattern for each user. If so, we can organize users into groups based on their dominant pattern and then recommend links among the members of each group.
Implementation of prototype
The prototype is composed of three independent components that share the same database. SpyImp is a component which regularly crawls a selected web portal and analyzes it. This way we can extract interesting information from web pages, such as events or news, which we then recommend. AdaptiveImp is another component, responsible for grouping users, computing the predicted interest and forming recommendations for each user. WebImp is a plug-in to the adaptive proxy server. Its only job is to personalize the web page before it is shown to the user: it loads the recommendations for a particular user and inserts them into the proper section of the web page.
To evaluate our method, we ran experiments on the web portal of our faculty. Because many events take place during the year, we first analyzed every web page and extracted events from it. We then tracked users' actions on every page, computed their interest and associated it with every event found on that web page. We constructed a personalized calendar of interesting events for each user and put it into the web page. We also recommended news, by which we mean web pages that have changed since the user's last visit. The changes were detected during the analysis of the web portal.
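The text does not say how changes were detected. One possible way to obtain "news" in the sense above is to fingerprint each page at every crawl and compare the time of the last change with the user's last visit. The sketch below assumes SHA-256 fingerprints, numeric timestamps, and hypothetical function names.

```python
import hashlib


def fingerprint(html):
    """Content fingerprint of a page (assumed: SHA-256 of the HTML)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def record_crawl(state, url, html, crawl_time):
    """Update crawl state, a dict mapping url -> (fingerprint, last_changed).

    The change time is updated only when the content differs from
    the previously stored version.
    """
    fp = fingerprint(html)
    old = state.get(url)
    if old is None or old[0] != fp:
        state[url] = (fp, crawl_time)


def news_for(state, last_visits):
    """Pages the user has visited before that changed after that visit."""
    return sorted(
        url for url, (_, changed) in state.items()
        if url in last_visits and changed > last_visits[url]
    )
```

In practice, hashing the raw HTML would also flag purely cosmetic changes, so a real crawler would probably fingerprint only the extracted main content.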
We monitored the actions of 24 users on a modified website for 3 weeks and compared our calculations with their explicit feedback. The results indicate that time actively spent on a web page is the best interest indicator. Scrolling proved to indicate positive interest as well. However, the absence of scrolling does not always mean that the user is not interested in the page. The accuracy of our interest estimation method was 62 %. According to the answers from a questionnaire, the sections with recommended links (especially the calendar) were attractive, and the users found 55 % of the recommended links and events interesting. We also tried to evaluate the detection of navigational patterns. We were able to find all of them, and it turned out that some users did have a dominant pattern. However, because of the small number of users in our experiments, we decided not to consider these patterns and therefore did not divide users into groups for recommendations.
The following figure shows a part of the modified web page, including the personal calendar, personal news and additional links sections.