Motivation

While studying or simply reading a text, every reader occasionally reaches a point where they need more information or a better explanation. This may be a term the reader does not understand, or a concept about which the reader would like to learn more than the text itself says. Such concepts usually correspond to a particular word in the text, so the additional information should be attached directly to that word. We propose a method that automatically enriches keywords in a text with definitions or links to related pages. The method is designed for web pages in Slovak, but it can be applied to other languages with a similar structure.

Method

The method for creating annotations consists of four steps:
1. Candidate words for annotation are extracted from the web page text translated into English.
2. To map the extracted words back to words in the original text, we proposed and verified a method for mapping words between a text and its translation.
3. To find information to fill the annotations, we use various publicly available information retrieval services, such as Google Search or SlideShare.
4. We proposed a method for adapting annotations by reordering the list of links to related web pages. The presented order of links is based on implicit feedback from the user's interaction with annotations.

Evaluation

We evaluated the proposed methods in several closed experiments and in an open experiment in the learning system ALEF.

In the method for mapping equivalent words between a text and its translation, more than 90% of the mappings were correct, but only 45% of all words in the text were mapped. To increase the number of mapped words, we implemented two enhancements. The first takes into account the positions of words in phrases. The second uses a dictionary transformed so that all English words are stemmed with the Porter algorithm.

We also evaluated the quality of the information gathered through publicly available services and found large differences between services. The quality of the retrieved information, and thus of the created annotations, depends greatly on the services used.

In the learning system ALEF, we evaluated the method for annotation adaptation. The resulting order of links was always better than a random order, but it was not perfect. We believe that with more training data the order could be improved further.
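
To make the dictionary-based word mapping and its stemming enhancement more concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation used in the project: the function names `crude_stem` and `map_words`, the dictionary layout (a Slovak word mapped to a set of stemmed English equivalents), and the sample sentence are all hypothetical. The stemmer is a crude suffix stripper that stands in for the Porter algorithm, which a real system would take from an existing library.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def crude_stem(word: str) -> str:
    """Stand-in stemmer (NOT the Porter algorithm): lowercase, strip one common suffix."""
    word = word.lower()
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def map_words(
    source_words: List[str],
    english_words: List[str],
    dictionary: Dict[str, Set[str]],
    stem: Callable[[str], str] = crude_stem,
) -> List[Tuple[int, Optional[int]]]:
    """Pair each source word index with the index of an equivalent English word.

    `dictionary` is assumed to map a source-language word to a set of
    already-stemmed English equivalents. Words without a match map to None.
    """
    stemmed_english = [stem(w) for w in english_words]
    used: Set[int] = set()  # each English word may be matched at most once
    mapping: List[Tuple[int, Optional[int]]] = []
    for i, word in enumerate(source_words):
        candidates = dictionary.get(word.lower(), set())
        match: Optional[int] = None
        for j, stemmed in enumerate(stemmed_english):
            if j not in used and stemmed in candidates:
                match = j
                used.add(j)
                break
        mapping.append((i, match))
    return mapping


# Hypothetical example: a Slovak sentence and its English translation.
source = ["pes", "pije", "vodu"]
english = ["the", "dog", "drinks", "water"]
# "pije" is deliberately missing from the toy dictionary to show an unmapped word.
toy_dictionary = {"pes": {"dog"}, "vodu": {"water"}}

mapping = map_words(source, english, toy_dictionary)
coverage = sum(1 for _, j in mapping if j is not None) / len(source)
print(mapping)   # [(0, 1), (1, None), (2, 3)]
print(coverage)
```

In this sketch, stemming the English side lets an inflected form such as "drinks" match a dictionary entry stored as "drink", which is one plausible reason the stemmed dictionary increases the share of mapped words. The position-in-phrase enhancement mentioned above is not modelled here.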