A Genetic Programming Approach for Learning Semantic Information Extraction Rules from News

2014 IJntema, W., Hogenboom, F.P., Frasincar, F., and Vandic, D.
15th International Conference on Web Information Systems Engineering (WISE 2014) (418-432)

Driven by the growing volume of data from news sources and by user-specific information needs, many news personalization systems have been proposed in recent years. These systems often process news data automatically into information, relying on underlying knowledge bases that contain concepts and their relations for specific domains. Information extraction rules are frequently used for this purpose, yet they are usually constructed manually. Because it is difficult to maintain a balance between precision and recall efficiently with a manual approach, we present a genetic programming-based approach for automatically learning semantic information extraction rules that extract events from (financial) news. Our evaluation results show that, compared to information extraction rules constructed by expert users, our rules yield a 27% higher F1-measure after the same amount of rule construction time.
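To make the precision/recall trade-off mentioned above concrete, the following minimal Python sketch computes precision, recall, and the F1-measure from counts of extracted events. The function name and all counts are hypothetical and chosen only for illustration; they are not taken from the paper's evaluation.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) for a set of extracted events."""
    extracted = true_positives + false_positives   # everything the rules fired on
    relevant = true_positives + false_negatives    # everything they should have found
    precision = true_positives / extracted if extracted else 0.0
    recall = true_positives / relevant if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: a strict rule set (few false alarms, many misses)
# versus a loose rule set (few misses, many false alarms).
strict_rules = precision_recall_f1(true_positives=40, false_positives=10, false_negatives=60)
loose_rules = precision_recall_f1(true_positives=80, false_positives=120, false_negatives=20)

print("strict: precision=%.2f recall=%.2f F1=%.2f" % strict_rules)
print("loose:  precision=%.2f recall=%.2f F1=%.2f" % loose_rules)
```

In this made-up example, both rule sets end up with the same F1 even though one favors precision and the other recall, which is why a single balanced measure is a natural quantity for a learning procedure to optimize.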

Download the paper

A Lexico-Semantic Pattern Language for Learning Ontology Instances from Text

2012 IJntema, W., Sangers, J., Hogenboom, F.P. & Frasincar, F.
Journal of Web Semantics (37-50)

The Semantic Web aims to extend the World Wide Web with a layer of semantic information, so that it is understandable not only by humans, but also by computers. At its core, the Semantic Web consists of ontologies that describe the meaning of concepts in a certain domain or across domains. The domain ontologies are mostly created and maintained by domain experts using manual, time-intensive processes. In this paper, we propose a rule-based method for learning ontology instances from text that helps domain experts with the ontology population process. In this method, we define a lexico-semantic pattern language that, in addition to the lexical and syntactical information present in lexico-syntactic rules, also makes use of semantic information. We show that the lexico-semantic patterns are superior to lexico-syntactic patterns with respect to efficiency and effectiveness. When applied to event relation recognition in text-based news items in the domains of finance and politics using Hermes, an ontology-driven news personalization service, our approach has a precision and recall of approximately 80% and 70%, respectively.

Download the paper

A Semantic Approach for News Recommendation

2011 Frasincar, F., IJntema, W., Goossen, F. & Hogenboom, F.P.
Business Intelligence Applications and the Web: Models, Systems and Technologies (102-121)

News items play an increasingly important role in current business decision processes. Due to the large amount of news published every day, it is difficult to find the news items of one’s interest. One solution to this problem is based on employing recommender systems. Traditionally, these recommenders use term extraction methods like TF-IDF combined with the cosine similarity measure. In this chapter, we explore semantic approaches for recommending news items by employing several semantic similarity measures. We have used existing semantic similarities as well as proposed new solutions for computing semantic similarities. Both traditional and semantic recommender approaches, some of them new, have been implemented in Athena, an extension of the Hermes news personalization framework. Based on the performed evaluation, we conclude that semantic recommender systems in general outperform traditional recommender systems with respect to accuracy, precision, and recall, and that the new semantic recommenders have a better F-measure than existing semantic recommenders.
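The traditional baseline named above, TF-IDF combined with cosine similarity, can be sketched in a few lines of Python. This is a simplified illustration and not the Athena implementation: the whitespace tokenizer, the particular TF-IDF weighting variant, the sample headlines, and the idea of treating the user profile as one extra document are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical unread news items and a hypothetical user profile text.
news = [
    "google acquires startup in cloud deal",
    "microsoft acquires gaming studio",
    "football club wins national cup",
]
profile = "google cloud deal announced"

# The profile is treated as one more document so it shares the same IDF weights.
vectors = tf_idf_vectors(news + [profile])
profile_vector = vectors[-1]
scores = [cosine_similarity(profile_vector, v) for v in vectors[:-1]]
best_index = max(range(len(news)), key=lambda i: scores[i])
print(news[best_index])
```

Because this baseline only matches literal terms, an item that shares no words with the profile scores zero even if it is about a related topic; that limitation is what the semantic similarity measures explored in the chapter aim to address.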

Download the paper

Ontology-Based News Recommendation

2010 IJntema, W., Goossen, F., Frasincar, F. & Hogenboom, F.P.
International Workshop on Business Intelligence and the WEB (BEWEB 2010) (1-6)

Recommending news items is traditionally done by term-based algorithms like TF-IDF. This paper concentrates on the benefits of recommending news items using a domain ontology instead of using a term-based approach. For this purpose, we propose Athena, which is an extension to the existing Hermes framework. Athena employs a user profile to store terms or concepts found in news items browsed by the user. Based on this information, the framework uses a traditional method based on TF-IDF, and several ontology-based methods to recommend new articles to the user. The paper concludes with the evaluation of the different methods, which shows that the new ontology-based method that we propose in this paper performs better (w.r.t. accuracy, precision, and recall) than the traditional method and, with the exception of one measure (recall), also better than the other considered ontology-based approaches.
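The idea of a user profile that stores concepts rather than terms can be illustrated with a small Python sketch. The abstract does not specify which ontology-based methods Athena uses, so the Jaccard overlap below is only a stand-in chosen for simplicity, and the concept names, headlines, and function names are hypothetical.

```python
def jaccard(concepts_a, concepts_b):
    """Overlap between two concept sets: |A and B| / |A or B|."""
    a, b = set(concepts_a), set(concepts_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(profile_concepts, unread_items, top_n=2):
    """Rank unread items by concept overlap with the user profile."""
    scored = [
        (jaccard(profile_concepts, concepts), title)
        for title, concepts in unread_items.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Items without any shared concept are never recommended.
    return [title for score, title in scored[:top_n] if score > 0]


# Hypothetical profile: ontology concepts found in items the user has browsed.
profile = {"Google", "Acquisition", "Company"}

# Hypothetical unread items, each annotated with the concepts detected in it.
unread = {
    "Microsoft buys studio": {"Microsoft", "Acquisition", "Company"},
    "Cup final": {"FootballClub", "Match"},
    "Google earnings": {"Google", "Company"},
}

recommended = recommend(profile, unread)
print(recommended)
```

Matching on concepts lets differently worded items about the same entities score as similar, which a purely term-based profile would miss.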

Download the paper