  • The Conference Participant Advisor Service in a Virtual Information and Knowledge Environment Framework
    Marco Degemmis, Pasquale Lops, Pierpaolo Basile, Giovanni Semeraro, ESWC 2006 Workshop on Mastering the Gap: From Information Extraction to Semantic Representation. ABSTRACT: Algorithms designed to support users in retrieving relevant information base their relevance computations on user profiles, in which representations of the users' interests are maintained. The idea proposed in this paper is the integration of general linguistic knowledge in the process of learning semantic user profiles able to represent users' interests in a more effective way with respect to classical keyword-based profiles. Semantic profiles are obtained by integrating a naïve Bayes approach for text categorization with a word sense disambiguation strategy based exclusively on the lexical knowledge stored in the WordNet database. Semantic profiles are exploited by the “conference participant advisor” service developed in the VIKEF (Virtual Information and Knowledge Environment Framework) project in order to suggest papers to be read and talks to be attended by a conference participant. Experiments carried out on a dataset made of papers accepted to the previous editions of the International Semantic Web Conference and rated by real users show the effectiveness of the service.

    File: ESWC2006_degemmis.pdf
    Filesize:281kb
    Uploaded:03-07-2007, 17:23

  • An Intelligent Personalized Service for Conference Participants
    Marco Degemmis, Pasquale Lops and Pierpaolo Basile. Submitted to the 16th International Symposium on Methodologies for Intelligent Systems, held in Bari, Italy, on Sep 27th-29th, 2006. ABSTRACT: This paper presents the integration of linguistic knowledge in learning semantic user profiles able to represent user interests in a more effective way with respect to classical keyword-based profiles. Semantic profiles are obtained by integrating a naïve Bayes approach for text categorization with a word sense disambiguation (WSD) strategy based on the WordNet lexical database (Section 2). Semantic profiles are exploited by the “conference participant advisor” service in order to suggest papers to be read and talks to be attended by a conference participant. Experiments on a real dataset show the effectiveness of the service.

    File: ISMIS06_degemmis.pdf
    Filesize:151.6kb
    Uploaded:03-07-2007, 17:16

  • RDF and Contexts: Use of SPARQL and Named Graphs to Achieve Contextualization
    Heiko Stoermer, Ignazio Palmisano, Domenico Redavid, Luigi Iannone, Paolo Bouquet, Giovanni Semeraro. Submitted to the First Jena User Conference, to be held in Bristol, UK, on May 10th and 11th, 2006. ABSTRACT: The very simple structure of the RDF data model and semantics can lead to a number of issues when more complex scenarios are supposed to be represented in RDF. Aspects such as temporal evolution of a knowledge base, relevance and trust require more than just a model that consists of a set of universally true statements, without any reference to a situation, a point in time, or generally a context. Our proposed solution is to use the notion of context to separate statements that refer to different contextual information, which could so far not explicitly be tied to the statements because of the simplicity of RDF. In this paper we describe a practical solution to this problem, which has been implemented in the VIKEF project.

    File: Jena_heiko.pdf
    Filesize:175.3kb
    Uploaded:03-07-2007, 17:12

  • Updated Flyer - Making the Semantic Web Fly
    VIKEF provides semantic web solutions and technologies to stakeholders who can foster its future usage and adequate exploitation

    File: Flyer-updated.pdf
    Filesize:708.3kb
    Uploaded:03-07-2007, 13:59

  • VTC - VIKEF Technology Catalogue
    PURPOSE: VIKEF is a large application-oriented project in the area of Semantic Web and knowledge technologies, partly funded by the European Commission. The project combines competences from the areas of linguistic and multimedia extraction, ontology engineering and ontology learning, semantic representation, integration and interoperability, and personalization support, and aims at building a framework which supports a complete knowledge supply chain. A knowledge supply chain is a process which starts from raw content (e.g. scientific and business documents, images, etc.); covers the extraction and representation of the semantic information implicitly conveyed in these content objects; and enables the construction of intelligent services which exploit this semantic information for an improved support of user tasks in an automated and customizable way. To build the VIKEF framework, a number of innovative components and technologies have been developed and integrated. It is the purpose of this VIKEF Technology Catalogue (VTC) to collect and describe these technologies, and to raise awareness of the solutions developed in VIKEF. The technology catalogue also points out possible application sectors for each of the technologies, in order to foster application of the VIKEF technologies outside the VIKEF project.

    File: VTC.pdf
    Filesize:2749.9kb
    Uploaded:03-07-2007, 13:26

  • Poster: Ontology Learning in Multimedia Information Extraction for Product Catalogues. Boemie 2006.
    Roberto Bartolini, Emiliano Giovannetti, Simone Marchi, Simonetta Montemagni, Claudio Andreatta, Roberto Brunelli, Rodolfo Stecher, Claudia Niederée, Paolo Bouquet and Stefano Bortoli. IDEA: A methodology, developed in the VIKEF project, for extracting multimedia information from product catalogues, empowered by the synergetic use and extension of a domain ontology. The use of domain ontologies in this context additionally opens up innovative ways of catalogue use. The method is characterized by incrementally feeding and exploiting the ontology during an information extraction process, implemented by the semantic annotation of the analysed document, and by providing support for detecting existing similar ontologies to enable reuse of (parts of) them.

    File: Poster_BOEMIE_2006.pdf
    Filesize:345.5kb
    Uploaded:03-07-2007, 13:19

  • Ontology Learning in Multimedia Information Extraction from Product Catalogues
    Roberto Bartolini, Emiliano Giovannetti, Simone Marchi, Simonetta Montemagni, Claudio Andreatta, Roberto Brunelli, Rodolfo Stecher, Claudia Niederée, Paolo Bouquet and Stefano Bortoli. ABSTRACT: We propose a methodology for extracting multimedia information from product catalogues empowered by the synergetic use and extension of a domain ontology. The use of domain ontologies in this context additionally opens up innovative ways of catalogue use. The method is characterized by incrementally feeding and exploiting the ontology during an information extraction process, implemented by the semantic annotation of the analysed document, and by providing support for detecting existing similar ontologies to enable reuse of (parts of) them.

    File: Ontology_Learning_Multimedia_Information_Extraction.pdf
    Filesize:28kb
    Uploaded:03-07-2007, 13:09

  • Enabling a Knowledge Supply Chain: From Content Resources to Ontologies
    Rodolfo Stecher, Claudia Niederée, Paolo Bouquet, Thierry Jacquin, Salah Aït-Mokhtar, Simonetta Montemagni, Roberto Brunelli, and George Demetriou. ABSTRACT: Semantic annotation of content is a crucial building block of making the Semantic Web fly. The (semi-)automatic support of the underlying semantic knowledge supply chain requires contributions from different research disciplines and well-defined pipelines, which step-by-step create such annotations from raw content objects. This paper presents an annotation pipeline that has been designed and implemented as part of the VIKEF project. A clear structuring of the pipeline, the selection of adequate representation formats for the intermediate results (products) as well as for configuration information have been identified as crucial ingredients for an annotation pipeline that enables the application-specific customization of the pipeline components and the flexible integration of upcoming advanced methods like new extraction methods into the pipeline.

    File: paperMTG_ESWC2006.pdf
    Filesize:358.8kb
    Uploaded:03-07-2007, 12:34

  • Contextualization of a RDF Knowledge Base in the VIKEF Project
    Heiko Stoermer, Ignazio Palmisano, Domenico Redavid, Luigi Iannone, Paolo Bouquet, and Giovanni Semeraro. ABSTRACT: Due to the simplicity of the RDF data model and semantics, complex application scenarios in which RDF is used to represent the application data model raise important design issues. Modelling e.g. the temporal evolution, relevance, trust and provenance in Knowledge Bases requires more than just a set of universally true statements, without any reference to a situation, a point in time, or generally a context. Our proposed solution is to use the notion of context to separate statements that refer to different contextual information, which could so far not explicitly be tied to the statements. In this paper we describe a practical solution to this problem, which has been implemented in the VIKEF project, which deals with making explicit and intelligently usable the information contained in vast collections of documents, databases and metadata repositories.

    File: ICADL_2006_paper_249.pdf
    Filesize:373kb
    Uploaded:03-07-2007, 12:32

  • Matching Hierarchical Classifications with Attributes
    L. Serafini, S. Zanobini, S. Sceffer, and P. Bouquet. ABSTRACT: Hierarchical Classifications with Attributes are tree-like structures used for organizing/classifying data. Due to the exponential growth and distribution of information across the network, and to the fact that such information is usually clustered by means of this kind of structure, we are now witnessing an increasing interest in finding techniques to define mappings among such structures. In this paper, we propose a new algorithm for discovering mappings across hierarchical classifications, which faces the matching problem as a problem of deducing relations between sets of logical terms representing the meaning of hierarchical classification nodes.

    File: ESWC06-CtxMatch2.pdf
    Filesize:164.9kb
    Uploaded:03-07-2007, 12:24

  • Soundness of Schema Matching Methods
    M. Benerecetti, P. Bouquet, S. Zanobini. ABSTRACT: One of the key challenges in the development of open semantic-based systems is enabling the exchange of meaningful information across applications which may use autonomously developed schemata. One of the typical solutions for that problem is the definition of a mapping between pairs of schemas, namely a set of point-to-point relations between the elements of different schemas. A lot of (semi-)automatic methods for generating such mappings have been proposed. In this paper we provide a preliminary investigation on the notion of correctness for schema matching methods. In particular we define different notions of soundness, strictly depending on what dimension (syntactic, semantic, pragmatic) of the language the mappings are defined on. Finally, we discuss some preliminary conditions under which two different notions of soundness (semantic and pragmatic) can be related.

    File: benerecetti-et-al.pdf
    Filesize:119.5kb
    Uploaded:03-07-2007, 12:03

  • Bootstrapping Semantics on the Web: Meaning Elicitation from Schemas
    Paper: Paolo Bouquet, Luciano Serafini, Stefano Zanobini. ABSTRACT: In most web sites, web-based applications (such as web portals, e-marketplaces, search engines), and in the file systems of personal computers, a wide variety of schemas (such as taxonomies, directory trees, thesauri, Entity-Relationship schemas, RDF Schemas) are published which (i) convey a clear meaning to humans (e.g. help in the navigation of large collections of documents), but (ii) convey only a small fraction (if any) of their meaning to machines, as their intended meaning is not formally/explicitly represented. In this paper we present a general methodology for automatically eliciting and representing the intended meaning of these structures, and for making this meaning available in domains like information integration and interoperability, web service discovery and composition, peer-to-peer knowledge management, and semantic browsers. We also present an implementation (called CTXMATCH2) of how such a method can be used for semantic interoperability.

    File: 4066-bouquet.pdf
    Filesize:121.3kb
    Uploaded:03-07-2007, 11:56

  • Multimedia Information Extraction in Ontology-based Semantic Annotation of Product Catalogues
    Paper: Roberto Bartolini, Emiliano Giovannetti, Simone Marchi, and Simonetta Montemagni, ILC-CNR. ABSTRACT: The demand for efficient methods for extracting knowledge from multimedia content has led to a growing research community investigating the convergence of multimedia and knowledge technologies. In this paper we describe a methodology for extracting multimedia information from product catalogues empowered by the synergetic use and extension of a domain ontology. The methodology was implemented in the Trade Fair Advanced Semantic Annotation Pipeline of the VIKE-framework.

    File: paper-41_SWAP2006.pdf
    Filesize:640.4kb
    Uploaded:28-12-2006, 15:08

  • Innovative Business Cases
    Up to date - Posters

    File: VIKEF_Poster3_A1.pdf
    Filesize:2444.3kb
    Uploaded:05-12-2006, 17:12

  • R&D Highlight - Multimedia Processing for Catalogues
    Up to date - Poster

    File: VIKEF_Poster2d_A3.pdf
    Filesize:2212.7kb
    Uploaded:05-12-2006, 17:11

  • R&D Highlight - Semantic Infusion for Content Digestion
    Up to date - Poster

    File: VIKEF_Poster2b_A3.pdf
    Filesize:2557kb
    Uploaded:05-12-2006, 17:10

  • R&D Highlight - Entity & Identity Management
    Up to date - Poster

    File: VIKEF_Poster2c_A3.pdf
    Filesize:2036.6kb
    Uploaded:05-12-2006, 17:08

  • R&D Highlight - Bibliographic citations in context
    Up to date - Poster

    File: VIKEF_Poster2a_A3.pdf
    Filesize:2295.8kb
    Uploaded:05-12-2006, 17:06

  • Virtual Information and Knowledge Environment Framework
    Up to date - Poster

    File: VIKEF_Poster1_A1.pdf
    Filesize:287.2kb
    Uploaded:05-12-2006, 17:05

  • Memory-based Object Recognition in digital Images
    Claudio Andreatta, Michela Lecca, Stefano Messelodi (2005). ABSTRACT: MEMORI (MEMory based Object Recognition in digital Images) is a system for the detection and recognition of objects in digital color images. The objects are stored in a database that represents the memory of the system. Each object is described by several shots taken from different points of view. Object detection is achieved by segmenting the input image using color and textural information, and grouping the obtained regions by analyzing their adjacency relationships and their visual similarity to the database objects. Database objects and region groups are described by vectors of low-level features, like color, texture and shape. Visual similarity is defined as the L1-distance among such vectors. The region grouping strategy is guided by heuristic rules mainly related to the distance of the groups to the database objects. The final result is a set of image portions each associated to one or more database objects along with a confidence score.

    File: VMV2005.pdf
    Filesize:262.3kb
    Uploaded:05-12-2006, 16:02

  • Discourse and citation analysis with concept-matching
    Agnes Sandor, Aaron Kaplan, Gilbert Rondeau. International Symposium: Discourse and Document (ISDD), Caen, France, 15-16 June, 2006. ABSTRACT: We consider the problem of semantic annotation of semi-structured documents according to a target XML schema. The task is to annotate a document in a tree-like manner where the annotation tree is an instance of a tree class defined by DTD or W3C XML Schema descriptions. In the probabilistic setting, we cope with the tree annotation problem as a generalized probabilistic context-free parsing of an observation sequence where each observation comes with a probability distribution over terminals supplied by a probabilistic classifier associated with the content of documents. We determine the most probable tree annotation by maximizing the joint probability of selecting a terminal sequence for the observation sequence and the most probable parse for the selected terminal sequence.

    File: resultcitation.pdf
    Filesize:33kb
    Uploaded:05-12-2006, 16:01

  • Document annotation by active learning techniques
    Boris Chidlovskii, Loïc Lecerf. ACM Document Engineering Symposium, Amsterdam, The Netherlands, 10-13 Oct. 2006. ABSTRACT: We present a system for the semantic annotation of layout-oriented documents, with an integrated learning component. We introduce probabilistic learning methods on tree-like documents and we present different active learning techniques for training document annotation models. We report some preliminary results of deploying such active learning techniques on an important case of document collection annotation.

    File: annotationbyactive.pdf
    Filesize:153.4kb
    Uploaded:05-12-2006, 16:00

  • Using the author’s comments for knowledge discovery
    Agnes Sandor. Semaine de la connaissance, Atelier texte et connaissance, Nantes, June 29, 2006. ABSTRACT: We present an approach to knowledge discovery based on the author’s comments on pieces of information he conveys in his text. To carry out this task we propose a pattern-matching framework based on syntactic analysis that is able to represent a large variety of expressions that convey the same comment type. Our method has been applied effectively in novelty detection and risk detection tasks. We propose this approach and framework as a complementary methodology of information extraction and subsequent text-mining.

    File: sandor.pdf
    Filesize:90.4kb
    Uploaded:05-12-2006, 15:54

  • Optimized XY-Cut for Text Ordering
    Jean-Luc Meunier. ICDAR, International Conference on Document Analysis and Recognition, Seoul, Korea, Aug. 29-Sept. 1st, 2005. ABSTRACT: In this paper, we propose a fast method for determining the human reading order of the layout elements of a document page. The proposal includes a computationally tractable optimization approach to the problem. We also report on the performance of the method and discuss it in light of related work.

    File: optimizedxy.pdf
    Filesize:162.8kb
    Uploaded:05-12-2006, 15:51

  • System for converting PDF documents into structured XML format
    Hervé Déjean, Jean-Luc Meunier. 7th IAPR Workshop on Document Analysis Systems, Nelson, New Zealand, 13-15 February 2006. ABSTRACT: We present in this paper a system for converting PDF legacy documents into structured XML format. This conversion system first extracts the different streams contained in PDF files (text, bitmap and vectorial images) and then applies different components in order to express in XML the logically structured documents. Some of these components are traditional in Document Analysis, others are more specific to PDF. We also present a graphical user interface in order to check, correct and validate the analysis of the components. Finally, we report on two real use cases to which this system was applied.

    File: das06026.pdf
    Filesize:419.6kb
    Uploaded:05-12-2006, 15:50

  • A web based document harmonisation and annotation chain: from PDF to RDF
    Boris Chidlovskii, Thierry Jacquin, Olivier Fambon. ACM Symposium on Document Engineering, Bristol, UK, 2-4 November, 2005. ABSTRACT: We propose a demonstration of a Web-based document harmonization and annotation chain developed within the VIKEF integrated project. The chain integrates a combination of Web Services in order to access, harmonize and semantically annotate remote document collections. Annotations are then mapped onto RDF descriptions that serve as a basis for building semantic-enabled services to support community processes.

    File: p13002jacquin.pdf
    Filesize:294.8kb
    Uploaded:05-12-2006, 15:48

  • APPEARANCE BASED PAINTINGS RECOGNITION FOR A MOBILE MUSEUM GUIDE
    Claudio Andreatta, Fabrizio Leonardi (2006). ABSTRACT: This paper presents a prototype of a visual recognition system for a handheld interactive museum guide. Contextualized information about museum drawings may be obtained by the user, without any knowledge about how the system works, by simply pointing a palmtop camera towards the painting and taking a shot. The system was tested and performance was found to be satisfactory in challenging environment conditions.

    File: VISAPP2006.pdf
    Filesize:203.8kb
    Uploaded:05-12-2006, 15:47

  • Object Recognition in Color Images by the Self Configuring System MEMORI
    Michela Lecca (2006). ABSTRACT: System MEMORI automatically detects and recognizes rotated and/or rescaled versions of the objects of a database within digital color images with cluttered background. This task is accomplished by means of a region grouping algorithm guided by heuristic rules, whose parameters concern some geometrical properties and the recognition score of the database objects. This paper focuses on the strategies implemented in MEMORI for the estimation of the heuristic rule parameters. This estimation, being automatic, makes the system a self configuring and highly user-friendly tool.

    File: IJSP.pdf
    Filesize:672.4kb
    Uploaded:05-12-2006, 15:41

  • A Self Configuring System for Object Recognition in Color Images
    Michela Lecca. March 2006. ICPA2006. ABSTRACT: System MEMORI automatically detects and recognizes rotated and/or rescaled versions of the objects of a database within digital color images with cluttered background. This task is accomplished by means of a region grouping algorithm guided by heuristic rules, whose parameters concern some geometrical properties and the recognition score of the database objects. This paper focuses on the strategies implemented in MEMORI for the estimation of the heuristic rule parameters. This estimation, being automatic, makes the system a highly user-friendly tool.

    File: ICPA2006.pdf
    Filesize:495.7kb
    Uploaded:05-12-2006, 15:38

  • Structuring documents according to their table of contents
    Hervé Déjean, Jean-Luc Meunier. DocEng 05, Bristol, UK, November 2-4, 2005. ABSTRACT: In this paper, we present a method for structuring a document according to the information present in its Table of Contents. The detection of the ToC as well as the determination of the parts it refers to in the document body rely on a series of generic properties characterizing any ToC, while its hierarchization is achieved using clustering techniques. We also report on the robustness and performance of the method before discussing it in light of related work.

    File: fp10640dejean.pdf
    Filesize:305.2kb
    Uploaded:05-12-2006, 15:36

  • Recognition and Reconstruction of Partially Occluded Objects
    Michela Lecca and Stefano Messelodi. ICCS2006. ABSTRACT: A new automatic system for the recognition and reconstruction of rescaled and/or rotated partially occluded objects is presented. The objects to be recognized are described by 2D views and each view is occluded by several half-planes. The whole object views and their visible parts (linear cuts) are then stored in a database. To establish if a region R of an input image represents an object possibly occluded, the system generates a set of linear cuts of R and compares them with the elements in the database. Each linear cut of R is associated to the most similar database linear cut. R is recognized as an instance of the object O if the majority of the linear cuts of R are associated to a linear cut of views of O. In the case of recognition, the system reconstructs the occluded part of R and determines the scale factor and the orientation in the image plane of the recognized object view. The system has been tested on two different datasets of objects, showing good performance both in terms of recognition and reconstruction accuracy.

    File: ICCS2006.pdf
    Filesize:437.3kb
    Uploaded:05-12-2006, 15:34

  • A probabilistic learning method for XML annotation of documents
    Boris Chidlovskii, Jérôme Fuselier. IJCAI, 19th International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, 30 July - 5 August 2005. ABSTRACT: We consider the problem of semantic annotation of semi-structured documents according to a target XML schema. The task is to annotate a document in a tree-like manner where the annotation tree is an instance of a tree class defined by DTD or W3C XML Schema descriptions. In the probabilistic setting, we cope with the tree annotation problem as a generalized probabilistic context-free parsing of an observation sequence where each observation comes with a probability distribution over terminals supplied by a probabilistic classifier associated with the content of documents. We determine the most probable tree annotation by maximizing the joint probability of selecting a terminal sequence for the observation sequence and the most probable parse for the selected terminal sequence.

    File: 501.pdf
    Filesize:162.4kb
    Uploaded:05-12-2006, 15:30

  • From legacy Document to XML: A conversion Framework
    Jean-Pierre Chanod, Boris Chidlovskii, Hervé Déjean, Olivier Fambon, Jérôme Fuselier, Thierry Jacquin, Jean-Luc Meunier. 9th European Conference on Research and Advanced Technology for Digital Libraries, Vienna, Austria, September 18-23, 2005. ABSTRACT: We present an integrated framework for document conversion from legacy formats to XML format. We describe the LegDoC project, aimed at automating the conversion of layout annotations in layout-oriented formats like PDF, PS and HTML to semantic-oriented annotations. A toolkit of different components covers complementary techniques, combining logical document analysis and semantic annotation with machine learning methods. We use a real case conversion project as a driving example to exemplify different techniques implemented in the project.

    File: fromlegacy.pdf
    Filesize:700.9kb
    Uploaded:05-12-2006, 15:27
