**SPARKS public calendar**: in [[https://
+ | |||
+ | ====== 2017 ====== | ||
+ | |||
===== Seminars =====
+ | |||
+ | ==== Marie-Christine ROUSSET: Datalog revisited for reasoning in Linked Data ==== | ||
+ | When: 3rd March 2017, 10h00 \\ | ||
+ | Where: POLYTECH, Templiers 2, room 307\\ | ||
+ | |||
+ | **Abstract**\\ | ||
Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with inference paves the way to making the Semantic Web a reality. In this presentation, I will describe a framework based on equipping RDF triple stores with Datalog inference rules. This rule language makes it possible to capture, in a uniform manner, OWL constraints that are useful in practice, such as property transitivity or symmetry, as well as domain-specific rules with practical relevance for users in many domains of interest. I will illustrate the expressivity of this framework for modeling Linked Data applications and its genericity for developing inference algorithms. In particular, I will show how it allows the problem of data linkage in Linked Data to be modeled as a reasoning problem over possibly decentralized data. I will also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, while effectively controlling their succinctness. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
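
For readers unfamiliar with rule-based inference over RDF, the toy Python sketch below applies a Datalog-style transitivity rule to a tiny triple store until a fixpoint is reached. It only illustrates the general idea of rule saturation; the ''ex:partOf'' property and the triples are invented, and this is not the system presented in the talk.

<code python>
# Toy saturation of an RDF-like triple store with a Datalog-style rule:
#   (?x p ?y), (?y p ?z) -> (?x p ?z)   for each transitive property p.
# Triples are plain (subject, predicate, object) tuples; ex:partOf is a
# made-up property used only for this illustration.

def saturate(triples, transitive_props):
    """Return the store extended with all triples entailed by transitivity."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        for p in transitive_props:
            edges = {(s, o) for (s, q, o) in triples if q == p}
            for s, o in edges:
                for o2, o3 in edges:
                    if o == o2 and (s, p, o3) not in triples:
                        triples.add((s, p, o3))
                        changed = True
    return triples

store = {("ex:a", "ex:partOf", "ex:b"), ("ex:b", "ex:partOf", "ex:c")}
print(saturate(store, {"ex:partOf"}))  # now also contains ("ex:a", "ex:partOf", "ex:c")
</code>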
+ | |||
+ | ===== Soutenances ===== | ||
+ | |||
+ | ==== PhD Thesis Defense - Franck MICHEL ==== | ||
+ | |||
+ | '' | ||
+ | |||
+ | === Abstract === | ||
+ | |||
+ | **Title: Integrating Heterogeneous Data Sources in the Web of Data** | ||
+ | |||
To a great extent, RDF-based data integration, as well as the Web of Data, depends on the ability to reach legacy data locked in data silos, where they are invisible to the Web. In the last 15 years, various works have tackled the problem of exposing structured data in the Resource Description Framework (RDF), starting with relational databases (RDB), spreadsheets and the XML data format. Meanwhile, the overwhelming success of NoSQL databases has made the database landscape more diverse than ever. So far, though, these databases remain inaccessible to RDF-based data integration systems, and although the data they host may be of interest to a large audience, they remain invisible to the Web of Data. Hence, to harness the potential of NoSQL databases and, more generally, non-RDF data sources, the objective of this thesis is to enable RDF-based data integration over heterogeneous databases and, in particular, to reconcile the Semantic Web with the NoSQL family of databases.

Firstly, we propose a generic mapping language, xR2RML, able to describe the mapping of various types of databases into an arbitrary RDF representation. This language relies on and extends previous work on the translation of RDBs, CSV and XML into RDF. Secondly, we propose to use such an xR2RML mapping either to materialize RDF data or to dynamically evaluate SPARQL queries against the native database. To spur the development of SPARQL interfaces over legacy databases, we propose a two-step approach. The first step translates a SPARQL query into a pivot abstract query, based on the xR2RML mapping of the target database to RDF. In the second step, the abstract query is translated into a concrete query, taking into account the specificities of the database query language. Great care is taken of the query optimization opportunities,
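
As a rough illustration of the two-step rewriting idea, the sketch below turns a single SPARQL triple pattern into a MongoDB filter document through an intermediate, database-independent condition. The predicate-to-field ''MAPPING'' is a hypothetical, much-simplified stand-in for a real mapping; this is neither xR2RML syntax nor the actual translation algorithm of the thesis.

<code python>
# Step 1: SPARQL triple pattern -> database-independent "abstract" condition.
# Step 2: abstract condition -> concrete MongoDB filter document.
# MAPPING is a toy, invented mapping (predicate -> JSON field), not xR2RML.

MAPPING = {"foaf:name": "name", "ex:species": "species"}

def triple_pattern_to_abstract(subject, predicate, obj):
    field = MAPPING[predicate]
    value = None if obj.startswith("?") else obj      # SPARQL variable vs. constant
    return {"field": field, "value": value}

def abstract_to_mongo(cond):
    if cond["value"] is None:                          # unbound variable:
        return {cond["field"]: {"$exists": True}}      # the field must simply exist
    return {cond["field"]: cond["value"]}              # constant: equality test

# ?taxon ex:species "Delphinus delphis"  ->  {"species": "Delphinus delphis"}
print(abstract_to_mongo(triple_pattern_to_abstract("?taxon", "ex:species", "Delphinus delphis")))
</code>
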
====== 2016 ======
===== Defenses =====
+ | |||
+ | ==== PhD Thesis Defense - Atheer AL-NAJDI ==== | ||
+ | |||
+ | '' | ||
+ | |||
+ | === Abstract === | ||
+ | |||
+ | **Title: | ||
+ | |||
Clustering is the process of partitioning a dataset into groups, so that the instances in the same group are more similar to each other than to instances in any other group. Many clustering algorithms have been proposed, but none of them provides good-quality partitions in all situations. Consensus clustering aims to enhance the clustering process by combining different partitions obtained from different algorithms to yield a better-quality consensus solution. In this work, a new consensus clustering method, called MultiCons, is proposed. It uses the frequent closed itemset mining technique to discover the similarities between the different base clustering solutions. The identified similarities are presented in the form of clustering patterns, each of which defines the agreement between a set of base clusters in grouping a set of instances. By dividing these patterns into groups based on the number of base clusters that define the pattern, MultiCons generates a consensus solution from each group, resulting in multiple consensus candidates. These different solutions are presented in a tree-like structure, called ConsTree, that facilitates understanding the process of building the multiple consensuses, as well as the relationships between the instances and the instance structures in the data space.
Five consensus functions are proposed in this work to build a consensus solution from the clustering patterns. Approach 1 simply merges any intersecting clustering patterns. Approach 2 can either merge or split intersecting patterns based on a proposed measure, called the intersection ratio. Approach 3 differs from the previous approaches by searching for the best similar pattern before making a merge/split decision and, in addition, it uses the average intersection ratio. While approach 3 works sequentially on the clustering patterns, approach 4 uses a similarity matrix of intersection ratios to search for the best merge/split
+ | |||
+ | **Keywords**: | ||
+ | |||
+ | |||
+ | |||
+ | ==== PhD Thesis Defense - Romaric Pighetti ==== | ||
+ | |||
+ | '' | ||
+ | |||
+ | === Abstract === | ||
+ | |||
+ | **Title: | ||
+ | |||
Given the ever-growing amount of visual content available on the Internet, the need for systems able to search through this content has grown.
Content-based image retrieval systems have been developed to address this need.
But with the growing size of the databases, new challenges arise.
In this thesis, the fine-grained classification problem is studied in particular.
It consists in separating images that are visually rather similar but represent different concepts, and grouping images that are visually different but represent the same concept.
It is first shown that existing techniques, and in particular support vector machines, which are among the best image classification techniques, have some difficulties in solving this problem.
They often lack exploration in their process.
Evolutionary algorithms are then considered for solving the problem, for their balance between exploration and exploitation.
But their performance alone is not good enough either.
Finally, a hybrid system combining an evolutionary algorithm and a support vector machine is proposed.
This system uses the evolutionary algorithm to iteratively feed the support vector machine with training samples.
The system is successfully evaluated in experiments conducted on Caltech-256, a database of about 30,000 images split into 256 categories.
+ | |||
+ | **Keywords**: | ||
+ | |||
+ | |||
+ | |||
+ | ==== PhD Thesis Defense - Zide Meng ==== | ||
+ | |||
+ | '' | ||
+ | |||
+ | === Abstract === | ||
+ | **Title: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the Web** | ||
+ | |||
We propose an approach to detect topics, overlapping communities of interest, expertise, trends and activities in user-generated content sites, and in particular in question-answering forums such as StackOverflow. We first describe QASM (Question & Answer Social Media), a system based on social network analysis to manage the two main resources of question-answering sites: users and content. We also introduce the QASM vocabulary, used to formalize both the level of interest and the expertise of users on topics. We then propose an efficient approach to detect communities of interest. It relies on another method to enrich questions with a more general tag when needed. We compare three detection methods on a dataset extracted from the popular Q&A site StackOverflow. Our method, based on topic modeling and user membership assignment, is shown to be much simpler and faster while preserving the quality of the detection. We then propose an additional method to automatically generate a label for a detected topic by analyzing the meaning and links of its bag of words. We conduct a user study to compare different algorithms for choosing the label. Finally, we extend our probabilistic graphical model to jointly model topics, expertise, activities and trends. We performed experiments with real-world data to confirm the effectiveness of our joint model, studying users' behaviors and topic dynamics.

+ | |||
+ | ==== PhD Thesis Defense - Papa Fary Diallo ==== | ||
+ | |||
+ | '' | ||
+ | |||
+ | === Abstract === | ||
+ | **Title: Sociocultural and Temporal Aspects in Ontologies dedicated to Virtual Communities** | ||
+ | |||
+ | **Keywords: Semantic web, Social web, Ontologies, Virtual Communities, | ||
+ | |||
This thesis is set in a research effort that aims to model sociocultural and temporal aspects so as to allow Senegalese communities to share and co-construct their sociocultural knowledge. Indeed, with globalization it is very common to meet African, and particularly Senegalese, young people who know more about the geography of the West than about their own country. Thus, to refresh the memory of our fellow citizens and revive the many stories that accompany the creation and daily life of the different Senegalese territories,

Our proposals are based on Social Web and Semantic Web technologies. Indeed, the Social Web provides a framework in which value is created by the aggregation of many individual user contributions, which makes corpus co-construction easier. The Semantic Web makes it possible to find, combine and share resources, not only between humans but also between machines. The combination of these two technologies enables Senegalese communities to share and co-construct their cultural heritage in a collaborative and semantic environment.

Our contributions are to (i) propose ontologies to annotate sociocultural resources and (ii) provide a framework for communities to share and co-construct their knowledge. Ontologies are the backbone of the Semantic Web: they characterize a domain by describing its basic concepts and the relations between them. We have thus defined two ontologies: 1) a sociocultural ontology based on cultural-historical activity theory and 2) a temporal ontology used to temporally annotate sociocultural resources. We also propose a virtual community, called a cultural knowledge-building community, which is an adaptation of the knowledge-building community to the cultural field.

====== 2015 ======
===== Defenses =====
- | |||