fmichel

Franck Michel

I am a CNRS research engineer involved in the integration of heterogeneous data and their publication and sharing as Knowledge Graphs on the Web, using knowledge engineering, the Semantic Web and Linked Open Data technologies.

I am a member of the I3S laboratory and of Inria's Wimmics team.

Research activities

Knowledge Graphs, Integration of Heterogeneous Data

I am involved in research activities meant to enable the integration of heterogeneous data based on a knowledge engineering approach, as well as the sharing and reuse of these data. My work addresses several research questions:

How to build Knowledge Graphs and foster data reuse by complying with the FAIR principles?
How to overcome the structural and semantic heterogeneity of data in order to reconcile and make sense of large datasets distributed at Web scale?
How to enable the Web-scale discovery and consumption of data?
How to make sense of large scientific corpora and support Open Science by allowing researchers to explore and visualize information extracted from research articles?
How to “talk to the data” by translating natural language questions into queries in structured query languages such as SPARQL?
How to enrich existing knowledge bases with knowledge graphs extracted from texts?

These research questions are applied to various domains, in particular agronomy, agriculture and biodiversity, but also ancient literature and music.

Research projects and communities

Here are some projects I am or was involved in:

ISSA (Collex-Persée): Semantic Indexing of a Scientific archive and Associated Services
D2KAB (ANR): From Data to Knowledge in Agronomy, Agriculture and Biodiversity
DeKaloG (ANR): Decentralized Knowledge Graphs
SPARQL micro-services
TAXREF-LD: the French Linked Data Taxonomic Registry
DBpedia French chapter
Covid on the Web

Here are the community projects I'm currently involved in:

Bioschemas: Schema.org for Life Sciences
Knowledge Graph Construction W3C Community Group

Publications

Complete list of publications and communications: HAL CV.

Also find me on ResearchGate.

Contact

Address:
Université Côte d’Azur, CNRS, Inria - I3S, UMR 7271
930 route des Colles - Bât. Les Templiers
BP 145 - 06903 Sophia Antipolis CEDEX - France

Email: franck [dot] michel [at] inria [dot] fr

Find me on: ResearchGate, Github, LinkedIn, Twitter, SlideShare, Flickr, Instagram

Selected talks

Pay Attention: A Call to Regulate the Attention Market & Prevent Algorithmic Emotional Governance

Fabien Gandon, Franck Michel. Interview for The Creative Process, Feb. 2025.

Listen on Spotify

Listen on Apple Podcast

Recherche, exploration et bibliométrie dans une archive scientifique ouverte

Given 2024-09-10.

Open Science, reproducible research, and the citation of articles, code and data alike

Given 2024-04-04.

ISSA: Generic Knowledge Model and Visualization tools to Help Scientists Make Sense of Archive

Wimmics Monthly Seminar 2022-12-15 / ISWC 2022 resource track replay

Covid-on-the-Web: Knowledge Graph and Services to Advance COVID-19 Research

Presented at the ISWC 2020 conference, resource track.

Bioschemas: Marking up biodiversity websites for data discovery & integration

TDWG webinar series, 2021-03.

Integration of biodiversity data from web pages to knowledge graphs, a computer scientist view point

DIADE research unit seminars (http://diade.ird.fr), 2021-04-13.

Open Data: creation/publication of open datasets

Here are some open datasets whose creation and/or publication I took part in.

ISSA Agritrop Dataset, Semantic index of the Agritrop open scientific archive. github DOI

TAXREF-LD, Linked Data knowledge graph of the French taxonomic register. 2017-2024. github sparql article DOI

Covid-on-the-Web. Knowledge graph produced by processing the scholarly articles of the COVID-19 Open Research Dataset (CORD-19). 2020. github sparql article DOI.

WASABI RDF Knowledge Graph. An RDF representation of the WASABI corpus of songs enriched with metadata extracted from music databases on the Web, and resulting from the processing of song lyrics and from audio analysis. 2020. github sparql article DOI

WeKG-MF, Weather Knowledge Graph of Météo France Meteorological Observations. 2022. github sparql article DOI

WheatGenomicsSLKG, Wheat Genomics Scientific Literature Knowledge Graph. 2023. github sparql DOI

Wheat Observations Knowledge Graph. Soft wheat phenotype observations data including the result of observation campaigns carried out on micro-parcels located in France between 1999 and 2015. Relies on the Plant Phenotype Experiment Ontology (PPEO) and the CO_321 Wheat Crop Ontology. 2024. DOI

Number of public photos uploaded to Flickr from 2004 to 2021. This dataset reports the number of photos uploaded to Flickr every day, hour by hour (CET), from 2004 to 2021. Only public photos are considered; private photos and other types of material (e.g. videos) are not counted. DOI github

Software Development

ISSA visualization and search web application: Franck MICHEL, Youssef Mekouar (2022). Visualization: github DOI. Backend: github DOI

ISSA Processing Pipeline: Anna Bobasheva, Franck MICHEL (2022). github DOI

WheatGenomicsSLKG visualization and search web application: Franck MICHEL, Youssef Mekouar (2022). Visualization: github DOI. Backend: github DOI

SPARQL Micro-Services: Querying Web APIs with SPARQL. Franck Michel. 2018. github

Morph-xR2RML: MongoDB-to-RDF translation and SPARQL rewriting: Franck Michel, Freddy Priyatna. Implementation of the xR2RML mapping language for MongoDB databases. 2017. github DOI

The VO Administration and operations PORtal (VAPOR). Franck Michel, Flavien Forestier. 2014. web DOI

EGI Virtual Organisations Support Tools. Franck Michel. 2013. web DOI

NeuroLOG platform. Alban Gaignard, Franck Michel, Johan Montagnat, Javier Rojas Balderrama, Farooq Ahmad, Bacem Wali. 2008. web

Background and Position

CNRS Research engineer (IR), Université Côte d'Azur, CNRS, Inria, I3S laboratory, since Jan. 2011.
PhD in Computer Science at Université Côte d'Azur, March 2017. Manuscript
Expert software engineer in IRISA team VisAGeS, May 2008 to Dec. 2010
Expert telecom engineer, company Capgemini Telecom, Media Networks, 1999 to 2008
Development engineer, company Nortel Networks France, 1995 to 1999
Engineering degree in Computer Science, INSA de Rennes, 1995

Organizing and Program Committees

I was/am a member of the program committees for the following conferences and/or workshops:

SEMANTiCS 2024, 20th International Conference on Semantic Systems
ECAI 2024, European Conference on AI
ESWC 2024, The Extended Semantic Web Conference
The Web Conference 2024, The ACM Web Conference 2024
SEMANTiCS 2023, 19th International Conference on Semantic Systems
ESWC 2022, The Extended Semantic Web Conference
KGCW 2022, Third International Workshop on Knowledge Graph Construction
SCG 2021, First workshop on Squaring the circle on graphs
KGCW 2021, Second International Workshop on Knowledge Graph Construction
ESWC 2021, The Extended Semantic Web Conference
ICCS 2020, The International Conference on Computational Science
IJCAI 2020, 29th International Joint Conference on Artificial Intelligence
SEMANTiCS 2019, 15th International Conference on Semantic Systems
Knowledge Graph Building (KBD), workshop of the Extended Semantic Web Conference 2019 (ESWC)
Hypermedia Multi-Agent Systems (HyperAgents 2019), workshop of the Web Conference 2019
EKAW 2018, 21st International Conference on Knowledge Engineering and Knowledge Management
SEMANTiCS 2018, 14th International Conference on Semantic Systems
ISWC 2018, 17th International Semantic Web Conference
ICCS 2018, 23rd International Conference on Conceptual Structures
WWW 2018, The Web Conference 2018
ISWC 2017, 16th International Semantic Web Conference
SEMANTiCS 2017, 13th International Conference on Semantic Systems
ICCS 2016, 22nd International Conference on Conceptual Structures
SI&IA 2015, Systèmes d'Information et Intelligence Artificielle 2015

I was/am a member of the organizing committees for the following conferences and/or workshops:

Ontology Alignment Evaluation Initiative 2022: ontology provider for the complex alignment and biodiversity tracks
Ontology Alignment Evaluation Initiative 2021: ontology provider for the complex alignment and biodiversity tracks
TDWG 2021 Symposium on Connecting biodiversity data with knowledge graphs
S4Biodiv: 3rd International Workshop on Semantics for Biodiversity
JDEV2020: Journées CNRS du développement logiciels
APSEM2019: écosystèmes pour la science ouverte et recherche par les données
APSEM2018: Apprentissage et sémantique
JDEV2017: Journées CNRS du développement logiciels

Wild ideas

Large Language Models as the components of a conscious AI?

AIs, and LLMs in particular, are not conscious. They are reactive systems: they respond to an input by producing an output. By contrast, consciousness can be defined as the ability to form thoughts for oneself, without the need for an external stimulus.

What if we fine-tuned several LLMs to collaborate, following the model of the human psyche?

The “conscious” model would be fine-tuned to remain in the realm of values, morality, norms, and logical thinking. This is the one that would interact with the “outside” world and provide material to the “unconscious” model.
The “unconscious” model (or “subconscious”, depending on the definitions) would be fine-tuned to express drives and desires, regardless of any norm or value system.
The “preconscious” model would be fine-tuned to filter or rewrite the outputs of the “unconscious” so that only acceptable outputs make their way to the “conscious”, while also providing it with material in a feedback loop.

This way, we could imagine designing some sort of conscious AI system. But this raises multiple questions. How would it be bootstrapped? Individually, each of the three LLMs remains a question-answering system: it does not take the initiative to produce an output. So how would the process start, and once started, how would the flow be controlled? If we skip the conscious model (to try to simulate sleep) and let the unconscious model talk to itself, could it come up with dreams?
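
The loop between the three models could be sketched roughly as follows. This is only a toy illustration under strong assumptions: the three "models" are plain Python functions standing in for fine-tuned LLMs, all names (run_psyche, dream, toy_conscious, etc.) are hypothetical, and the bootstrapping question is sidestepped by injecting an external seed stimulus.

```python
from typing import Callable, List

# A "model" is anything that maps a text input to a text output.
Model = Callable[[str], str]


def run_psyche(conscious: Model, preconscious: Model, unconscious: Model,
               stimulus: str, steps: int = 3) -> List[str]:
    """Run a fixed number of rounds of the conscious/unconscious loop."""
    trace = []
    thought = conscious(stimulus)      # the conscious model faces the outside world
    trace.append(thought)
    for _ in range(steps):
        raw = unconscious(thought)     # unfiltered drives and desires
        filtered = preconscious(raw)   # only acceptable material gets through
        thought = conscious(filtered)  # the conscious model reacts, closing the loop
        trace.append(thought)
    return trace


def dream(unconscious: Model, seed: str, steps: int = 3) -> List[str]:
    """'Sleep' mode: skip the conscious model, let the unconscious talk to itself."""
    trace = []
    current = seed
    for _ in range(steps):
        current = unconscious(current)
        trace.append(current)
    return trace


# Stand-in models, for illustration only (real ones would be fine-tuned LLMs).
def toy_conscious(text: str) -> str:
    return f"reasoned({text})"


def toy_unconscious(text: str) -> str:
    return f"desire({text})"


def toy_preconscious(text: str) -> str:
    return text.replace("desire", "acceptable")
```

Note that the fixed number of steps is an arbitrary way to control the flow; the open questions above (what starts the loop, and what stops it) remain untouched by this sketch.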

Frugality by design: less is good

Many everyday tools consume energy and/or resources even when that is not the user's intent. These tools must be redesigned with a deliberate bias towards frugality.

This can be implemented as a default behavior, as a nudge, etc. Examples:

Mixer taps deliver heated water by default: In the middle position, most mixer taps mix half ambient-temperature water and half heated water, although users may not need heated water. They should be redesigned with a middle position that only delivers ambient-temperature water, so that getting heated water requires a deliberate action from the user.
Public fountains deliver cooled water by default: Fountains in public spaces usually deliver cooled water by default, although users may not want that. They should be redesigned to deliver ambient-temperature water by default, so that getting cooled water requires a deliberate action from the user.
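
The mixer tap example can be made concrete with a small sketch. It assumes a simplified tap whose lever position ranges from -1 (fully towards the cold side) to 1 (fully towards the hot side), with 0 as the middle position. The linear mapping and the function names are assumptions for illustration, not a description of any actual tap.

```python
def _check(lever: float) -> None:
    """Reject lever positions outside the assumed [-1, 1] range."""
    if not -1.0 <= lever <= 1.0:
        raise ValueError("lever position must be between -1 and 1")


def hot_fraction_conventional(lever: float) -> float:
    """Conventional tap: the middle position already mixes in 50% heated water."""
    _check(lever)
    return (lever + 1.0) / 2.0


def hot_fraction_frugal(lever: float) -> float:
    """Frugal tap: no heated water unless the lever is moved past the middle."""
    _check(lever)
    return max(0.0, lever)
```

With the lever left in the middle, the conventional design draws half of its flow from the water heater, while the frugal design draws none. Full hot water is still available at the end of the lever's travel in both cases.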
