Dr Luca M. Ghiringhelli leads the group “Big-Data analytics for Materials Science” in the Novel Materials Discovery (NOMAD) Laboratory at the Fritz Haber Institute of the Max Planck Society in Berlin. He previously led the group “Ab initio statistical mechanics of cluster catalysis and corrosion” in the theory group at the same institute. His background is in computational statistical mechanics and electronic structure methods, applied to the evaluation of thermodynamic and kinetic properties of bulk materials, surfaces, and nano-clusters. Within the NOMAD Lab, he leads the development and application of methods based on compressed sensing, symbolic regression, subgroup discovery, and deep learning to the modelling of big data in materials science. His focus is on methods that yield interpretable models and can cope with “small data” for training. He also co-led the development of the hierarchical and extensible metadata infrastructure for the NOMAD Lab. Since January 2018, he has co-led the Psi-k working group on “High-throughput screening and data analytics”.
Ontologies in Computational Materials Science: The NOMAD experience
With the tremendous increase in the amount of data in materials science, new ways to store and annotate data are necessary to fulfill the FAIR principles and to do efficient, good, and new science. Consequently, ontologies have attracted increasing interest, as they allow not only storing and annotating data but also linking them semantically, even across domains. The Novel Materials Discovery (NOMAD) Repository is the largest database in materials science. Its data are provided in a normalized, source-independent form in the NOMAD Archive, which uses the NOMAD Metainfo [L. M. Ghiringhelli et al., npj Comput. Mater. 3, 46 (2017)] as its metadata schema. The NOMAD Metainfo includes a number of relations between concepts and therefore already goes beyond a simple metadata schema. We have converted it to an ontology and, to enrich its semantics, extended it on the basis of the European Materials and Modeling Ontology (EMMO).
Furthermore, within the NOMAD ecosystem, we have created a collection of ontologies that describes materials structures and properties in more general semantic terms. We demonstrate how this collection can connect multiple sources of knowledge and enable semantic searches.