Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/39389
Title: | Materials design through ensemble learning: When the average model knows best |
Authors: | VANPOUCKE, Danny E.P.; Mehrkanoon, Siamak; Bernaerts, Katrien; Van Knippenberg, O. S. J.; Hermans, K |
Issue Date: | 2022 |
Publisher: | |
Source: | Belgian Physical Society, Tabloo, Dessel, 18 May 2022 |
Abstract: | Machine Learning plays an ever more important role in modern materials design and discovery, presenting a steady flow of new discoveries. Unfortunately, these achievements are generally rooted in large data sets. Although such big data sets are becoming more commonplace, they are generally not representative of the day-to-day work performed by materials researchers, where large numbers of samples are often infeasible due to production cost or time, or the availability of raw materials. In this work, we investigate the impact of very small data sets (<25 samples) on model quality and show how, even for these data sets, high-quality models can be constructed. Machine Learning in small data sets: Due to the success of Machine Learning within the context of large data sets, there is a natural interest in applying these methods to small data sets and reaping their rewards there as well. The use of artificial intelligence and Machine Learning in these cases is generally aimed at improved design of experiments for materials optimisation, often in combination with robotic automation. Within this context the active learning approach comes naturally,[1] as it starts from a small data (sub)set, which is incrementally enlarged by adding the most useful data points from the master data set. Within the context of design of experiments, these would be newly created samples. Alternatively, several authors have focussed on (small) deep neural networks in combination with small data sets (50 to several hundred samples), showing models of reasonable quality.[2] These examples show that, even in the context of small data sets, Machine Learning can be successful for materials Figure 1: Modelling small data sets. (a) Schematic representation of the problem. (b) and (c) Heatmaps of ensembles of 1000 model instances for a linear and a non-linear data set of 20 data points.[3] |
Other: | Abstract to BPS 2022 conference; Oral presentation. |
Keywords: | polyester; poly(ethylene imine); structure-property relationships; machine learning |
Document URI: | http://hdl.handle.net/1942/39389 |
Category: | C2 |
Type: | Conference Material |
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
BPS2022_DannyEPVanpoucke.pdf (Restricted Access) | Conference material | 168.8 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.