Please use this identifier to cite or link to this item:
http://hdl.handle.net/1942/43004
Title: | Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning | Authors: | Reymond, Mathieu; Hayes, Conor F.; WILLEM, Lander; Radulescu, Roxana; ABRAMS, Steven; Roijers, Diederik M.; Howley, Enda; Mannion, Patrick; HENS, Niel; Nowe, Ann; LIBIN, Pieter |
Issue Date: | 2024 | Publisher: | PERGAMON-ELSEVIER SCIENCE LTD | Source: | EXPERT SYSTEMS WITH APPLICATIONS, 249 (Art N° 123686) | Abstract: | Infectious disease outbreaks can have a disruptive impact on public health and societal processes. As decision-making in the context of epidemic mitigation is multi-dimensional and hence complex, reinforcement learning in combination with complex epidemic models provides a methodology to design refined prevention strategies. Current research focuses on optimizing policies with respect to a single objective, such as the pathogen's attack rate. However, as the mitigation of epidemics involves distinct, and possibly conflicting, criteria (e.g., mortality, morbidity, economic cost, well-being), a multi-objective decision approach is warranted to obtain balanced policies. To enhance future decision-making, we propose a deep multi-objective reinforcement learning approach, building upon a state-of-the-art algorithm called Pareto Conditioned Networks (PCN), to obtain a set of solutions for distinct outcomes of the decision problem. We consider different deconfinement strategies after the first Belgian lockdown within the COVID-19 pandemic and aim to minimize both COVID-19 cases (i.e., infections and hospitalizations) and the societal burden induced by the mitigation measures. As such, we connect a multi-objective Markov decision process with a stochastic compartment model designed to approximate the Belgian COVID-19 waves and explore reactive strategies. As these social mitigation measures are implemented in a continuous action space that modulates the contact matrix of the age-structured epidemic model, we extend PCN to this setting. We evaluate the solution set that PCN returns, and observe that it explored the whole range of possible social restrictions, leading to high-quality trade-offs, as it captured the problem dynamics.
In this work, we demonstrate that multi-objective reinforcement learning adds value to epidemiological modeling and provides essential insights to balance mitigation policies. | Notes: | Reymond, M (corresponding author), Vrije Univ Brussel, Brussels, Belgium. mathieu.reymond@vub.be; c.hayes13@nuigalway.ie; lander.willem@uantwerpen.be; roxana@ai.vub.ac.be; steven.abrams@uantwerpen.be; diederik.roijers@vub.be; enda.howley@nuigalway.ie; patrick.mannion@nuigalway.ie; niel.hens@uhasselt.be; ann.nowe@ai.vub.ac.be; pieter.libin@vub.be |
Keywords: | Multi-objective reinforcement learning; Epidemic control; COVID-19 epidemic models | Document URI: | http://hdl.handle.net/1942/43004 | ISSN: | 0957-4174 | e-ISSN: | 1873-6793 | DOI: | 10.1016/j.eswa.2024.123686 | ISI #: | 001224116400001 | Rights: | © 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). | Category: | A1 | Type: | Journal Contribution |
Appears in Collections: | Research publications |
Files in This Item:
File | Description | Size | Format
---|---|---|---
Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning.pdf | Published version | 915.9 kB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
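The abstract describes coupling a multi-objective Markov decision process with an age-structured stochastic compartment model, where a continuous action modulates social contacts and the agent receives a reward vector (epidemic burden vs. societal burden) rather than a scalar. The sketch below is a deliberately minimal, hypothetical illustration of that vector-reward setup; the class name, dynamics, and parameters are invented for exposition and are not the authors' model (which is age-structured and stochastic, unlike this deterministic toy SIR).

```python
import numpy as np


class ToySIREnv:
    """Toy vector-reward epidemic environment (illustrative only).

    State: (S, I, R) population fractions. The continuous action
    a in [0, 1] scales social contacts (1.0 = no restrictions),
    loosely mirroring how actions modulate a contact matrix in the
    multi-objective MDP described in the abstract.
    """

    def __init__(self, beta=0.3, gamma=0.1):
        self.beta = beta      # baseline transmission rate
        self.gamma = gamma    # recovery rate
        self.reset()

    def reset(self):
        self.S, self.I, self.R = 0.99, 0.01, 0.0
        return np.array([self.S, self.I, self.R])

    def step(self, action):
        a = float(np.clip(action, 0.0, 1.0))
        new_inf = self.beta * a * self.S * self.I  # contacts scaled by action
        new_rec = self.gamma * self.I
        self.S -= new_inf
        self.I += new_inf - new_rec
        self.R += new_rec
        # Vector reward: (epidemic burden, societal burden), both as
        # negative costs -- a multi-objective agent like PCN would seek
        # Pareto-optimal trade-offs between these two components.
        reward = np.array([-new_inf, -(1.0 - a)])
        return np.array([self.S, self.I, self.R]), reward
```

Stepping with `action=1.0` (fully open) incurs a larger infection cost and zero societal cost, while `action=0.0` (full lockdown) does the opposite; a Pareto front arises from policies interpolating between these extremes over a full episode.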