Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/48772

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Luyten, Kris | - |
| dc.contributor.advisor | Rovelo Ruiz, Gustavo | - |
| dc.contributor.author | EERLINGS, Gilles | - |
| dc.contributor.author | ZOOMERS, Brent | - |
| dc.contributor.author | LIESENBORGS, Jori | - |
| dc.contributor.author | ROVELO RUIZ, Gustavo | - |
| dc.contributor.author | LUYTEN, Kris | - |
| dc.date.accessioned | 2026-03-18T13:08:30Z | - |
| dc.date.available | 2026-03-18T13:08:30Z | - |
| dc.date.issued | 2026 | - |
| dc.date.submitted | 2026-03-09T10:51:42Z | - |
| dc.identifier.citation | Proceedingsbook The Fourteenth International Conference on Learning Representations, OpenReview.net, (Art N° 18990) | - |
| dc.identifier.uri | http://hdl.handle.net/1942/48772 | - |
| dc.description.abstract | We propose DIVERSE, a framework for systematically exploring the Rashomon set of deep neural networks, the collection of models that match a reference model’s accuracy while differing in their predictive behavior. DIVERSE augments a pretrained model with Feature-wise Linear Modulation (FiLM) layers and uses Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to search a latent modulation space, generating diverse model variants without retraining or gradient access. Across MNIST, PneumoniaMNIST, and CIFAR-10, DIVERSE uncovers multiple high-performing yet functionally distinct models. Our experiments show that DIVERSE offers a competitive and efficient exploration of the Rashomon set, making it feasible to construct diverse sets that maintain robustness and performance while supporting well-balanced model multiplicity. While retraining remains the baseline to generate Rashomon sets, DIVERSE achieves comparable diversity at reduced computational cost. | - |
| dc.description.sponsorship | This work was funded by the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme, R-13509. This research was partly funded by the Special Research Fund (BOF) of Hasselt University (R-14436) and the FWO fellowship grant (1SHDZ24N). The resources and services used in this work were partly provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government. | - |
| dc.language.iso | en | - |
| dc.publisher | OpenReview.net | - |
| dc.rights | CC BY 4.0 | - |
| dc.subject.other | Rashomon Set | - |
| dc.subject.other | Rashomon Effect | - |
| dc.subject.other | Feature-wise Linear Modulation (FiLM) | - |
| dc.subject.other | CMA-ES | - |
| dc.subject.other | Model Multiplicity | - |
| dc.subject.other | Predictive Multiplicity | - |
| dc.subject.other | Neural Network | - |
| dc.subject.other | Machine Learning | - |
| dc.subject.other | Deep Learning | - |
| dc.subject.other | Supervised Learning | - |
| dc.subject.other | Artificial Intelligence | - |
| dc.title | DIVERSE: Disagreement-Inducing Vector Evolution for Rashomon Set Exploration | - |
| dc.type | Proceedings Paper | - |
| local.bibliographicCitation.conferencedate | 2026, April 23-27 | - |
| local.bibliographicCitation.conferencename | The Fourteenth International Conference on Learning Representations (ICLR) | - |
| local.bibliographicCitation.conferenceplace | Rio de Janeiro, Brazil | - |
| local.format.pages | 20 | - |
| local.bibliographicCitation.jcat | C2 | - |
| local.publisher.place | Online (OpenReview.net) | - |
| dc.relation.references | Akimoto, Y., & Hansen, N. (2019). Diagonal acceleration for covariance matrix adaptation evolution strategies. CoRR, abs/1905.05885. http://arxiv.org/abs/1905.05885 Auger, A., Hansen, N., Pérez Zerpa, J., Ros, R., & Schoenauer, M. (2009). Experimental comparisons of derivative free optimization algorithms. 3–15. https://doi.org/10.1007/978-3-642-02011-7_3 Birnbaum, S., Kuleshov, V., Enam, Z., Koh, P. W. W., & Ermon, S. (2019). Temporal FiLM: Capturing long-range sequence dependencies with feature-wise modulations. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, & R. Garnett (Eds.), Advances in neural information processing systems (Vol. 32). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2019/file/2afc4dfb14e55c6face649a1d0c1025b-Paper.pdf Black, E., Raghavan, M., & Barocas, S. (2022). Model multiplicity: Opportunities, concerns, and solutions. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, 850–863. https://doi.org/10.1145/3531146.3533149 Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199–231. https://doi.org/10.1214/ss/1009213726 Dai, G., Ravishankar, P., Yuan, R., Neill, D. B., & Black, E. (2025). Be intentional about fairness!: Fairness, size, and multiplicity in the rashomon set. arXiv. https://doi.org/10.48550/ARXIV.2501.15634 Devroye, L., Györfi, L., & Lugosi, G. (1996). A probabilistic theory of pattern recognition. Springer New York. https://doi.org/10.1007/978-1-4612-0711-5 Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. https://arxiv.org/abs/2010.11929 Eerlings, G., Vanbrabant, S., Liesenborgs, J., Rovelo Ruiz, G., Vanacken, D., & Luyten, K. (2025). AI-spectra: A visual dashboard for model multiplicity to enhance informed and transparent decision-making. Engineering Interactive Computing Systems. EICS 2024 International Workshops, 55–73. Fisher, A., Rudin, C., & Dominici, F. (2019). All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously. Journal of Machine Learning Research, 20(177), 1–81. Ganesh, P. (2024). An empirical investigation into benchmarking model multiplicity for trustworthy machine learning: A case study on image classification. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 4488–4497. Hansen, N. (2006). The CMA evolution strategy: A comparing review. In J. A. Lozano, P. Larrañaga, I. Inza, & E. Bengoetxea (Eds.), Towards a new evolutionary computation: Advances in the estimation of distribution algorithms (pp. 75–102). Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-32494-1_4 Hansen, N. (2016). The CMA evolution strategy: A tutorial. CoRR, abs/1604.00772. http://arxiv.org/abs/1604.00772 Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832–844. https://doi.org/10.1109/34.709601 Hsu, H., & Calmon, F. (2022). Rashomon capacity: A metric for predictive multiplicity in classification. In A. H. Oh, A. Agarwal, D. Belgrave, & K. Cho (Eds.), Advances in neural information processing systems. https://openreview.net/forum?id=9XWHdVCynhp Hsu, H., Li, G., Hu, S., & Chen, C.-F. (2024). Dropout-based rashomon set exploration for efficient predictive multiplicity estimation. The Twelfth International Conference on Learning Representations. https://openreview.net/forum?id=Sf2A2PUXO3 Jain, S., Wang, M., Creel, K., & Wilson, A. (2025). Allocation multiplicity: Evaluating the promises of the rashomon set. FAccT ’25, 2040–2055. https://doi.org/10.1145/3715275.3732138 Krizhevsky, A. (2009). Learning multiple layers of features from tiny images. Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86. https://doi.org/10.1214/aoms/1177729694 Kuncheva, L. I., & Whitaker, C. J. (2003). Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning, 51(2), 181–207. https://doi.org/10.1023/a:1022859003006 LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. Lin, J. (1991). Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37(1), 145–151. https://doi.org/10.1109/18.61115 Marx, C., Calmon, F., & Ustun, B. (2020). Predictive multiplicity in classification. In H. D. III & A. Singh (Eds.), Proceedings of the 37th international conference on machine learning (Vol. 119, pp. 6765–6774). PMLR. https://proceedings.mlr.press/v119/marx20a.html Meyer, A. P., Kim, Y.-S., D’Antoni, L., & Albarghouthi, A. (2025). Perceptions of the fairness impacts of multiplicity in machine learning. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, CHI ’25. https://doi.org/10.1145/3706598.3713524 Omidvar, M. N., & Li, X. (2011). A comparative study of CMA-ES on large scale global optimisation. In J. Li (Ed.), AI 2010: Advances in artificial intelligence (pp. 303–312). Springer Berlin Heidelberg. Perez, E., Strub, F., De Vries, H., Dumoulin, V., & Courville, A. (2018). FiLM: Visual reasoning with a general conditioning layer (No. 1). Proceedings of the AAAI Conference on Artificial Intelligence, 32, Article 1. https://doi.org/10.1609/aaai.v32i1.11671 Pinsker, M. S. (1964). Information and information stability of random variables and processes. Holden-Day. Semenova, L., Chen, H., Parr, R., & Rudin, C. (2023). A path to simpler models starts with noise. Thirty-Seventh Conference on Neural Information Processing Systems. https://openreview.net/forum?id=Uzi22WryyX Semenova, L., Rudin, C., & Parr, R. (2022). On the existence of simpler machine learning models. 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’22, 1827–1858. https://doi.org/10.1145/3531146.3533232 Skalak, D. B. (1996). The sources of increased accuracy for two proposed boosting algorithms. AAAI Conference on Artificial Intelligence. Cover, T. M., & Thomas, J. A. (2005). Elements of information theory (pp. 13–55, 347–408). John Wiley and Sons, Ltd. https://doi.org/10.1002/047174882X Tseng, H.-Y., Lee, H.-Y., Huang, J.-B., & Yang, M.-H. (2020). Cross-domain few-shot classification via learned feature-wise transformation. International Conference on Learning Representations. https://openreview.net/forum?id=SJl5Np4tPr Turkoglu, M. O., Becker, A., Gündüz, H. A., Rezaei, M., Bischl, B., Daudt, R. C., D’Aronco, S., Wegner, J. D., & Schindler, K. (2022). FiLM-ensemble: Probabilistic deep learning via feature-wise linear modulation. Proceedings of the 36th International Conference on Neural Information Processing Systems, NeurIPS ’22. Vapnik, V. N. (1995). The nature of statistical learning theory. Springer-Verlag. Watson-Daniels, J., Parkes, D. C., & Ustun, B. (2023). Predictive multiplicity in probabilistic classification. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, AAAI’23/IAAI’23/EAAI’23. https://doi.org/10.1609/aaai.v37i9.26227 Xin, R., Zhong, C., Chen, Z., Takagi, T., Seltzer, M., & Rudin, C. (2022). Exploring the whole rashomon set of sparse decision trees. Proceedings of the 36th International Conference on Neural Information Processing Systems, NeurIPS ’22. Yang, J., Shi, R., Wei, D., Liu, Z., Zhao, L., Ke, B., Pfister, H., & Ni, B. (2023). MedMNIST v2—A large-scale lightweight benchmark for 2D and 3D biomedical image classification. Scientific Data, 10(1). https://doi.org/10.1038/s41597-022-01721-8 Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2017). Understanding deep learning requires rethinking generalization. International Conference on Learning Representations. https://openreview.net/forum?id=Sy8gdB9xx Zhong, C., Chen, Z., Liu, J., Seltzer, M., & Rudin, C. (2023). Exploring and interacting with the set of good sparse generalized additive models. Proceedings of the 37th International Conference on Neural Information Processing Systems, NeurIPS ’23. Zhou, K., Yang, Y., Qiao, Y., & Xiang, T. (2021). Domain generalization with MixStyle. International Conference on Learning Representations. https://openreview.net/forum?id=6xHJ37MVxxp | - |
| local.type.refereed | Refereed | - |
| local.type.specified | Proceedings Paper | - |
| local.bibliographicCitation.artnr | 18990 | - |
| local.type.programme | VSC | - |
| dc.identifier.url | https://openreview.net/forum?id=kQjSUHC84V | - |
| local.bibliographicCitation.btitle | Proceedingsbook The Fourteenth International Conference on Learning Representations | - |
| local.uhasselt.international | no | - |
| item.contributor | EERLINGS, Gilles | - |
| item.contributor | ZOOMERS, Brent | - |
| item.contributor | LIESENBORGS, Jori | - |
| item.contributor | ROVELO RUIZ, Gustavo | - |
| item.contributor | LUYTEN, Kris | - |
| item.fullcitation | EERLINGS, Gilles; ZOOMERS, Brent; LIESENBORGS, Jori; ROVELO RUIZ, Gustavo & LUYTEN, Kris (2026) DIVERSE: Disagreement-Inducing Vector Evolution for Rashomon Set Exploration. In: Proceedingsbook The Fourteenth International Conference on Learning Representations, OpenReview.net, (Art N° 18990). | - |
| item.fulltext | With Fulltext | - |
| item.accessRights | Open Access | - |
| Appears in Collections: | Research publications | |
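The mechanism described in the abstract, FiLM layers whose scale and shift parameters are generated from a latent vector searched with CMA-ES, can be illustrated with a minimal numpy sketch. Note this is only an illustration of the general idea, not the paper's implementation: the `decode` mapping, the layer shapes, and the population size are assumptions, and the sampling loop below is the Gaussian-sampling step of CMA-ES without its covariance and step-size adaptation (see Hansen, 2016, in the references).

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise Linear Modulation: scale and shift each channel.

    features: (batch, channels) activations from a frozen pretrained layer
    gamma, beta: (channels,) modulation parameters (broadcast over the batch)
    """
    return gamma * features + beta

rng = np.random.default_rng(0)

# Stand-in for frozen "pretrained" activations: 4 samples, 3 channels.
features = rng.normal(size=(4, 3))

def decode(z):
    # Hypothetical mapping from a latent vector z to (gamma, beta); the
    # abstract does not specify it, so we simply split z in half, centering
    # gamma at 1 and beta at 0 so z = 0 recovers the reference model.
    n = z.size // 2
    return 1.0 + z[:n], z[n:]

# One sampling generation in the spirit of CMA-ES: draw a population of
# latent candidates from a Gaussian around the current mean. Each candidate
# yields a distinct modulated model variant without retraining or gradients.
mean, sigma, pop_size = np.zeros(6), 0.5, 8
candidates = [mean + sigma * rng.normal(size=mean.size) for _ in range(pop_size)]
outputs = [film(features, *decode(z)) for z in candidates]
print(len(outputs), outputs[0].shape)  # 8 candidate variants, each (4, 3)
```

In a full run, each candidate's outputs would be scored for accuracy and disagreement against the reference model, and the best candidates would update the search distribution for the next generation.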
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| DIVERSE_reviews.txt | Proof of peer review | 13.84 kB | Text | View/Open |
| DIVERSE_published.pdf | Published version | 853.62 kB | Adobe PDF | View/Open |