Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/48428
Title: Predicting the Tendency Toward Open Science in Flemish Research Projects
Authors: PHAM, Hoàng Son 
ALI ELDIN, Amr 
Issue Date: 2025
Publisher: Zenodo
Source: Open Science Network Day 2025, Brussels, 2025, May 22
Abstract: This paper presents work in progress on a novel machine learning-based approach to assess and predict the tendency of research projects to support open access (OA), as one aspect of Open Science support.

INTRODUCTION: In this work, a project's tendency toward OA support is represented by the percentage of open-access articles expected as outcomes of that project. OA support is expected to be influenced by various project-related factors, such as the open-access journal articles authored by project participants (researchers or organizations), the funding source, the associated research disciplines, and the interdisciplinarity of the project. By analyzing these key indicators (publication practices, funding sources, research disciplines, and interdisciplinarity), we develop predictive models that identify a project's open-access support level.

APPROACH 1. Apply Predictive Machine Learning. Step 1: Develop a feature creation algorithm to generate features for regression models. Step 2: Train a machine learning model to predict open-access support.

APPROACH 2. Apply Large Language Model. LLMs such as ChatGPT and LLaMA were applied to predict the OA support of a research project. Given the project abstract, disciplines, funder, etc., we asked the LLM how likely it is that the resulting publications will be open access. The project information supplied to the LLM consists of:
• Abstract
• Disciplines
• Interdisciplinarity
• Author names
• OA of authors' publications
• Organization name
• OA of organization's publications
• Funder

RESULTS: Llama 3 model performance: precision 76%, recall 83%, F1-score 78%, accuracy 83%.
[Poster figures not reproduced here. Surviving labels: ML model performance; Funder, Organization, Open-access Percentage; LLM results; LLM Prompt Template; OA score; Prompt Template; Response Example.]

CONCLUSION: We developed and evaluated machine learning and large language models to predict the open access (OA) support level of research projects. Both approaches demonstrated strong performance, achieving an accuracy of 83%-85% on our evaluation dataset. These results highlight the potential of AI-driven methods to support open science monitoring and decision-making. In future work, we aim to enhance model interpretability and generalizability by incorporating more diverse datasets, expanding feature sets (e.g., author OA history, funding policies), and applying the models across different funding agencies and research domains.

Acknowledgment: This study was supported by The Expertise Center for
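
The following Python sketch illustrates the two ideas described in the abstract: deriving OA-percentage features from project information, and filling a prompt with that information for an LLM. It is a minimal illustration and not the authors' implementation. The `Project` record, all field and function names, the use of the number of distinct disciplines as an interdisciplinarity proxy, the choice to return 0.0 for an empty publication history, and the prompt wording (apart from the question quoted in the abstract) are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical project record; field names are illustrative only."""
    abstract: str
    disciplines: list[str]
    funder: str
    organization: str
    authors: list[str]
    # One flag per earlier publication: True if it was open access.
    author_pubs_oa: list[bool] = field(default_factory=list)
    org_pubs_oa: list[bool] = field(default_factory=list)


def oa_percentage(flags: list[bool]) -> float:
    """Percentage of open-access items in a list of OA flags.

    Assumption: an empty history yields 0.0 rather than a missing value.
    """
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


def build_features(project: Project) -> dict:
    """Turn project information into features for a regression model."""
    return {
        "author_oa_pct": oa_percentage(project.author_pubs_oa),
        "org_oa_pct": oa_percentage(project.org_pubs_oa),
        "funder": project.funder,
        # Assumption: count of distinct disciplines as a crude
        # stand-in for interdisciplinarity.
        "n_disciplines": float(len(set(project.disciplines))),
    }


def build_prompt(project: Project) -> str:
    """Fill an illustrative prompt with the project information."""
    feats = build_features(project)
    lines = [
        "Project information",
        f"Abstract: {project.abstract}",
        f"Disciplines: {', '.join(project.disciplines)}",
        f"Interdisciplinarity (distinct disciplines): {feats['n_disciplines']:.0f}",
        f"Author names: {', '.join(project.authors)}",
        f"OA of authors' publications: {feats['author_oa_pct']:.1f}%",
        f"Organization name: {project.organization}",
        f"OA of organization's publications: {feats['org_oa_pct']:.1f}%",
        f"Funder: {project.funder}",
        "",
        "How likely is it that the resulting publications "
        "will be open access?",
    ]
    return "\n".join(lines)
```

The dictionary returned by `build_features` could be passed to any regression library after encoding the categorical `funder` field, and the string returned by `build_prompt` could be sent to whichever LLM client is in use; neither step is shown because the source does not specify them.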
Document URI: http://hdl.handle.net/1942/48428
DOI: 10.5281/zenodo.15727141
Category: C2
Type: Conference Material
Appears in Collections: Research publications

Files in This Item:
File: UHasselt_poster_OpenScienceNetworkDay2025.pdf
Description: Conference material
Size: 704.88 kB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.