Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/45193
Title: Getting the Data in Shape for Your Process Mining Analysis: An In-Depth Analysis of the Pre-Analysis Stage
Authors: PRADHAN, Shameer 
JANS, Mieke 
MARTIN, Niels 
Issue Date: 2025
Source: ACM Computing Surveys, 57(6), (Art N° 159)
Abstract: Process mining enables organizations to analyze the data stored in their information systems and derive insights regarding their business processes. However, raw data needs to be converted into a format that can be fed into process mining algorithms. Various pre-analysis activities can be performed on the raw data, such as imperfection removal or granularity level change. Although pre-analysis activities play a crucial role in process mining, there is currently a limited overview available regarding their scope and the extent of their examination. This study presents a systematic literature review of the pre-analysis activities in process mining projects. To better understand this stage and its current state of research, we explore which activities constitute the pre-analysis stage, their goals, the applied research methodologies, the proposed research outcomes, and the data used to evaluate the research outcomes. We identify 15 pre-analysis activities and concepts, e.g., data extraction, generation, and cleaning. We also discover that design science research is the primary methodology and that methods are the primary research outcome in previous studies. We also find that the proposed outcomes have mostly been evaluated using only real-life data. This study reveals that research on pre-analysis is a growing field of interest in process mining.
Keywords: CCS Concepts: Applied computing → Business process monitoring. Additional Key Words and Phrases: process mining; process mining pre-analysis; data preprocessing; event log
Document URI: http://hdl.handle.net/1942/45193
ISSN: 0360-0300
e-ISSN: 1557-7341
DOI: 10.1145/3712587
ISI #: 001455857200003
Datasets of the publication: https://zenodo.org/records/14623497
Rights: Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Category: A1
Type: Journal Contribution
Appears in Collections: Research publications

Files in This Item:
3712587.pdf (Until 2025-08-20): Peer-reviewed author version, 671.5 kB, Adobe PDF
3712587.pdf (Restricted Access): Published version, 1.08 MB, Adobe PDF