An Activity Instance Based Hierarchical Framework for Event Abstraction

Chiao-Yun Li, Sebastiaan J. van Zelst and Wil van der Aalst


Process mining allows one to analyze and extract knowledge from event data, i.e., records of process executions stored in information systems. Most process mining techniques are directly applied to the data as recorded in the system. Applying automated process discovery techniques, i.e., a core process mining technology, directly on such data yields complex process models describing millions of different execution paths. Other techniques applied to such discovered process models and system-level data, e.g., …

An Intent-Based Natural Language Interface for Querying Process Execution Data

Meriana Kobeissi, Nour Assy, Walid Gaaloul, Bruno Defude and Bassem Haidar


Process mining techniques allow organizations to discover, monitor and improve their as-is processes by analyzing the process execution data, also known as event data, recorded by their information systems. A recurrent task in process mining is querying. Querying allows users to get insights into specific executions of their processes and to retrieve relevant data. Existing process querying techniques require end users to be knowledgeable about the query language and the database schema.

Event Log Construction from Customer Service Conversations Using Natural Language Inference

Christoph Kecht, Andreas Egger, Wolfgang Kratsch and Maximilian Röglinger


A fundamental requirement for the successful application of process mining is the availability of event logs of high data quality that can be constructed from structured data stored in organizations’ core information systems. However, a substantial amount of data is processed outside these core systems, particularly in organizations doing consumer business with many customer interactions per day, which generate large amounts of unstructured text data. Although Natural Language Processing (NLP) and machine learning enable the exploitation of text data, these approaches remain challenging due to the large amount of labeled training data they require.

Towards Evidence-Based Analysis of Palliative Treatments for Stomach and Esophageal Cancer Patients: a Process Mining Approach

Pam Pijnenborg, Rob Verhoeven, Murat Firat, Hanneke van Laarhoven and Laura Genga


Stomach and esophageal cancer are among the ten most common cancers worldwide, both with high mortality rates. Approximately one-third of these patients have metastases at initial diagnosis and should receive personalized palliative care to improve their remaining lifetime. However, there is a lack of consensus about personalized palliative care options. This often leads to difficulties in determining the right treatment pathway for individual patients. This study investigates the application of process mining techniques on palliative care pathways for stomach and esophageal cancer to obtain an evidence-based understanding of which palliative treatments are commonly carried out in clinical practice and how they are associated with patients’ survival time.

Precision and Fitness in Object-Centric Process Mining

Jan Niklas Adams and Wil van der Aalst


Traditional process mining considers only one single case notion and discovers and analyzes models based on this. However, a single case notion is often not a realistic assumption in practice. Multiple case notions might interact and influence each other in a process. Object-centric process mining introduces the techniques and concepts to handle multiple case notions. So far, such event logs have been standardized and novel process model discovery techniques were proposed. However, notions for evaluating the quality of a model are missing.

Bringing Rigor to the Qualitative Evaluation of Process Mining Findings: An Analysis and a Proposal

Jelmer Jan Koorn, Iris Beerepoot, Xixi Lu, Vinicius Stein Dani, Henrik Leopold, Inge van de Weerd and Hajo A. Reijers


Before the findings of a process mining project can be turned into actionable insights or recommendations, it is essential to make sure that the findings are actually valid. Therefore, the evaluation of the findings is a crucial part of a successful process mining project. Current process mining methodologies, however, fall short in providing actionable support to perform such an evaluation.

DiCE4EL: Interpreting Process Predictions using a Milestone-Aware Counterfactual Approach

Chihcheng Hsieh, Catarina Moreira and Chun Ouyang


Predictive process analytics often applies machine learning to predict the future states of a running business process. However, the internal mechanisms of many existing predictive algorithms are opaque and a human decision-maker is unable to understand why a certain activity was predicted. Recently, counterfactuals have been proposed in the literature to derive human-understandable explanations from predictive models. Current counterfactual approaches consist of finding the minimum feature change that can make a certain prediction flip its outcome.

Realizing A Digital Twin of An Organization Using Action-oriented Process Mining

Gyunam Park and Wil M. P. van der Aalst


A Digital Twin of an Organization (DTO) is a mirrored representation of an organization, aiming to improve the business process of the organization by providing a transparent view over the process and automating management actions to deal with existing and potential risks. Although digital twins are widely applied to product design and predictive maintenance, no concrete realizations of DTOs for business process improvement have been studied. In this work, we aim to realize DTOs using action-oriented process mining, a collection of techniques to evaluate violations of constraints and produce the required actions.

Prescriptive Process Monitoring for Cost-Aware Cycle Time Reduction

Zahra Dasht Bozorgi, Irene Teinemaa, Marlon Dumas, Marcello La Rosa and Artem Polyvyanyy


Reducing cycle time is a recurrent concern in the field of business process management. Depending on the process, various interventions may be triggered to reduce the cycle time of a case, for example, using a faster shipping service in an order-to-delivery process or calling a customer to obtain missing information rather than waiting passively. However, each of these interventions comes with a cost. This paper tackles the problem of determining if and when to trigger a time-reducing intervention in a way that maximizes a net gain function.

FOX: a neuro-Fuzzy model for process Outcome prediction and eXplanation

Vincenzo Pasquadibisceglie, Giovanna Castellano, Annalisa Appice and Donato Malerba


Predictive process monitoring (PPM) techniques have become a key element in both public and private organizations by enabling crucial operational support of their business processes. Thanks to the availability of large amounts of data, different solutions based on machine and deep learning have been proposed in the literature for the monitoring of process instances. These state-of-the-art approaches treat accuracy as the main objective of predictive modeling, while they often neglect the interpretability of the model.

Mine Me but Don’t Single Me Out: Differentially Private Event Logs for Process Mining

Gamal Elkoumy, Alisa Pankova and Marlon Dumas


The applicability of process mining techniques hinges on the availability of event logs capturing the execution of a business process. In some use cases, particularly those involving customer-facing processes, these event logs may contain private information. Data protection regulations restrict the use of such event logs for analysis purposes. One way of circumventing these restrictions is to anonymize the event log to the extent that no individual can be singled out using the anonymized log.

SaCoFa: Semantics-aware Control-flow Anonymization for Process Mining

Stephan Fahrenkrog-Petersen, Martin Kabierski, Fabian Rösel, Han van der Aa and Matthias Weidlich


Privacy-preserving process mining enables the analysis of business processes using event logs, while giving guarantees on the protection of sensitive information on process stakeholders. To this end, existing approaches add noise to the results of queries that extract properties of an event log, such as the frequency distribution of trace variants, for analysis. Noise insertion neglects the semantics of the process, though, and may generate traces not present in the original log.

Sampling What Matters: Relevance-guided Sampling of Event Logs

Martin Kabierski, Hoang Lam Nguyen, Lars Grunske and Matthias Weidlich


The comparison of a model of a process against event data recorded during its execution, known as conformance checking, is an important means of process analysis. Yet, common conformance checking techniques are computationally expensive, which makes a complete analysis infeasible for large logs. To mitigate this problem, existing techniques leverage data samples. Then, the result quality depends on the relevance of the sample for a specific analysis task. Existing sampling strategies therefore rely on a static assumption about what constitutes relevant event data, which is generally unknown a priori.

Selecting Representative Sample Traces from Large Event Logs

Gaël Bernard and Periklis Andritsos


When event logs are large, the time needed to analyze them using process mining techniques can become prohibitive. In this paper, using sampling, we aim to reduce the size of event logs to p-traces, while minimizing the Earth Movers’ Distance (EMD) from the unsampled original event log. We contribute by formalizing log sampling in a canonical form and show its link with the EMD, a metric increasingly used for process mining. Next, we propose three log-sampling algorithms that we evaluate using a collection of 18 event logs from industry.

Discovering Declarative Process Model Behavior from Event Logs via Model Learning

Simone Agostinelli, Giacomo Bergami, Alessio Fiorenza, Fabrizio Maria Maggi, Andrea Marrella and Fabio Patrizi


Declarative business process (BP) models define the behavior of BPs as a set of temporal constraints, which can be summarized as a deterministic finite state automaton (DFA). Declarative BP discovery aims at inferring such constraints from event logs. To this end, it requires as additional input the set of candidate constraints to be verified with respect to the event log. Intuitively, this restricts the discovery task to a conformance checking activity between a predefined set of constraint templates and an event log, which prevents learning any observed behavior that is not captured by those templates.

Process Discovery using Graph Neural Networks

Dominique Sommers, Vlado Menkovski and Dirk Fahland


Automatically discovering a process model from an event log is the prime problem in process mining. This task is so far approached as an unsupervised learning problem through graph synthesis algorithms. Algorithmic design decisions and heuristics allow for efficiently finding models in a reduced search space. However, design decisions and heuristics are derived from assumptions about how a given behavioral description (an event log) translates into a process model and were not learned from actual models, which introduces biases in the solutions.

Striking a new Balance in Accuracy and Simplicity with the Probabilistic Inductive Miner

Dennis Brons, Roeland Scheepens and Dirk Fahland


Numerous process discovery techniques exist for generating process models that describe recorded executions of business processes. The models are meant to generalize executions into human-understandable modeling patterns, notably parallelism, and enable rigorous analysis of process deviations. However, well-defined models with parallelism returned by existing techniques are often too complex or generalize the recorded behavior too strongly to be trusted in a practical business context. We bridge this gap by introducing the Probabilistic Inductive Miner (PIM) based on the Inductive Miner framework.

An A*-Algorithm for Computing Discounted Anti-Alignments in Process Mining

Mathilde Boltenhagen, Thomas Chatain and Josep Carmona


Process mining techniques aim at analyzing and monitoring processes through event data. Formal models like Petri nets serve as an effective representation of the processes. A central question in the field is to assess the conformance of a process model with respect to the real process executions. The notion of anti-alignment, which represents a model run that is as distant as possible from the process executions, has been demonstrated to be crucial for measuring the precision of models.

Partial MaxSAT computation of Conformance Checking Artefacts

Jesus Ojeda


To reason about observed behaviour of processes and their models, conformance checking techniques are rooted in the computation of artefacts. Related artefacts like alignments, multi-alignments and anti-alignments are defined over a distance function, most commonly the Hamming or Levenshtein distance. In this paper we provide a new Partial MaxSAT encoding of these artefacts based on the Levenshtein distance and compare it with their current state-of-the-art SAT encodings. We show a reduction in the resulting formula size for our proposed encoding, while also obtaining good performance results on the computation of the artefacts.

Probabilistic Trace Alignment

Giacomo Bergami, Fabrizio Maria Maggi, Marco Montali and Rafael Peñaloza


Alignments provide sophisticated diagnostics that pinpoint deviations in a trace with respect to a process model. Alignment-based approaches for conformance checking have so far used crisp process models as a reference. Recent probabilistic conformance checking approaches check the degree of conformance of an event log as a whole with respect to a stochastic process model, without providing alignments. For the first time, we introduce a conformance checking approach based on trace alignments using stochastic Workflow nets.

Efficient Approximate Conformance Checking Using Trie Data Structure

Ahmed Awad, Kristo Raun and Matthias Weidlich


Conformance checking compares a process model and recorded executions of a process, i.e., a log of traces. To this end, state-of-the-art approaches compute an alignment between a trace and an execution sequence of the model. Since the construction of alignments is computationally expensive, approximation schemes have been developed to strike a balance between the efficiency and the accuracy of conformance checking. Specifically, conformance checking may rely only on so-called proxy behavior, a subset of the behavior of the model.