Striking a new Balance in Accuracy and Simplicity with the Probabilistic Inductive Miner

Dennis Brons, Roeland Scheepens and Dirk Fahland


Numerous process discovery techniques exist for generating process models that describe recorded executions of business processes. The models are meant to generalize executions into human-understandable modeling patterns, notably parallelism, and to enable rigorous analysis of process deviations. However, well-defined models with parallelism returned by existing techniques are often too complex, or generalize the recorded behavior too strongly, to be trusted in a practical business context. We bridge this gap by introducing the Probabilistic Inductive Miner (PIM), which is based on the Inductive Miner framework. In each step, PIM compares the most probable operators and structures based on frequency information in the data, which results in block-structured models with significantly higher accuracy. All design choices in PIM are based on business context requirements obtained through a user study with industrial process mining experts. PIM is evaluated quantitatively and in a novel kind of empirical study comparing users' trust in discovered model structures. The evaluations show that PIM strikes a unique trade-off between model accuracy and model complexity that is conclusively preferred by users over all state-of-the-art process discovery methods.