A Co-training Strategy for Multiple View Clustering in Process Mining

APPICE, ANNALISA; MALERBA, Donato
2016-01-01

Abstract

Process mining refers to the discovery, conformance checking and enhancement of process models from the event logs produced by information systems (e.g. workflow management systems). By tightly coupling event logs and process models, process mining makes it possible to detect deviations, predict delays, support decision making and recommend process redesigns. Event logs are data sets containing the executions (called traces) of a business process. Several process mining algorithms have been defined to mine event logs and deliver models (e.g. Petri nets) of how the logged processes are actually executed. However, they often generate spaghetti-like process models, which can be hard to understand. This is caused by the inherent complexity of real-life processes, which tend to be less structured and more flexible than stakeholders typically expect. In particular, spaghetti-like process models are discovered when all possible behaviors are shown in a single model, as a result of considering all the traces in the event log at once. To mitigate this problem, trace clustering can be used as a preprocessing step. It splits an event log into clusters of similar traces, so as to handle variability in the recorded behavior and facilitate process model discovery. In this paper, we investigate a multiple-view-aware approach to trace clustering, based on a co-training strategy. In an assessment using benchmark event logs, we show that the presented algorithm is able to discover a clustering pattern of the log such that related traces are appropriately clustered. We evaluate the significance of the formed clusters using established machine learning and process mining metrics.
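As a rough illustration of the preprocessing step described in the abstract (splitting an event log into clusters of similar traces), the following is a minimal sketch only. It is not the co-training algorithm of the paper: it uses a single view (the set of activities in each trace) and a simple single-pass leader clustering with Jaccard similarity, swapped in purely for illustration. The function names, the threshold value and the toy log are all hypothetical.

```python
from typing import List, Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two activity sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_traces(log: Sequence[Sequence[str]], threshold: float = 0.5) -> List[List[int]]:
    """Single-pass leader clustering (illustrative, not the paper's method).

    Each trace joins the first cluster whose leader (the first trace placed
    in that cluster) is similar enough; otherwise it starts a new cluster.
    Returns clusters as lists of trace indices.
    """
    leaders: List[Set[str]] = []
    clusters: List[List[int]] = []
    for idx, trace in enumerate(log):
        activities = set(trace)  # single "activity" view of the trace
        for leader, members in zip(leaders, clusters):
            if jaccard(activities, leader) >= threshold:
                members.append(idx)
                break
        else:
            # No existing leader is similar enough: open a new cluster.
            leaders.append(activities)
            clusters.append([idx])
    return clusters


# Hypothetical toy event log: each trace is the ordered activities of one case.
toy_log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
    ["register", "check", "check", "approve", "pay"],
    ["complain", "investigate", "refund"],
    ["complain", "investigate", "close"],
]
clusters = cluster_traces(toy_log, threshold=0.5)
print(clusters)
```

A process discovery algorithm could then be run on each sub-log separately, which is the motivation given above for clustering before discovery. A multiple-view approach would additionally consider other trace perspectives rather than activities alone.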
Use this identifier to cite or link to this document: https://hdl.handle.net/11586/159392