Expert-driven trace clustering
=============================================

Improving the quality of trace clustering with expert knowledge

About
-----

Trace clustering techniques partition traces (process instances) into groups of similar traces. Typically, this partitioning is based on certain patterns or similarity between the traces, or is done by discovering a process model for each cluster of traces. In general, however, the clustering solutions obtained by these approaches are likely to be hard to understand or difficult to validate against an expert's domain knowledge. Therefore, we propose a novel semi-supervised trace clustering technique based on the knowledge of an expert.

References
----------

This algorithm is described in the following publication:

* De Koninck, P., Nelissen, K., Baesens, B., Snoeck, M., & De Weerdt, J. (2021). Expert-driven trace clustering with instance-level constraints. Knowledge and Information Systems, 63(5), 1197-1220.

Implementation
--------------

The algorithm is implemented as the "ActSemiSup From Full Clustering" plugin for ProM 6. It is included in the package ExpertTraceClustering, which can be found in the nightly builds. For more information and source code, see [the ProM repository](https://svn.win.tue.nl/repos/prom/Packages/ExpertTraceClustering/).

Contact
-------

Contact the authors at:

* [Pieter De Koninck](mailto:pieter.dekoninck@kuleuven.be) (corresponding author)
  Department of Decision Sciences and Information Management, KU Leuven
  Naamsestraat 69, B-3000 Leuven, Belgium
* [Klaas Nelissen](mailto:klaas.nelissen@kuleuven.be)
  Department of Decision Sciences and Information Management, KU Leuven
  Naamsestraat 69, B-3000 Leuven, Belgium
* [Bart Baesens](mailto:bart.baesens@kuleuven.be)
  Department of Decision Sciences and Information Management, KU Leuven
  Naamsestraat 69, B-3000 Leuven, Belgium
* [Seppe vanden Broucke](mailto:seppe.vandenbroucke@kuleuven.be)
  Department of Decision Sciences and Information Management, KU Leuven
  Naamsestraat 69, B-3000 Leuven, Belgium
* [Monique Snoeck](mailto:monique.snoeck@kuleuven.be)
  Department of Decision Sciences and Information Management, KU Leuven
  Naamsestraat 69, B-3000 Leuven, Belgium
* [Jochen De Weerdt](mailto:jochen.deweerdt@kuleuven.be)
  Department of Decision Sciences and Information Management, KU Leuven
  Naamsestraat 69, B-3000 Leuven, Belgium

Acknowledgements
----------------

This research has been financed in part by the EC H2020 MSCA RISE [NeEDS](https://riseneeds.eu/) Project (Grant agreement ID: 822214).