
Browsing by Subject "grouping"


  • Kramar, Vladimir (2022)
    This work presents a novel concept for categorising failures within test logs using string similarity algorithms. The concept was implemented as a tool that went through three major iterations before reaching its final version: 1) utilising two state-of-the-art log parsing algorithms, 2) manual log parsing of the Pytest testing framework, and 3) parsing of .xml files produced by the Pytest testing framework. Each of the three approaches automatically converted the unstructured test logs into a structured format. The structured data was then compared using five string similarity algorithms (Sequence Matcher, Jaccard index, Jaro-Winkler distance, cosine similarity and Levenshtein ratio) to form clusters. Each approach was implemented and validated on three different data sets. The concept was validated by implementing an open-source Test Failure Analysis (TFA) tool. The validation phase revealed the best implementation approach (approach 3) and the best string similarity algorithm for this task (cosine similarity). Lastly, the tool was deployed into an open-source project’s CI pipeline, and the results of this integration, application and usage are reported. The resulting tool significantly reduces software engineers’ manual, error-prone work by using cosine similarity as the similarity score to form clusters of failures.
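
To make the clustering idea in the abstract concrete, the following is a minimal sketch of grouping failure messages by cosine similarity. It is not the TFA tool's implementation. The tokenisation, the 0.8 threshold, the greedy "first matching cluster" strategy, the sample messages and the names `tokenize`, `cosine_similarity` and `cluster_failures` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(message):
    # Assumption: lowercase alphanumeric tokens counted as a bag of words.
    return Counter(re.findall(r"[a-z0-9_]+", message.lower()))


def cosine_similarity(a, b):
    # Cosine of the angle between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_failures(messages, threshold=0.8):
    # Hypothetical greedy strategy: join the first cluster whose
    # representative (its first member) is similar enough, else start a new one.
    clusters = []  # list of (representative_vector, member_messages)
    for msg in messages:
        vec = tokenize(msg)
        for rep, members in clusters:
            if cosine_similarity(vec, rep) >= threshold:
                members.append(msg)
                break
        else:
            clusters.append((vec, [msg]))
    return [members for _, members in clusters]


# Made-up failure messages, for illustration only.
failures = [
    "AssertionError: expected 200 but got 500",
    "AssertionError: expected 200 but got 404",
    "TimeoutError: connection to db timed out",
]
groups = cluster_failures(failures)
```

With these sample messages, the two assertion failures share five of their six tokens and fall into one group, while the timeout shares none and forms its own group. A different threshold or tokenisation would change the grouping, so both would need tuning on real logs.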