Discovering topics in Slack message streams

Title: Discovering topics in Slack message streams
Author(s): Aalto, Iiro
Contributor: University of Helsinki, Faculty of Science, Department of Computer Science
Discipline: Computer science
Language: English
Acceptance year: 2020
Slack is an instant messaging platform intended for the internal communications of companies and other organizations. For organizations that use Slack extensively, it may provide an interesting source of insight, but in its raw form the data is difficult to analyze. Topic modeling, primarily latent Dirichlet allocation (LDA), is commonly used to summarize textual data in a meaningful way. Instant messages tend to be very short, which causes problems for conventional topic modeling methods such as LDA. This data sparsity problem can be tackled with data expansion and data combination techniques. For instant messages, data combination is particularly attractive because the messages are not independent of each other, but form implicit, and sometimes explicit, threads as the participants reply to each other. Most of the threads in the Slack data are not explicit and must be "untangled" from the message stream if they are to be used as the basis for a data combination scheme. In this thesis we study the possibility of detecting implicit threads in a Slack message stream and leveraging the threads as a data combination scheme in topic modeling. The threads are detected using a hierarchical clustering algorithm which uses word mover’s distance, latent semantic analysis, and metadata to compute the distances between messages. The messages in each cluster are then concatenated and used as the input for LDA. It is shown that, on a dataset gathered from the Gofore Oyj Slack workspace, the cluster-based model improves on the message-based model, but falls short of being practical.
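The thread-detection and concatenation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the text distance here is a plain Jaccard distance on word sets, swapped in for word mover's distance and latent semantic analysis so that the example needs only the standard library, and the only metadata used is the time gap between messages. The `Message` class, the function names, the weights, the time scale, the threshold, and the sample messages are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Message:
    ts: float   # seconds since an arbitrary reference point (hypothetical field)
    text: str


def text_distance(a: str, b: str) -> float:
    """Jaccard distance on word sets; a crude stand-in for WMD or LSA distances."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 0.0
    return 1.0 - len(wa & wb) / len(wa | wb)


def message_distance(m1, m2, time_scale=600.0, w_text=0.5, w_time=0.5):
    """Weighted mix of text distance and a time-gap penalty capped at 1."""
    time_part = min(abs(m1.ts - m2.ts) / time_scale, 1.0)
    return w_text * text_distance(m1.text, m2.text) + w_time * time_part


def cluster_messages(messages, threshold=0.45):
    """Single-linkage agglomerative clustering, stopped at a distance threshold."""
    clusters = [[i] for i in range(len(messages))]
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters (distance between nearest members).
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            d = min(message_distance(messages[a], messages[b])
                    for a in ci for b in cj)
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def pseudo_documents(messages, clusters):
    """Concatenate the messages of each cluster into one document."""
    return [" ".join(messages[i].text for i in sorted(c)) for c in clusters]


# Two interleaved conversations in one channel (made-up sample data).
messages = [
    Message(0, "deploy failed on staging again"),
    Message(60, "lunch at noon anyone"),
    Message(90, "staging deploy failed because of the config"),
    Message(120, "yes lunch at noon works"),
]
clusters = cluster_messages(messages)
docs = pseudo_documents(messages, clusters)
print(clusters)  # [[0, 2], [1, 3]]
```

The resulting pseudo-documents in `docs` are longer than the individual messages, which is the point of data combination: they would be passed to an LDA implementation in place of the single messages.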
Keyword(s): topic modelling (aihemallinnus); text clustering; Slack

Files in this item

File: iiro-aalto-thesis-final.pdf
Size: 1.412 Mb
Format: PDF
