Browsing by Subject "deep learning"

  • Trizna, Dmitrijs (2022)
    The detection heuristic in contemporary machine learning classifiers for Windows malware is typically based on the static properties of the sample. In contrast, the simultaneous use of static and behavioral telemetry remains little explored. We propose a hybrid model that employs dynamic malware analysis techniques, contextual information in the form of the executable's filesystem path on the system, and static representations used in modern state-of-the-art detectors. It does not require an operating system virtualization platform. Instead, it relies on kernel emulation for dynamic analysis. Our model improves the detection heuristic and identifies malicious samples even if none of the separate models expresses high confidence in categorizing the file as malevolent. For instance, at a false positive rate of $0.05\%$, the individual static, dynamic, and contextual model detection rates are $18.04\%$, $37.20\%$, and $15.66\%$. However, we show that composite processing of all three achieves a detection rate of $96.54\%$, above the cumulative performance of individual components. Moreover, simultaneous use of distinct malware analysis techniques addresses the weaknesses of the individual units, minimizing false positives and increasing adversarial robustness. Our experiments show a decrease in contemporary adversarial attack evasion rates from $26.06\%$ to $0.35\%$ when behavioral and contextual representations of the sample are employed in the detection heuristic.
  • Mylläri, Juha (2022)
    Anomaly detection in images is the machine learning task of classifying inputs as normal or anomalous. Anomaly localization is the related task of segmenting input images into normal and anomalous regions. The output of an anomaly localization model is a 2D array, called an anomaly map, of pixel-level anomaly scores. For example, an anomaly localization model trained on images of non-defective industrial products should output high anomaly scores in image regions corresponding to visible defects. In unsupervised anomaly localization the model is trained solely on normal data, i.e. without labelled training observations that contain anomalies. This is often necessary as anomalous observations may be hard to obtain in sufficient quantities and labelling them is time-consuming and costly. Student-teacher feature pyramid matching (STFPM) is a recent and powerful method for unsupervised anomaly detection and localization that uses a pair of convolutional neural networks of identical architecture. In this thesis we propose two methods of augmenting STFPM to produce better segmentations. Our first method, discrepancy scaling, significantly improves the segmentation performance of STFPM by leveraging pre-calculated statistics containing information about the model’s behaviour on normal data. Our second method, student-teacher model assisted segmentation, uses a frozen STFPM model as a feature detector for a segmentation model which is then trained on data with artificially generated anomalies. Using this second method we are able to produce sharper anomaly maps for which it is easier to set a threshold value that produces good segmentations. Finally, we propose the concept of expected goodness of segmentation, a way of assessing the performance of unsupervised anomaly localization models that, in contrast to current metrics, explicitly takes into account the fact that a segmentation threshold needs to be set. 
Our primary method, discrepancy scaling, improves segmentation AUROC on the MVTec AD dataset over the base model by 13%, measured in the shrinkage of the residual (1.0 − AUROC). On the image-level anomaly detection task, a variant of the discrepancy scaling method improves performance by 12%.
  • Koppatz, Maximilian (2022)
    Automatic headline generation has the potential to significantly assist editors charged with headlining articles. Approaches to automation in the headlining process can range from tools as creative aids to complete end-to-end automation. The latter is difficult to achieve, as journalistic requirements imposed on headlines must be met with little room for error, with the requirements depending on the news brand in question. This thesis investigates automatic headline generation in the context of the Finnish newsroom. The primary question I seek to answer is how well the current state of text generation using deep neural language models can be applied to the headlining process in Finnish news media. To answer this, I have implemented and pre-trained a Finnish generative language model based on the Transformer architecture. I have fine-tuned this language model for headline generation as autoregression of headlines conditioned on the article text. I have designed and implemented a variation of the Diverse Beam Search algorithm, with additional parameters, to generate a diverse set of headlines for a given text. The evaluation of the generative capabilities of this system was done with real-world usage in mind. I asked domain experts in headlining to evaluate a generated set of text-headline pairs. The task was to accept or reject the individual headlines on key criteria. The responses to this survey were then quantitatively and qualitatively analyzed. Based on the analysis and feedback, this model can already be useful as a creative aid in the newsroom despite being far from ready for automation. I have identified concrete improvement directions based on the most common types of errors, and this provides interesting future work.
  • Zhao, Zhao (2023)
    This thesis aims to offer a practical solution for making cost-effective decisions regarding weather routing deployment to optimize computational costs. The study focuses on developing three collaborative model components that collectively address the challenge of rerouting decision-making. Model 1 involves training a neural network-based Ship Performance Model, which forms the foundation for the weather routing model. Model 2 is centered around constructing a time-dependent path-finding model that integrates real-time weather forecasts. This model optimizes routing within a designated experimental area, generating simulation training samples. Model 3 utilizes the outcomes of Model 2 to train a practical machine learning decision-making model. This model seeks to address the question: should the weather routing system be activated and the route be adjusted based on updated weather forecasts? The integration of these models supports informed maritime decision-making. While these methods represent a preliminary step towards optimizing weather routing deployment frequencies, they hold the potential for enhancing operational efficiency and responsible resource usage in the maritime sector.
  • Alcantara, Jose Carlos (2020)
    A recent machine learning technique called federated learning (Konecny, McMahan, et al., 2016) offers a new paradigm for distributed learning. It consists of performing machine learning on multiple edge devices and simultaneously optimizing a global model for all of them, without transmitting user data. The goal of this thesis was to prove the benefits of applying federated learning to forecasting telecom key performance indicator (KPI) values from radio network cells. After performing experiments with different aggregations of data sources and comparing against a centralized learning model, the results revealed that a federated model can shorten the training time for modelling new radio cells. Moreover, the amount of data transferred to a central server is reduced drastically while keeping performance equivalent to a traditional centralized model. These experiments were performed with a multi-layer perceptron as the model architecture after comparing its performance against LSTM. Both input and output data were sequences of KPI values.
  • Maljanen, Katri (2021)
    Cancer is a leading cause of death worldwide. Contrary to what its name suggests, cancer is not a single disease. It is a group of diseases that arise from the expansion of a somatic cell clone. This expansion is thought to be a result of mutations that confer a selective advantage to the cell clone. Mutations that are advantageous to cells and result in their proliferation and escape from normal cell constraints are called driver mutations. The genes that contain driver mutations are known as driver genes. Studying these mutations and genes is important for understanding how cancer forms and evolves. Various methods have been developed that can discover these mutations and genes. This thesis focuses on a method called Deep Mutation Modelling, a deep learning based approach to predicting the probability of mutations. Deep Mutation Modelling’s output probabilities offer the possibility of creating sample and cancer type specific probability scores for mutations that reflect the pathogenicity of the mutations. Most methods in the past have produced scores that are the same for all cancer types. Deep Mutation Modelling offers the opportunity to make a more personalised score. The main objectives of this thesis were to examine the Deep Mutation Modelling output, as it was unknown what kind of features it has, to see how the output compares against other scoring methods, and to see how the probabilities behave in mutation hotspots. A final objective was to determine whether the probabilities could be used in a common driver gene discovery method. Overall, the goal was to see if Deep Mutation Modelling works and if it is competitive with other known methods. The findings indicate that Deep Mutation Modelling works in predicting driver mutations, but that it does not have sufficient power to do this reliably and requires further improvements.
  • Pajula, Ilari (2024)
    Combining data from visual and inertial sensors effectively reduces inherent errors in each modality, enhancing the robustness of sensor-fusion for accurate 6-DoF motion estimation over extended periods. While traditional SfM and SLAM frameworks are well established in literature and real-world applications, purely end-to-end learnable SfM and SLAM networks are still scarce. The adaptability of fully trained models in system configuration and navigation setup holds great potential for future developments in this field. This thesis introduces and assesses two novel end-to-end trainable sensor-fusion models using a supervised learning approach, tested on established navigation benchmarks and custom datasets. The first model utilizes optical flow, revealing its limitations in handling complex camera movements present in pedestrian motion. The second model addresses these shortcomings by using feature point-matching and a completely original design.
  • Tobaben, Marlon (2022)
    Using machine learning to improve health care has gained popularity. However, most research in machine learning for health has ignored privacy attacks against the models. Differential privacy (DP) is the state-of-the-art concept for protecting individuals' data from privacy attacks. Using optimization algorithms such as the DP stochastic gradient descent (DP-SGD), one can train deep learning models under DP guarantees. This thesis analyzes the impact of changes to the hyperparameters and the neural architecture on the utility/privacy tradeoff, the main tradeoff in DP, for models trained on the MIMIC-III dataset. The analyzed hyperparameters are the noise multiplier, clipping bound, and batch size. The experiments examine neural architecture changes regarding the depth and width of the model, activation functions, and group normalization. The thesis reports the impact of the individual changes independently of other factors using Bayesian optimization and thus overcomes the limitations of earlier work. For the analyzed models, the utility is more sensitive to changes to the clipping bound than to the other two hyperparameters. Furthermore, the privacy/utility tradeoff does not improve when allowing for more training runtime. The changes to the width and depth of the model have a higher impact than other modifications of the neural architecture. Finally, the thesis discusses the impact of the findings and limitations of the experiment design and recommends directions for future work.
  • Kurki, Lauri (2021)
    Atomic force microscopy (AFM) is a widely utilized characterization method capable of capturing atomic level detail in individual organic molecules. However, an AFM image contains relatively little information about the deeper atoms in a molecule, and thus interpretation of AFM images of non-planar molecules poses significant challenges for human experts. An end-to-end solution starting from an AFM imaging system and ending in an automated image interpreter would be a valuable asset for all research utilizing AFM. Machine learning has become a ubiquitous tool in all areas of science. Artificial neural networks (ANNs), a specific machine learning tool, have also arisen as a popular method in many fields, including medical imaging, self-driving cars and facial recognition systems. In recent years, progress towards interpreting AFM images from more complicated samples has been made utilizing ANNs. In this thesis, we aim to predict sample structures from AFM images by modeling the molecule as a graph and using a generative model to build the molecular structure atom-by-atom and bond-by-bond. The generative model uses two types of ANNs, a convolutional attention mechanism to process the AFM images and a graph neural network to process the generated molecule. The model is trained and tested using simulated AFM images. The results of the thesis show that the model has the capability to learn even slight details from complicated AFM images, especially when the model only adds a single atom to the molecule. However, there are challenges to overcome in the generative model for it to become a part of a fully capable end-to-end AFM process.
  • Vesalainen, Ari (2022)
    Digitization has changed history research. The materials are available, and online archives make it easier to find the correct information and speed up the search for information. The remaining challenge is how to use modern digital methods to analyze the text of historical documents in more detail. This is an active research topic in digital humanities and computer science. In document layout analysis, computer vision object detection methods are applied to historical documents to identify the objects (i.e., page elements) present on the document pages. Recent developments in deep learning based computer vision provide excellent tools for this purpose. However, most reviewed systems focus on coarse-grained methods, where only the high-level page elements are detected (e.g., text, figures, tables). Fine-grained detection methods are required to analyze texts on a more detailed level; for example, footnotes and marginalia must be distinguished from the body text to enable proper analysis. The thesis studies how image segmentation techniques can be used for fine-grained OCR document layout analysis. It asks how fine-grained page segmentation and region classification systems can be implemented in practice, and what the accuracy and main challenges of such a system are. The thesis includes implementing a layout analysis model that uses the instance segmentation method (Mask R-CNN). This implementation is compared against an existing layout analysis system that uses the semantic segmentation method (U-net based P2PaLA implementation).
  • Barin Pacela, Vitória (2021)
    Independent Component Analysis (ICA) aims to separate the observed signals into their underlying independent components responsible for generating the observations. Most research in ICA has focused on continuous signals, while the methodology for binary and discrete signals is less developed. Yet, binary observations are equally present in various fields and applications, such as causal discovery, signal processing, and bioinformatics. In the last decade, Boolean OR and XOR mixtures have been shown to be identifiable by ICA, but such models suffer from limited expressivity, calling for new methods to solve the problem. In this thesis, "Independent Component Analysis for Binary Data", we estimate the mixing matrix of ICA from binary observations and an additionally observed auxiliary variable by employing a linear model inspired by the Identifiable Variational Autoencoder (iVAE), which exploits the non-stationarity of the data. The model is optimized with a gradient-based algorithm that uses second-order optimization with limited memory, resulting in a training time in the order of seconds for the particular study cases. We investigate which conditions can lead to the reconstruction of the mixing matrix, concluding that the method is able to identify the mixing matrix when the number of observed variables is greater than the number of sources. In such cases, the linear binary iVAE can reconstruct the mixing matrix up to order and scale indeterminacies, which are considered in the evaluation with the Mean Cosine Similarity Score. Furthermore, the model can reconstruct the mixing matrix even under a limited sample size. Therefore, this work demonstrates the potential for applications in real-world data and also offers a possibility to study and formalize identifiability in future work. 
In summary, the most important contributions of this thesis are the empirical study of the conditions that enable the mixing matrix reconstruction using the binary iVAE, and the empirical results on the performance and efficiency of the model. The latter was achieved through a new combination of existing methods, including modifications and simplifications of a linear binary iVAE model and the optimization of such a model under limited computational resources.
  • Niemi, Roope Oskari (2022)
    DeepRx is a deep learning receiver which replaces much of the functionality of a traditional 5G receiver. It is a deep model which uses residual connections and a fully convolutional architecture to process an incoming signal, and it outputs log-likelihood ratios for each bit. However, the deep model can be computationally too heavy to use in a real environment. Nokia Bell Labs has recently developed an iterative version of DeepRx, where a model with fewer layers is used iteratively. This thesis focuses on developing a neural network which determines how many iterations the iterative DeepRx needs to use. We trained a separate neural network, the stopping condition neural network, which is used together with the iterative model. It predicts the number of iterations the model requires to process the input correctly, with the aim that each inference uses as few iterations as possible. The model also stops the inference early if it predicts that the required number of iterations is greater than the maximum number. Our results show that an iterative model with a stopping condition neural network has significantly fewer parameters than the deep model. The results also show that while the stopping condition neural network could predict with high accuracy which samples could be decoded, using it also slightly increased the uncoded bit error rate of the iterative model. Therefore, using a stopping condition neural network together with an iterative model seems to be a flexible, lightweight alternative to the DeepRx model.
  • Gierlach, Mateusz Tadeusz (2020)
    Visual fashion understanding (VFU) is a discipline which aims to solve tasks related to clothing recognition, such as garment categorization, garment attribute prediction or clothes retrieval, with the use of computer vision algorithms trained on fashion-related data. Having surveyed VFU-related scientific literature, I conclude that, because the same issue of visually understanding garments lies at the heart of all VFU tasks, those tasks are in fact related. I present a hypothesis that building larger multi-task learning models dedicated to predicting multiple VFU tasks at once might lead to better generalization properties of VFU models. I assess the validity of my hypothesis by implementing two deep learning solutions dedicated primarily to category and attribute prediction. The first solution uses the multi-task learning concept of sharing features from an additional branch dedicated to the localization task of predicting landmark positions. The second solution does not share knowledge from the localization branch. Comparison of those two implementations confirmed my hypothesis, as sharing knowledge between tasks increased category prediction accuracy by 53% and attribute prediction recall by 149%. I conclude that multi-task learning improves generalization properties of deep learning-based visual fashion understanding models across tasks.
  • Kutvonen, Konsta (2020)
    With modern computer vision algorithms, it is possible to solve many different kinds of problems, such as object detection, image classification, and image segmentation. In some cases, like in the case of a camera-based self-driving car, the task can't yet be adequately solved as a direct mapping from image to action with a single model. In such situations, we need more complex systems that can solve multiple computer vision tasks to understand the environment and act based on it for acceptable results. Training each task on its own can be expensive in terms of the storage required for all weights and especially in terms of inference time, as the output of several large models is needed. Fortunately, many state-of-the-art solutions to these problems use Convolutional Neural Networks and often feature some ImageNet backbone in their architecture. With multi-task learning, we can combine some of the tasks into a single model, sharing the convolutional weights in the network. Sharing the weights allows for training smaller models that produce outputs faster and require fewer computational resources, which is essential, especially when the models are run on embedded devices with constrained computation capability and no ability to rely on the cloud. In this thesis, we will present some state-of-the-art models to solve image classification and object detection problems. We will define multi-task learning, describe how multi-task models can be trained, and take a look at various multi-task models and how they exhibit the benefits of multi-task learning. Finally, to evaluate how training multi-task models changes the basic training paradigm and to find what issues arise, we will train multiple multi-task models. The models will mainly focus on image classification and object detection using various data sets. They will combine multiple tasks into a single model, and we will observe the impact of training the tasks in a multi-task setting.
  • Enwald, Joel (2020)
    Mammography is used as an early detection system for breast cancer, which is one of the most common types of cancer, regardless of one’s sex. Mammography uses specialised X-ray machines to look into the breast tissue for possible tumours. Due to the machine’s set-up, as well as the need to reduce the radiation patients are exposed to, the number of X-ray measurements collected is very restricted. Reconstructing the tissue from this limited information is referred to as limited angle tomography. This is a complex mathematical problem and ordinarily leads to poor reconstruction results. The aim of this work is to investigate how well a neural network whose structure utilizes pre-existing models and the known geometry of the problem performs at this task. In this preliminary work, we demonstrate the results on simulated two-dimensional phantoms and discuss the extension of the results to three-dimensional patient data.
  • Rosenberg, Otto (2023)
    Bayesian networks (BN) are models that map the mutual dependencies and independencies between a set of variables. The structure of the model can be represented as a directed acyclic graph (DAG), which is a graph where the nodes represent variables and the directed edges between variables represent a dependency. BNs can be either constructed by using knowledge of the system or derived computationally from observational data. Traditionally, BN structure discovery from observational data has been done through heuristic algorithms, but advances in deep learning have made it possible to train neural networks for this task in a supervised manner. This thesis provides an overview of BN structure discovery and discusses the strengths and weaknesses of the emerging supervised paradigm. One supervised method, the EQ-model, that uses neural networks for structure discovery using equivariant models, is also explored in further detail with empirical tests. Through a process of hyperparameter optimisation and moving to online training, the performance of the EQ-model is increased. The EQ-model is still observed to underperform in comparison to a competing score-based model, NOTEARS, but offers convenient features, such as dramatically faster runtime, that compensate for the reduced performance. Several interesting lines of further study that could be used to further improve the performance of the EQ-model are also identified.
  • Zhao, Linzh (2024)
    As privacy gains consensus in the field of machine learning, numerous algorithms, such as differentially private stochastic gradient descent (DPSGD), have emerged to ensure privacy guarantees. Concurrently, fairness is garnering increasing attention, prompting research aimed at achieving fairness within the constraints of differential privacy. This thesis delves into algorithms designed to enhance fairness in the realm of differentially private deep learning and explores their mechanisms. It examines the role of normalization, a technique applied to these algorithms in practice, to elucidate its impact on fairness. Additionally, this thesis formalizes a hyperparameter tuning protocol to accurately assess the performance of these algorithms. Experiments across various datasets and neural network architectures were conducted to test our hypotheses under this tuning protocol. The decoupling of hyperparameters, allowing each to independently control specific properties of the algorithm, has proven to enhance performance. However, certain mechanisms, such as discarding samples with large norms and allowing unbounded hyperparameter adaptation, may significantly compromise fairness. Our experiments also confirm the critical role of hyperparameter values in influencing fairness, emphasizing the necessity of precise tuning to ensure equitable outcomes. Additionally, we observed differential convergence rates across algorithms, which affect the number of trials needed to identify optimal hyperparameter settings. This thesis aims to offer detailed perspectives on understanding fairness in differentially private deep learning and provides insights into designing algorithms that can more effectively enhance fairness.
  • Kivimäki, Juhani (2022)
    In this thesis, we give an overview of current methodology in the field of uncertainty estimation in machine learning, with a focus on confidence scores and their calibration. We also present a case study, where we propose a novel method to improve uncertainty estimates of an in-production machine learning model operating in an industrial setting with real-life data. This model is used by the Finnish company Basware to extract information from invoices in the form of machine-readable PDFs. The solution we propose is shown to produce confidence estimates which outperform the legacy estimates on several relevant metrics, increasing coverage of automated invoices from 65.6% to 73.2% with no increase in error rate.
  • Häkkinen, Iira (2024)
    Foundation models have the potential to reduce the level of supervision required for medical image segmentation tasks. Currently, the medical image segmentation field still largely relies on supervised, task-specific models. The aim of this thesis is to investigate if a foundation model, the Segment Anything Model (SAM), can be used to reduce the level of supervision needed for medical image segmentation. The main goal of this thesis is to see if the annotation workload required to generate labeled medical segmentation datasets can be significantly reduced with the help of the Segment Anything Model. The second goal of this thesis is to validate the zero-shot performance of the Segment Anything Model on a medical segmentation dataset. A UNet model is used as a baseline. The results of this thesis support SAM's usefulness as a tool for medical image annotation. During the experiments, it was found that especially for homogeneous, clearly outlined tasks, like organs, using "pseudo labels" generated by SAM for training a UNet model resulted in accuracy comparable to training a UNet model on human-annotated labels. Furthermore, the results show that zero-shot SAM has somewhat comparable performance to UNet, and even beats UNet in two of the experimented tasks. For one complexly structured task, SAM and UNet with pseudo labels, trained using SAM's masks, fail to produce accurate results. It is notable that some of the tasks have small training dataset sizes, which limits the test accuracy of UNet. The results are in accordance with recent literature which shows that zero-shot SAM can have comparable performance to state-of-the-art models with large and distinct objects, but when it comes to small, complex structures, SAM does not match the accuracy of state-of-the-art medical segmentation models.
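
Note on a metric used in the listing above: the Mylläri (2022) entry reports improvements "measured in the shrinkage of the residual (1.0 − AUROC)". One plausible reading, which is an assumption here and not a formula quoted from the thesis, is the relative reduction of that residual between a baseline and an improved model. The sketch below illustrates this reading; the function name `residual_shrinkage` and the AUROC values are hypothetical and are not taken from the thesis.

```python
def residual_shrinkage(baseline_auroc: float, improved_auroc: float) -> float:
    """Relative reduction of the residual (1 - AUROC).

    Assumed reading of "shrinkage of the residual": how large a fraction
    of the baseline's remaining error (1 - AUROC) the improved model removes.
    """
    baseline_residual = 1.0 - baseline_auroc
    improved_residual = 1.0 - improved_auroc
    if baseline_residual <= 0.0:
        raise ValueError("baseline AUROC must be below 1.0")
    return (baseline_residual - improved_residual) / baseline_residual


# Hypothetical AUROC values chosen only for illustration.
shrinkage = residual_shrinkage(baseline_auroc=0.90, improved_auroc=0.92)
print(f"residual shrinkage: {shrinkage:.0%}")
```

Under this reading, a small absolute AUROC gain near the top of the scale can correspond to a sizeable shrinkage figure, because the gain is measured against the remaining error and not against the AUROC itself.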