
Browsing by Subject "neural networks"

  • Hämäläinen, Kreetta (2021)
    Personalized medicine tailors therapies for the patient based on predicted risk factors. Some tools used for making predictions on the safety and efficacy of drugs are genetics and metabolomics. This thesis focuses on identifying biomarkers for the activity level of the drug transporter organic anion transporting polypeptide 1B1 (OATP1B1) from data acquired from untargeted metabolite profiling. OATP1B1 transports various drugs, such as statins, from portal blood into the hepatocytes. OATP1B1 is a genetically polymorphic influx transporter, which is expressed in human hepatocytes. Statins are low-density lipoprotein cholesterol-lowering drugs, and decreased or poor OATP1B1 function has been shown to be associated with statin-induced myopathy. Based on genetic variability, individuals can be classified into those with normal, decreased or poor OATP1B1 function. These activity classes were employed to identify metabolomic biomarkers for OATP1B1. To find the most efficient way to predict the activity level and find the biomarkers that associate with the activity level, five different machine learning models were tested with a dataset that consisted of 356 fasting blood samples with 9152 metabolite features. The models included both a Random Forest regressor and a classifier, Gradient Boosted Decision Tree regressor and classifier, and a Deep Neural Network regressor. Challenges specific to this type of data were the collinearity between the features and the large number of features compared to the number of samples, which led to issues in determining the important features of the neural network model. To address this, the data were clustered according to their Spearman’s rank-order correlation ranks. Feature importances were calculated using two methods. For the neural network, the feature importances were calculated with permutation feature importance using mean squared error, whereas the random forest and gradient boosted decision tree models used Gini impurity.
The performance of each model was measured, and all classifiers had a poor ability to predict the decreased and poor function classes. All regressors performed very similarly to each other. The gradient boosted decision tree regressor performed the best by a slight margin, but the random forest regressor and the neural network regressor performed nearly as well. The best features from all three models were cross-referenced with the features found from y-aware PCA analysis. The y-aware PCA analysis indicated that the 14 best features cover 95% of the explained variance, so 14 features were picked from each model and cross-referenced with each other. Cross-referencing the highest scoring features reported by the best models found multiple features that showed up as important in many models. Taken together, machine learning methods provide powerful tools to identify potential biomarkers from untargeted metabolomics data.
  • Mukhtar, Usama (2020)
    Sales forecasting is crucial for running any retail business efficiently. Profits are maximized if popular products are available to fulfill the demand. It is also important to minimize the loss caused by unsold stock. Fashion retailers face certain challenges which make sales forecasting difficult for the products. Some of these challenges are the short life cycle of products and the introduction of new products all around the year. The goal of this thesis is to study forecasting methods for fashion. We use the product attributes for products in a season to build a model that can forecast sales for all the products in the next season. Sales for different attributes are analysed for three years. Sales vary across the values of different variables, which indicates that a model fitted on product attributes may be used for forecasting sales. A series of experiments is conducted with multiple variants of the datasets. We implemented multiple machine learning models and compared them against each other. Empirical results are reported along with the baseline comparisons to answer the research questions. Results from the first experiment indicate that the machine learning models perform almost as well as the baseline model that uses mean values as predictions. The results may improve in the upcoming years when more data is available for training. The second experiment shows that models built for specific product groups are better than the generic models that are used to predict sales for all kinds of products. Since we observed a heavy tail in the data, a third experiment was conducted to use logarithmic sales for predictions, and the results do not improve much compared to the results from the previous methods. The conclusion of the thesis is that machine learning methods can be used for attribute-based sales forecasting in the fashion industry, but more data is needed, and modeling specific groups of products brings better results.
  • Duong, Quoc Quan (2021)
    Discourse dynamics is one of the important fields in digital humanities research. Over time, the perspectives and concerns of society on particular topics or events might change. As the popularity of a certain theme changes, different patterns are formed, increasing or decreasing the prominence of the theme in the news. Tracking these changes is a challenging task. In a large text collection, discourse themes are intertwined and uncategorized, which makes it hard to analyse them manually. The thesis tackles a novel task of automatic extraction of discourse trends from large text corpora. The main motivation for this work lies in the need in digital humanities to track discourse dynamics in diachronic corpora. Machine learning is a potential method to automate this task by learning patterns from the data. However, in many real use cases ground truth is not available, and annotating discourses on a corpus level is incredibly difficult and time-consuming. This study proposes a novel procedure to generate synthetic datasets for this task, a quantitative evaluation method and a set of benchmarking models. Large-scale experiments are run using these synthetic datasets. The thesis demonstrates that a neural network model trained on such datasets can obtain meaningful results when applied to a real dataset, without any adjustments of the model.
  • Bovellán, Jonne (2022)
    A lot of research has been done on time series forecasting. This research is conducted, for example, on stock market data and data derived from social media platforms. Several powerful methods have been developed for time series forecasting, including Bidirectional Long Short-Term Memory, which is based on neural networks. TikTok is a social media platform that is focused on short videos. When a user posts a video to TikTok, they can also write a short textual description which can include hashtags. Often these hashtags describe things, events and trends that happen in the physical world. An accurate forecast of the future popularity of TikTok hashtags has financial potential for individuals and organisations. As part of this thesis, an experimental study was conducted in order to forecast the popularity of TikTok hashtags. An algorithm based on Bidirectional Long Short-Term Memory was created that forecasts the short-term and long-term popularity of a single hashtag based on its past. A data set which consisted of time series data for 9779 different TikTok hashtags was used in the development process. The created forecasting algorithm performs at a good level for forecasting the short-term popularity of a hashtag, but it is not suitable for long-term forecasting due to its poor performance.
  • Barin Pacela, Vitória (2021)
    Independent Component Analysis (ICA) aims to separate the observed signals into their underlying independent components responsible for generating the observations. Most research in ICA has focused on continuous signals, while the methodology for binary and discrete signals is less developed. Yet, binary observations are equally present in various fields and applications, such as causal discovery, signal processing, and bioinformatics. In the last decade, Boolean OR and XOR mixtures have been shown to be identifiable by ICA, but such models suffer from limited expressivity, calling for new methods to solve the problem. In this thesis, "Independent Component Analysis for Binary Data", we estimate the mixing matrix of ICA from binary observations and an additionally observed auxiliary variable by employing a linear model inspired by the Identifiable Variational Autoencoder (iVAE), which exploits the non-stationarity of the data. The model is optimized with a gradient-based algorithm that uses second-order optimization with limited memory, resulting in a training time in the order of seconds for the particular study cases. We investigate which conditions can lead to the reconstruction of the mixing matrix, concluding that the method is able to identify the mixing matrix when the number of observed variables is greater than the number of sources. In such cases, the linear binary iVAE can reconstruct the mixing matrix up to order and scale indeterminacies, which are considered in the evaluation with the Mean Cosine Similarity Score. Furthermore, the model can reconstruct the mixing matrix even under a limited sample size. Therefore, this work demonstrates the potential for applications in real-world data and also offers a possibility to study and formalize identifiability in future work. 
In summary, the most important contributions of this thesis are the empirical study of the conditions that enable the mixing matrix reconstruction using the binary iVAE, and the empirical results on the performance and efficiency of the model. The latter was achieved through a new combination of existing methods, including modifications and simplifications of a linear binary iVAE model and the optimization of such a model under limited computational resources.
  • Enwald, Joel (2020)
    Mammography is used as an early detection system for breast cancer, which is one of the most common types of cancer, regardless of one’s sex. Mammography uses specialised X-ray machines to look into the breast tissue for possible tumours. Due to the machine’s set-up as well as to reduce the radiation patients are exposed to, the number of X-ray measurements collected is very restricted. Reconstructing the tissue from this limited information is referred to as limited angle tomography. This is a complex mathematical problem and ordinarily leads to poor reconstruction results. The aim of this work is to investigate how well a neural network whose structure utilizes pre-existing models and the known geometry of the problem performs at this task. In this preliminary work, we demonstrate the results on simulated two-dimensional phantoms and discuss the extension of the results to three-dimensional patient data.
  • Elmnäinen, Johannes (2020)
    The Finnish Environment Institute (SYKE) has at least two missions which require surveying large land areas: finding invasive alien species and monitoring the state of Finnish lakes. Various methods to accomplish these tasks exist, but they traditionally rely on manual labor by experts or citizen activism, and as such do not scale well. This thesis explores the usage of computer vision to dramatically improve the scaling of these tasks. Specifically, the aim is to fly a drone over selected areas and use a convolutional neural network architecture (U-net) to create segmentations of the images. The method performs well on select biomass estimation task classes due to large enough datasets and easy-to-distinguish core features of the classes. Furthermore, a qualitative study of datasets was performed, yielding an estimate for a lower bound on the number of examples for a useful dataset. ACM Computing Classification System (CCS): CCS → Computing methodologies → Machine learning → Machine learning approaches → Neural networks
  • Rosenberg, Otto (2023)
    Bayesian networks (BN) are models that map the mutual dependencies and independencies between a set of variables. The structure of the model can be represented as a directed acyclic graph (DAG), which is a graph where the nodes represent variables and the directed edges between variables represent a dependency. BNs can be either constructed by using knowledge of the system or derived computationally from observational data. Traditionally, BN structure discovery from observational data has been done through heuristic algorithms, but advances in deep learning have made it possible to train neural networks for this task in a supervised manner. This thesis provides an overview of BN structure discovery and discusses the strengths and weaknesses of the emerging supervised paradigm. One supervised method, the EQ-model, that uses neural networks for structure discovery using equivariant models, is also explored in further detail with empirical tests. Through a process of hyperparameter optimisation and moving to online training, the performance of the EQ-model is increased. The EQ-model is still observed to underperform in comparison to a competing score-based model, NOTEARS, but offers convenient features, such as dramatically faster runtime, that compensate for the reduced performance. Several interesting lines of further study that could be used to further improve the performance of the EQ-model are also identified.
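
The DAG requirement described in the last abstract above can be illustrated with a minimal sketch. It is not taken from the thesis, and the function names (`is_dag`, `acyclicity_score`) are hypothetical. The first function checks acyclicity with Kahn's topological sort. The second evaluates a truncated power series of the acyclicity function known from the NOTEARS method, h(W) = tr(exp(W∘W)) − d, which is zero exactly when the weighted graph has no directed cycle. Truncating the series at d terms is a simplification made here to avoid a matrix-exponential dependency; it preserves the zero-iff-acyclic property but not the exact value.

```python
from math import factorial


def is_dag(adj):
    """Return True if the adjacency matrix has no directed cycle (Kahn's algorithm).

    adj[i][j] != 0 means there is a directed edge i -> j.
    """
    n = len(adj)
    indegree = [sum(1 for i in range(n) if adj[i][j] != 0) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for j in range(n):
            if adj[node][j] != 0:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    # Every node can be removed only if no cycle blocks the ordering.
    return removed == n


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_score(weights):
    """Truncated series for tr(exp(W∘W)) - d: sum over k=1..d of tr((W∘W)^k) / k!.

    The score is 0 for a DAG and positive when a directed cycle exists.
    """
    d = len(weights)
    squared = [[w * w for w in row] for row in weights]  # elementwise square W∘W
    power = squared
    score = 0.0
    for k in range(1, d + 1):
        score += sum(power[i][i] for i in range(d)) / factorial(k)
        if k < d:
            power = matmul(power, squared)
    return score


chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 0 -> 1 -> 2
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # 0 -> 1 -> 2 -> 0
print(is_dag(chain), acyclicity_score(chain))
print(is_dag(cycle), acyclicity_score(cycle))
```

A discrete check such as `is_dag` suits combinatorial search over structures, whereas a smooth score of this kind is what allows the structure search to be posed as a continuous optimisation problem.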