
Browsing by study line "no specialization"


  • Kinnunen, Samuli (2024)
    Chemical reaction optimization is an iterative process that aims to identify the reaction conditions that maximize a reaction output, typically yield. Optimization techniques have progressed from intuitive approaches to simple heuristics and, more recently, to statistical methods such as the Design of Experiments approach. Bayesian optimization, which iteratively updates beliefs about a response surface and suggests parameters that both exploit conditions near the known optima and explore uncharted regions, has shown promising results by reducing the number of experiments needed to find the optimum in various optimization tasks. In chemical reaction optimization, the method minimizes the number of experiments required to find the optimal reaction conditions. Automated tools such as pipetting robots hold potential to accelerate optimization by executing multiple reactions concurrently. Integrating Bayesian optimization with automation not only reduces the workload but also improves throughput and optimization efficiency. However, the adoption of these advanced techniques faces a barrier, as chemists often lack proficiency in machine learning and programming. To bridge this gap, Automated Chemical Reaction Optimization Software (ACROS) is introduced. This tool orchestrates an optimization loop: Bayesian optimization suggests reaction candidates, the parameters are translated into commands for a pipetting robot, the robot executes the operations, a chemist interprets the results, and the data is fed back to the software to suggest the next reaction candidates. The optimization tool was evaluated empirically on a numerical test function, on a Direct Arylation reaction dataset, and in real-time optimization of Sonogashira and Suzuki coupling reactions. The findings demonstrate that Bayesian optimization identifies optimal conditions efficiently, outperforming the Design of Experiments approach, particularly when optimizing discrete parameters in batch settings. Three acquisition functions (Expected Improvement, Log Expected Improvement, and Upper Confidence Bound) were compared. Expected improvement-based methods proved more robust, especially in batch settings with process constraints.
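    As a rough illustration of the suggest-run-observe loop described above, the sketch below runs Bayesian optimization over a discrete grid of reaction conditions with a Gaussian process surrogate and an Expected Improvement acquisition. The parameter grid, the run_reaction placeholder, and the budget are hypothetical assumptions for illustration; this is not the ACROS implementation.

```python
# Illustrative suggest -> run -> observe Bayesian optimization loop
# (not the ACROS code; reaction parameters and ranges are hypothetical).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Discrete candidate grid: temperature (C) x catalyst loading (mol%)
temps = np.arange(40, 101, 10)
loadings = np.array([1.0, 2.5, 5.0, 10.0])
candidates = np.array([(t, l) for t in temps for l in loadings], dtype=float)

def run_reaction(x):
    """Placeholder for the pipetting robot run plus the chemist's yield read-out."""
    t, l = x
    return float(80 - 0.02 * (t - 75) ** 2 + 3 * np.log(l) + rng.normal(0, 2))

X, y = [], []
# Initialise with a couple of random experiments
for x0 in candidates[rng.choice(len(candidates), 2, replace=False)]:
    X.append(x0)
    y.append(run_reaction(x0))

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(10):                       # optimization budget
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(candidates, return_std=True)
    best = max(y)
    imp = mu - best
    z = imp / (sigma + 1e-9)
    ei = imp * norm.cdf(z) + sigma * norm.pdf(z)   # Expected Improvement
    x_next = candidates[np.argmax(ei)]             # suggest
    X.append(x_next)
    y.append(run_reaction(x_next))                 # execute and feed back

print("best conditions:", X[int(np.argmax(y))], "yield:", max(y))
```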
  • Ilse, Tse (2019)
    Background: Electroencephalography (EEG) depicts electrical activity in the brain, and can be used in clinical practice to monitor brain function. In neonatal care, physicians can use continuous bedside EEG monitoring to determine the cerebral recovery of newborns who have suffered birth asphyxia, which creates a need for frequent, accurate interpretation of the signals over a period of monitoring. An automated grading system can aid physicians in the Neonatal Intensive Care Unit by automatically distinguishing between different grades of abnormality in the neonatal EEG background activity patterns. Methods: This thesis describes using a support vector machine as the base classifier to classify seven grades of EEG background pattern abnormality in data provided by the BAby Brain Activity (BABA) Center in Helsinki. We are particularly interested in reconciling the manual grading of EEG signals by independent graders, and we analyze the inter-rater variability of EEG graders by building a classifier using selected epochs graded in consensus and comparing it to a classifier using full-duration recordings. Results: The inter-rater agreement score between the two graders was κ=0.45, which indicated moderate agreement between the EEG grades. The most common grade of EEG abnormality was grade 0 (continuous), which made up 63% of the epochs graded in consensus. We first trained two baseline reference models using the full-duration recording and the labels of the two graders, which achieved 71% and 57% accuracy. We achieved 82% overall accuracy in classifying selected patterns graded in consensus into seven grades using a multi-class classifier, though this model did not outperform the two baseline models when evaluated with the respective graders’ labels. In addition, we achieved 67% accuracy in classifying all patterns from the full-duration recording using a multilabel classifier.
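    The following sketch shows the kind of pipeline the abstract describes: a multi-class support vector machine trained on consensus-graded epochs, plus Cohen's kappa for inter-rater agreement. The features and grader labels are synthetic placeholders, not the BABA Center data.

```python
# Minimal sketch of the grading pipeline (synthetic placeholder data).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(1)
n_epochs, n_features, n_grades = 600, 24, 7

X = rng.normal(size=(n_epochs, n_features))        # e.g. band powers per epoch
y_grader1 = rng.integers(0, n_grades, n_epochs)    # grades 0..6 from grader 1
y_grader2 = np.where(rng.random(n_epochs) < 0.7,   # grader 2 mostly agrees
                     y_grader1, rng.integers(0, n_grades, n_epochs))

# Inter-rater agreement between the two graders
print("kappa:", cohen_kappa_score(y_grader1, y_grader2))

# Train a multi-class SVM only on epochs graded in consensus
consensus = y_grader1 == y_grader2
clf = SVC(kernel="rbf", C=1.0, decision_function_shape="ovr")
scores = cross_val_score(clf, X[consensus], y_grader1[consensus], cv=5)
print("consensus-epoch accuracy:", scores.mean())
```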
  • Aaltonen, Topi (2024)
    Positron annihilation lifetime spectroscopy (PALS) is a method used to analyse the properties of materials, namely their composition and what kinds of defects they might contain. PALS is based on the annihilation of positrons with the electrons of a studied material. The average lifetime of a positron coming into contact with a studied material depends on the density of electrons in the surroundings of the positron, with higher densities of electrons naturally resulting in faster annihilations on average. Introducing positrons in a material and recording the annihilation times results in a spectrum that is, in general, a noisy sum of exponential decays. These decay components have lifetimes that depend on the different density areas present in the material, and relative intensities that depend on the fractions of each area in the material. Thus, the problem in PALS is inverting the spectrum to get the lifetimes and intensities, a problem known as exponential analysis in general. A convolutional neural network architecture was trained and tested on simulated PALS spectra. The aim was to test whether simulated data could be used to train a network that could predict the components of PALS spectra accurately enough to be usable on spectra gathered from real experiments. Reasons for testing the approach included trying to make the analysis of PALS spectra more automated and decreasing user-induced bias compared to some other approaches. Additionally, the approach was designed to require few computational resources, ideally being trainable and usable on a single computer. Overall, testing showed that the approach has some potential, but the prediction performance of the network depends on the parameters of the components of the target spectra, with likely issues being similar to those reported in previous literature. In turn, the approach was shown to be sufficiently automatable, particularly once training has been performed. Further, while some bias is introduced in specifying the variation of the training data used, this bias is not substantial. Finally, the network can be trained without considerable computational requirements within a sensible time frame.
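    A small sketch of the forward model behind such simulated training data: a PALS spectrum as a Poisson-noisy sum of exponential decay components. The lifetimes, intensities, and count level below are made-up values for illustration only.

```python
# Simulating a PALS spectrum as a noisy sum of exponential decays
# (component lifetimes in ns and intensities are illustrative).
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 20.0, 1024)            # time axis, ns

lifetimes = np.array([0.16, 0.38, 1.8])     # component lifetimes tau_i
intensities = np.array([0.75, 0.20, 0.05])  # relative intensities I_i (sum to 1)
n_counts = 1_000_000                        # total annihilation events

ideal = sum(I / tau * np.exp(-t / tau) for I, tau in zip(intensities, lifetimes))
ideal /= ideal.sum()

# Counting noise: each channel is Poisson-distributed around the ideal curve
spectrum = rng.poisson(ideal * n_counts)

# A network trained on many such (spectrum -> lifetimes, intensities) pairs
# would take `spectrum` as input and regress the component parameters.
```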
  • Kovanen, Veikko (2020)
    Real estate appraisal, or property valuation, requires strong expertise in order to be performed successfully, thus being a costly process to produce. However, with structured data on historical transactions, the use of machine learning (ML) enables automated, data-driven valuation which is instant, virtually costless and potentially more objective compared to traditional methods. Yet, fully ML-based appraisal is not widely used in real business applications, as the existing solutions are not sufficiently accurate and reliable. In this study, we introduce an interpretable ML model for real estate appraisal using hierarchical linear modelling (HLM). The model is learned and tested with an empirical dataset of apartment transactions in the Helsinki area, collected during the past decade. As a result, we introduce a model which has competitive predictive performance, while being simultaneously explainable and reliable. The main outcome of this study is the observation that hierarchical linear modelling is a very promising approach for automated real estate appraisal. The key advantage of HLM over alternative learning algorithms is its balance of performance and simplicity: this algorithm is complex enough to avoid underfitting but simple enough to be interpretable and easy to productize. Particularly, the ability of these models to output complete probability distributions quantifying the uncertainty of the estimates makes them suitable for actual business use cases where high reliability is required.
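    As an illustration of the modelling approach, the sketch below fits a hierarchical (mixed-effects) linear price model with a random intercept per district using statsmodels. The column names and simulated data are assumptions for illustration, not the thesis dataset.

```python
# Hierarchical linear model sketch: random intercept per district,
# fixed effect for apartment size (simulated, hypothetical data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 2000
districts = rng.integers(0, 20, n)
area = rng.uniform(25, 120, n)
district_effect = rng.normal(0, 800, 20)[districts]
price_per_m2 = 4500 + district_effect - 10 * (area - 60) + rng.normal(0, 300, n)

df = pd.DataFrame({"price_per_m2": price_per_m2, "area": area,
                   "district": districts.astype(str)})

model = smf.mixedlm("price_per_m2 ~ area", df, groups=df["district"])
result = model.fit()
print(result.summary())
```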
  • Lintunen, Milla (2023)
    Fault management in mobile networks is required for detecting, analysing, and fixing problems appearing in the mobile network. When a large problem appears in the mobile network, multiple alarms are generated from the network elements. Traditionally, the Network Operations Center (NOC) processes the reported failures, creates trouble tickets for problems, and performs a root cause analysis. However, alarms do not reveal the root cause of the failure, and the correlation of alarms is often complicated to determine. If the network operator can correlate alarms and manage clustered groups of alarms instead of separate ones, it saves costs, preserves the availability of the mobile network, and improves the quality of service. Operators may have several electricity providers, and the network topology is not correlated with the electricity topology. Additionally, network sites and other network elements are not evenly distributed across the network. Hence, we investigate the suitability of density-based clustering methods for detecting mass outages and performing alarm correlation to reduce the number of created trouble tickets. This thesis focuses on assisting the root cause analysis and detecting correlated power and transmission failures in the mobile network. We implement a Mass Outage Detection Service and form a custom density-based algorithm. Our service performs alarm correlation and creates clusters of possible power and transmission mass outage alarms. We have filed a patent application based on the work done in this thesis. Our results show that we are able to detect mass outages in real time from the data streams. The results also show that the detected clusters reduce the number of created trouble tickets and help reduce the costs of running the network. The number of trouble tickets decreases by 4.7-9.3% for the alarms we process in the service in the tested networks. When we consider only alarms included in the mass outage groups, the reduction is over 75%. Therefore, continuing to use, test, and develop the implemented Mass Outage Detection Service is beneficial for operators and automated NOC.
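    A toy sketch of density-based alarm correlation in the spirit described above, clustering alarms that are close in space and time with DBSCAN so that each cluster maps to a single trouble ticket. The coordinates, time scaling, and thresholds are assumptions; this is not the patented custom algorithm.

```python
# Density-based alarm correlation sketch (illustrative data and thresholds).
import numpy as np
from sklearn.cluster import DBSCAN

# Each alarm: (site x in km, site y in km, time offset in minutes)
alarms = np.array([
    [1.0, 1.1, 0], [1.2, 0.9, 2], [0.9, 1.0, 5],    # likely one mass outage
    [10.5, 3.2, 1], [10.7, 3.1, 4],                  # another outage
    [25.0, 17.0, 40],                                # isolated alarm
])

# Scale time so that roughly 10 minutes count like 1 km in the distance metric
X = alarms * np.array([1.0, 1.0, 0.1])

labels = DBSCAN(eps=1.5, min_samples=2).fit_predict(X)
for cluster in sorted(set(labels) - {-1}):
    members = np.where(labels == cluster)[0]
    print(f"mass outage group {cluster}: alarms {members.tolist()} -> one ticket")
print("isolated alarms:", np.where(labels == -1)[0].tolist())
```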
  • Mäki, Niklas (2023)
    Most graph neural network architectures take the input graph as granted and do not assign any uncertainty to its structure. In real life, however, data is often noisy and may contain incorrect edges or exclude true edges. Bayesian methods, which consider the input graph as a sample from a distribution, have not been deeply researched, and most existing research only tests the methods on small benchmark datasets such as citation graphs. As often is the case with Bayesian methods, they do not scale well for large datasets. The goal of this thesis is to research different Bayesian graph neural network architectures for semi-supervised node classification and test them on larger datasets, trying to find a method that improves the baseline model and is scalable enough to be used with graphs of tens of thousands of nodes with acceptable latency. All the tests are done twice with different amounts of training data, since Bayesian methods often excel with low amounts of data and in real life labeled data can be scarce. The Bayesian models considered are based on the graph convolutional network, which is also used as the baseline model for comparison. This thesis finds that the impressive performance of the Bayesian graph neural networks does not generalize to all datasets, and that the existing research relies too much on the same small benchmark graphs. Still, the models may be beneficial in some cases, and some of them are quite scalable and could be used even with moderately large graphs.
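    For reference, the baseline graph convolutional layer mentioned above follows the propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W); the sketch below implements it on a toy graph. A Bayesian variant would additionally treat the adjacency matrix as uncertain and average predictions over sampled graphs.

```python
# Plain numpy sketch of a graph convolutional layer on a toy graph.
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU

rng = np.random.default_rng(4)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)          # 4-node toy graph
H = rng.normal(size=(4, 8))                        # node features
W = rng.normal(size=(8, 3))                        # learned weights (3 classes)
print(gcn_layer(A, H, W))
```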
  • Paulamäki, Henri (2019)
    Tailoring a hybrid surface or any complex material to have functional properties that meet the needs of an advanced device or drug requires knowledge and control of the atomic level structure of the material. The atomistic configuration can often be the decisive factor in whether the device works as intended, because the materials' macroscopic properties - such as electrical and thermal conductivity - stem from the atomic level. However, such systems are difficult to study experimentally and have so far been infeasible to study computationally due to costly simulations. I describe the theory and practical implementation of a 'building block'-based Bayesian Optimization Structure Search (BOSS) method to efficiently address heterogeneous interface optimization problems. This machine learning method is based on accelerating the identification of a material's energy landscape with respect to the number of quantum mechanical (QM) simulations executed. The acceleration is realized by applying a likelihood-free Bayesian inference scheme to evolve a Gaussian process (GP) surrogate model of the target landscape. During this active learning, various atomic configurations are iteratively sampled by running static QM simulations. An approximation of using chemical building blocks reduces the search phase space to manageable dimensions. This way the most favored structures can be located with as little computation as possible. Thus it is feasible to do structure search with large simulation cells while still maintaining high chemical accuracy. The BOSS method was implemented as a Python code called aalto-boss between 2016 and 2019, with me as the main author in co-operation with Milica Todorović and Patrick Rinke. I conducted a dimensional scaling study using analytic functions, which quantified the scaling of BOSS efficiency for fundamentally different functions as dimension increases. The results revealed the important role of the target function's derivative in the optimization efficiency. The outcome will help users choose simulation variables that are efficient to optimize, as well as estimate roughly how many BOSS iterations are potentially needed until convergence. The predictive efficiency and accuracy of BOSS were showcased in the conformer search of the alanine dipeptide molecule. The two most stable conformers and the characteristic 2D potential energy map were found with greatly reduced effort compared to alternative methods. The value of BOSS in novel materials research was showcased in the surface adsorption study of biphenyldicarboxylic acid on a CoO thin film using DFT simulations. We found two adsorption configurations which had a lower energy than previous calculations and approximately supported the experimental data on the system. The three applications showed that BOSS can significantly reduce the computational load of atomistic structure search while maintaining predictive accuracy. It allows materials scientists to study novel materials more efficiently, and thus helps tailor the materials' properties to better suit the needs of modern devices.
  • Mäkelä, Noora (2022)
    Sum-product networks (SPN) are graphical models capable of handling large amounts of multidimensional data. Unlike many other graphical models, SPNs are tractable if certain structural requirements are fulfilled; a model is called tractable if probabilistic inference can be performed in polynomial time with respect to the size of the model. The learning of SPNs can be separated into two modes, parameter and structure learning. Many earlier approaches to SPN learning have treated the two modes as separate, but it has been found that good results can be achieved by alternating between them. One example of this kind of algorithm was presented by Trapp et al. in the article Bayesian Learning of Sum-Product Networks (NeurIPS, 2019). This thesis discusses SPNs and a Bayesian learning algorithm developed based on the aforementioned algorithm, differing in some of the methods used. The algorithm by Trapp et al. uses Gibbs sampling in the parameter learning phase, whereas here Metropolis-Hastings MCMC is used. The algorithm developed for this thesis was used in two experiments, with a small and simple SPN and with a larger and more complex SPN. Also, the effect of the data set size and the complexity of the data was explored. The results were compared to the results obtained from running the original algorithm developed by Trapp et al. The results show that having more data in the learning phase makes the results more accurate, as it is easier for the model to spot patterns in a larger set of data. It was also shown that the model was able to learn the parameters in the experiments if the data were simple enough, in other words, if the dimensions of the data contained only one distribution per dimension. In the case of more complex data, where there were multiple distributions per dimension, the difficulty of the computation was evident in the results.
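    A generic Metropolis-Hastings sketch of the sampler family used here for parameter learning; the toy one-dimensional target below merely stands in for an SPN posterior.

```python
# Random-walk Metropolis-Hastings sketch (toy 1-D target, not an SPN).
import numpy as np

def log_target(theta):
    # Example: unnormalized log-posterior of a N(2, 1) "parameter"
    return -0.5 * (theta - 2.0) ** 2

def metropolis_hastings(log_p, theta0=0.0, n_steps=5000, step=0.5, seed=5):
    rng = np.random.default_rng(seed)
    theta, samples = theta0, []
    for _ in range(n_steps):
        proposal = theta + rng.normal(0, step)       # symmetric random walk
        if np.log(rng.random()) < log_p(proposal) - log_p(theta):
            theta = proposal                          # accept
        samples.append(theta)                         # otherwise keep current
    return np.array(samples)

samples = metropolis_hastings(log_target)
print("posterior mean estimate:", samples[1000:].mean())   # discard burn-in
```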
  • Koski, Jessica (2021)
    Acute lymphoblastic leukemia (ALL) is a hematological malignancy that is characterized by uncontrolled proliferation and blocked maturation of lymphoid progenitor cells. It is divided into B- and T-cell types, both of which have multiple subtypes defined by different somatic genetic changes. Also, germline predisposition has been found to play an important role in multiple hematological malignancies, and several germline variants that contribute to ALL risk have already been identified in pediatric and familial settings. There are only a few studies including adult ALL patients, but thanks to findings in acute myeloid leukemia, where germline predisposition was found to concern adult patients as well, there is now more interest in studying adult patients. The prognosis of adult ALL patients is much worse compared to pediatric patients, and many still lack clear genetic markers for diagnosis. Thus, identifying genetic lesions affecting ALL development is important in order to improve treatments and prognosis. Germline studies can provide additional insight into the predisposition and development of ALL when there are no clear somatic biomarkers. Single nucleotide variants are usually of interest when identifying biomarkers from the genome, but structural variants can also be studied. Their coverage of the genome is higher than that of single nucleotide variants, which makes them suitable candidates for exploring association with prognosis. Copy number changes can be detected from next generation sequencing data, although the detection specificity and sensitivity vary a lot between different software. The current approach is to identify the most likely regions with copy number change by using multiple tools and to later validate the findings experimentally. In this thesis the copy number changes in germline samples of 41 adult ALL patients were analyzed using ExomeDepth, CODEX2 and CNVkit.
  • Kähärä, Jaakko (2022)
    We study the properties of flat band states of bosons and their potential for all-optical switching. Flat bands are dispersionless energy bands found in certain lattice structures. The corresponding eigenstates, called flat band states, have the unique property of being localized to a small region of the lattice. High sensitivity of flat band lattices to the effects of interactions could make them suitable for fast, energy-efficient switching. We use the Bose-Hubbard model and computational methods to study multi-boson systems by simulating the time-evolution of the particle states and computing the particle currents. As the systems were small, fewer than ten bosons, the results could be computed exactly. This was done by solving the eigenstates of the system Hamiltonian using exact diagonalization. We focus on a finite-length sawtooth lattice, first simulating weakly interacting bosons initially in a flat band state. Particle current is shown to typically increase linearly with interaction strength. However, by fine-tuning the hopping amplitudes and boundary potentials, particle current through the lattice is highly suppressed. We use this property to construct a switch which is turned on by pumping the input with control photons. Inclusion of particle interactions disrupts the system, resulting in a large non-linear increase in particle current. We find that certain flat band lattices could be used as a medium for an optical switch capable of controlling the transport of individual photons. In practice, highly optically nonlinear materials are required to reduce the switching time, which is found to be inversely proportional to the interaction strength.
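    A minimal exact-diagonalization sketch in the spirit of the computations described above, but for a two-site Bose-Hubbard model with N bosons rather than the sawtooth lattice; the parameter values are illustrative.

```python
# Exact diagonalization of a two-site Bose-Hubbard model (toy example).
import numpy as np

N = 4                     # number of bosons
J, U = 1.0, 0.5           # hopping amplitude and on-site interaction
basis = [(n1, N - n1) for n1 in range(N + 1)]   # Fock states |n1, n2>
index = {state: i for i, state in enumerate(basis)}

H = np.zeros((len(basis), len(basis)))
for (n1, n2), i in index.items():
    # On-site interaction: (U/2) * sum_i n_i (n_i - 1)
    H[i, i] = 0.5 * U * (n1 * (n1 - 1) + n2 * (n2 - 1))
    # Hopping b1^dag b2: |n1, n2> -> sqrt((n1+1) n2) |n1+1, n2-1>, plus h.c.
    if n2 > 0:
        j = index[(n1 + 1, n2 - 1)]
        H[j, i] += -J * np.sqrt((n1 + 1) * n2)
        H[i, j] += -J * np.sqrt((n1 + 1) * n2)

energies, states = np.linalg.eigh(H)
print("ground-state energy:", energies[0])

# Time evolution of an initial Fock state under H (used for particle currents)
psi0 = np.zeros(len(basis)); psi0[index[(N, 0)]] = 1.0
t = 1.0
psi_t = states @ (np.exp(-1j * energies * t) * (states.conj().T @ psi0))
```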
  • Horn, Matthew (2024)
    Long term monitoring programs gather important data, including phenological data, to understand population trends and manage biodiversity. The sampling of such data can suffer from left-censoring, where the first occurrence of an event coincides with the first sampling time. This can lead to overestimation of the timing of species’ life history events and obscure phenological trends. This study develops a Bayesian survival model to predict and impute the true first occurrence times of Finnish moths in a given sampling season in left-censored cases, thereby estimating the amount of left-censoring and effectively "decensoring" the data. A simulation study was done to test the model on synthetic data and explore how effect size, the severity of censoring, and sampling frequency affect the inference. Forward feature selection was done over environmental covariates for a generalized linear survival model with logit link, incorporating both left-censoring and interval censoring. Five-fold cross validation was done to select the best model and see which covariates would be added during the feature selection process. The validation tested the model both in its ability to predict points that were not left-censored and those that were artificially left-censored. The final model included terms for cumulative Growing Degree Days, cumulative Chilling Days, mean spring temperature, cumulative rainfall, and daily minimum temperature, in addition to an intercept term. It was trained on all of the data, and predictions were made for the true first occurrence times of the left-censored sites and years.
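    The sketch below shows how left- and interval-censored observations enter a discrete-time, logit-link survival likelihood of the kind described above; the covariates, coefficients, and censoring days are placeholders, not the fitted model.

```python
# Discrete-time survival likelihood with left- and interval-censoring (sketch).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def survival_curve(hazards):
    """S(t) = prod_{s <= t} (1 - h_s), prepended with S(0) = 1."""
    return np.concatenate(([1.0], np.cumprod(1.0 - hazards)))

# Daily hazard of first occurrence from a linear predictor (e.g. GDD, rainfall)
X = np.random.default_rng(6).normal(size=(120, 2))   # 120 days x 2 covariates
beta0, beta = -4.0, np.array([0.8, 0.3])
h = sigmoid(beta0 + X @ beta)
S = survival_curve(h)

# Interval-censored: first occurrence observed between sampling days a and b
a, b = 40, 47
loglik_interval = np.log(S[a] - S[b])

# Left-censored: moths already flying at the first sampling day c,
# so all we know is that the first occurrence was at or before day c
c = 35
loglik_left = np.log(1.0 - S[c])
print(loglik_interval, loglik_left)
```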
  • Koivula, Juho (2021)
    The literature review surveys various methods for the synthesis of C3-substituted indoles from 2-alkenylaniline-type starting materials whose benzylic position was substituted. Particular attention is paid to methods for which radical mechanisms were proposed as the reaction mechanism. These proposed radical mechanisms are also presented in the thesis. The experimental work studied the oxidative synthesis of C3-substituted indoles using a carbon catalyst. Benzylically aryl-substituted 2-alkenylaniline derivatives were used as starting materials. In some starting materials, a methoxypyridine was attached to the nitrogen, and a Buchwald catalysis was developed for attaching it. The carbon-catalyzed reactions gave good yields. High electron density, especially on the benzene ring of the aniline and/or on the aromatic substituent of the nitrogen, was beneficial. The reaction mechanism is proposed to begin with oxidation to a radical cation, and these oxidation potentials were calculated according to a previously reported method. When the substitution pattern of the indole five-membered ring (N1, C2, C3) was sufficiently similar, high electron density, low oxidation potential, and good yield correlated. The substitution of the indole five-membered ring is, however, a more significant factor than the oxidation potential and/or high electron density. Pyridine worked as a protecting group for the nitrogen in the catalysis and could be removed easily. Methoxypyridine worked well in the catalysis, but its quantitative removal was not successful.
  • Porna, Ilkka (2022)
    Despite development in many areas of machine learning in recent decades, a change in data sources between the domain in which a model is trained and the domain in which the same model is used for predictions remains a fundamental and common problem. In the area of domain adaptation, these circumstances have been studied by incorporating causal knowledge about the information flow between features into the feature selection for the model. That work has shown promising results in accomplishing so-called invariant causal prediction, which means that prediction performance is immune to changes between domains. Within these approaches, recognizing the Markov blanket of the target variable has served as the principal workhorse for finding the optimal starting point. In this thesis, we continue to investigate closely the property of invariant prediction performance within Markov blankets of the target variable. Also, some scenarios with latent parents involved in the Markov blanket are included to understand the role of the covariates around the latent parent in the invariant prediction properties. Before the experiments, we cover the concepts of Markov blankets, structural causal models, causal feature selection, covariate shift, and target shift. We also look into ways to measure bias between changing domains by introducing transfer bias and incomplete information bias, as these biases play an important role in feature selection, often presenting a trade-off between the two. In the experiments, simulated data sets are generated from structural causal models to conduct the testing scenarios with the changing conditions of interest. With different scenarios, we investigate changes in the features of Markov blankets between training and prediction domains. Some scenarios involve changes in latent covariates as well. As a result, we show that parent features are generally steady predictors enabling invariant prediction. An exception is a changing target, which basically requires more information about the changes from other earlier domains to enable invariant prediction. Also, when latent parents are present, it is important to have some real direct causes in the feature sets to achieve invariant prediction performance.
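    A toy structural causal model illustrating the invariance property discussed above: a predictor built only on the parent of the target keeps its performance when a non-parent covariate shifts between domains, while a model using all features degrades. The data-generating equations below are illustrative assumptions, not the thesis setup.

```python
# Invariant prediction sketch with a toy SCM: X_parent -> Y -> X_child.
import numpy as np
from sklearn.linear_model import LinearRegression

def simulate(n, shift, seed):
    rng = np.random.default_rng(seed)
    X_parent = rng.normal(0, 1, n)                 # direct cause of Y
    Y = 2.0 * X_parent + rng.normal(0, 0.5, n)
    X_child = Y + rng.normal(shift, 0.5, n)        # effect of Y; shifts by domain
    return np.column_stack([X_parent, X_child]), Y

X_tr, y_tr = simulate(5000, shift=0.0, seed=7)     # training domain
X_te, y_te = simulate(5000, shift=3.0, seed=8)     # shifted prediction domain

parent_only = LinearRegression().fit(X_tr[:, [0]], y_tr)
all_feats = LinearRegression().fit(X_tr, y_tr)

print("parent-only R2 on shifted domain:", parent_only.score(X_te[:, [0]], y_te))
print("all-features R2 on shifted domain:", all_feats.score(X_te, y_te))
```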
  • Tynkkynen, Jere (2022)
    This thesis comprises two parts: a literature review discussing the recent development of electrochemical gas sensors for pollutant detection and the use of sensor nodes in real-life locations, and an experimental section focusing on a kinetic study of nitrogen containing compounds utilizing an in-tube extraction device. Growing interest in personal safety has led to the development of low-cost electrochemical sensors for personal safety, indoor air quality and leak detection applications. Heterojunctions and light illumination have emerged as effective ways to improve sensor performance, but the selectivity of electrochemical sensors remains relatively poor. Multiple sensors can be combined to create ‘E-noses’, which significantly improve the selectivity and compound identification. These E-noses have been deployed in some indoor locations, either stationary in sensor networks or moved around by a robot or drone. All approaches have benefits and caveats associated with them, with the differences between individual sensors limiting sensor network use, and slow response and recovery times limiting the use of moving sensors. A novel micropump system was constructed for active air sampling together with in-tube extraction (ITEX) and thermal desorption gas chromatography-mass spectrometry (TD-GC-MS). The repeatability of this method was tested in a kinetic study of 10 selected nitrogen containing compounds in a custom-built permeation chamber. The breakthrough times and volumes of the compounds were investigated. Kinetic modelling was successful for 9 out of the 10 compounds, with 1 compound behaving significantly differently from the rest. The breakthrough times were always over 20 minutes and breakthrough volumes were around the 1000 ml region. Reproducibility was tested with multiple ITEX devices, and samples were taken from five indoor locations. Three of the tested compounds were found in some of the samples.
  • Anttila, Jesse (2020)
    Visual simultaneous localization and mapping (visual SLAM) is a method for consistent self-contained localization using visual observations. Visual SLAM can produce very precise pose estimates without any specialized hardware, enabling applications such as AR navigation. The use of visual SLAM in very large areas and over long distances is not presently possible due to a number of significant scalability issues. In this thesis, these issues are discussed and solutions for them explored, culminating in a concept for a real-time city-scale visual SLAM system. A number of avenues for future work towards a practical implementation are also described.
  • Martinmäki, Tatu (2020)
    Molecular imaging is visualization, characterization and quantification of biological processes at the molecular and cellular levels of living organisms, achieved by molecular imaging probes and techniques such as radiotracer imaging, magnetic resonance imaging and ultrasound imaging. Molecular imaging is an important part of patient care. It allows detection and localization of disease at early stages, and it is also an important tool in drug discovery and development. Positron emission tomography (PET) is a biomedical imaging technique considered one of the most important advances in biomedical sciences. PET is used for a variety of biomedical applications, e.g. imaging of divergent metabolism, oncology and neurology. PET is based on the incorporation of positron emitting radionuclides into drug molecules. As the prominent radionuclides used in PET have short or ultra-short half-lives, the radionuclide is most often incorporated into the precursor in the last step of the synthesis. This has proven to be a challenge with novel targeted radiotracers, as the demand for high specific activity leads to harsh reaction conditions, often with extreme pH and heat, which could denature the targeting vector. Click chemistry is a synthetic approach based on modular building blocks. The concept was originally developed for purposes of drug discovery and development. It has been widely utilized in radiopharmaceutical development for conjugating prosthetic groups or functional groups to precursor molecules. Click chemistry reactions are highly selective and fast due to a thermodynamic driving force, and they occur with high kinetics in mild reaction conditions, which makes the concept ideal for the development and production of PET radiopharmaceuticals. Isotope exchange (IE) radiosynthesis with trifluoroborate moieties is an alternative labeling strategy for reasonably high yield 18F labeling of targeted radiopharmaceuticals. As the labeling conditions in IE are milder than in the commonly utilized nucleophilic fluorination, the scope of targeting vectors can be extended to labile biomolecules expressing highly specific binding to drug targets, resulting in higher contrast in PET imaging. A trifluoroborate-functionalized prosthetic group 3 was synthesized utilizing click chemistry reactions, purified with SPE and characterized with HPLC-MS and NMR (1H-, 11B-, 13C- and 19F-NMR). [18F]3 was successfully radiolabeled with an RCY of 20.1 %, an incorporation yield of 22.3 ± 11.4 % and an RCP of >95 %. A TCO-functionalized TOC-peptide precursor 6 was synthesized from a commercial octreotide precursor and a commercially available click chemistry building block via oxime bond formation. 6 was characterized with HPLC-MS and purified with semi-preparative HPLC. The final product [18F]7 was produced in a two-step radiosynthesis via IEDDA conjugation of [18F]3 and 6. [18F]7 was produced with an RCY of 1.0 ± 1.0 %, an RCP of >95 % and an estimated molar activity of 0.7 ± 0.8 GBq/µmol. A cell uptake study was conducted with [18F]7 in the AR42J cell line. Internalization and specific binding to SSTR2 were observed in vitro.
  • Laaksonen, Jenniina (2021)
    Understanding customer behavior is one of the key elements in any thriving business. Dividing customers into different groups based on their distinct characteristics can help significantly when designing the service. Understanding the unique needs of customer groups is also the basis for modern marketing. The aim of this study is to explore what types of customer groups exist in an entertainment service business. In this study, customer segmentation is conducted with k-prototypes, a variation of k-means clustering. K-prototypes is a machine learning approach that partitions a group of observations into subgroups. These subgroups have little variation within the group and clear differences when compared to other subgroups. The advantage of k-prototypes is that it can process both categorical and numeric data efficiently. The results show that there are significant and meaningful differences between the customer groups emerging from k-prototypes clustering. These customer groups can be targeted based on their unique characteristics, and their reactions to different types of marketing actions vary. The unique characteristics of the customer groups can be utilized to target marketing actions better. Other possibilities to benefit from customer segmentation include personalized views, recommendations and support for strategy-level decision making when designing the service. Many of these require further technical development or a deeper understanding of the segments. Data selection as well as the quality of the data has an impact on the results, and these should be considered carefully when deciding future actions on customer segmentation.
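    A hedged sketch of k-prototypes segmentation using the third-party kmodes package (pip install kmodes); the customer attributes, cluster count, and simulated data are illustrative, not the study's dataset.

```python
# k-prototypes clustering of mixed numeric and categorical customer data
# (illustrative attributes; assumes the `kmodes` package is installed).
import numpy as np
from kmodes.kprototypes import KPrototypes

rng = np.random.default_rng(9)
n = 500
spend = rng.gamma(2.0, 30.0, n)                    # numeric: monthly spend
visits = rng.poisson(4, n).astype(float)           # numeric: visits per month
channel = rng.choice(["app", "web", "onsite"], n)  # categorical: main channel

X = np.empty((n, 3), dtype=object)
X[:, 0], X[:, 1], X[:, 2] = spend, visits, channel

kproto = KPrototypes(n_clusters=4, init="Cao", random_state=0)
labels = kproto.fit_predict(X, categorical=[2])    # column 2 is categorical

for k in range(4):
    mask = labels == k
    print(f"segment {k}: n={mask.sum()}, mean spend={spend[mask].mean():.1f}")
```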
  • Koivisto, Teemu (2021)
    Programming courses often receive large quantities of program code submissions to exercises which, due to their large number, are graded and students are provided feedback automatically. Teachers might never review these submissions, therefore losing a valuable source of insight into student programming patterns. This thesis researches how these submissions could be reviewed efficiently using a software system, and a prototype, CodeClusters, was developed as an additional contribution of this thesis. CodeClusters' design goals are to allow the exploration of the submissions and specifically the finding of higher-level patterns that could be used to provide feedback to students. Its main features are full-text search and an n-gram similarity detection model that can be used to cluster the submissions. Design science research is applied to evaluate CodeClusters' design and to guide the next iteration of the artifact, and qualitative analysis, namely thematic synthesis, is used to evaluate the problem context as well as the ideas of using software for reviewing and providing clustered feedback. The study method used was interviews conducted with teachers who had experience teaching programming courses. Teachers were intrigued by the ability to review submitted student code and to provide more tailored feedback to students. The system, while still a prototype, is considered worthwhile to experiment with on programming courses. A tool for analyzing and exploring submissions seems important to enable teachers to better understand how students have solved the exercises. Providing additional feedback can be beneficial to students, yet the feedback should be valuable and the students incentivized to read it.
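    As a sketch of the kind of n-gram similarity and clustering the CodeClusters model is based on (not its actual implementation), the snippet below vectorizes submissions with character n-gram TF-IDF and groups them with cosine-based agglomerative clustering.

```python
# Character n-gram similarity and clustering of code submissions (sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering

submissions = [
    "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s",
    "def total(nums):\n    acc = 0\n    for n in nums:\n        acc = acc + n\n    return acc",
    "def total(xs):\n    return sum(xs)",
]

vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
X = vectorizer.fit_transform(submissions).toarray()

clustering = AgglomerativeClustering(n_clusters=2, metric="cosine", linkage="average")
labels = clustering.fit_predict(X)
print(labels)   # the two loop-based solutions should land in the same cluster
```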
  • Pritom Kumar, Das (2024)
    Many researchers use dwell time, a measure of engagement with information, as a universal measure of importance when judging the relevance and usefulness of information. However, it may not fully account for individual cognitive variations. This study investigates how individual differences in cognitive abilities, specifically working memory and processing speed, can significantly impact dwell time. We examined the browsing behavior of 20 individuals engaged in information-intensive tasks, measuring their cognitive abilities, tracking their web page visits, and assessing their perceived relevance and usefulness of the information encountered. Our findings show a clear connection between working memory, processing speed, and dwell time. Based on this finding, we developed a model that combines both cognitive abilities and dwell time to predict the relevance and usefulness of web pages. Interestingly, our analysis reveals that cognitive abilities, particularly fluid intelligence, significantly influence dwell time. This highlights the importance of incorporating individual cognitive differences into prediction models to improve their accuracy. Thus, personalized services that set dwell time thresholds based on individual users' cognitive abilities could provide more accurate estimations of what users find relevant and useful.
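    A small sketch of the idea of combining dwell time with per-user cognitive measures in a relevance classifier; the synthetic data and coefficients below are assumptions, not the study's model.

```python
# Comparing a dwell-only relevance model with one that adds cognitive measures
# (synthetic, hypothetical data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(10)
n = 400
working_memory = rng.normal(0, 1, n)        # standardized test score
processing_speed = rng.normal(0, 1, n)
dwell_time = rng.lognormal(3.0, 0.6, n)     # seconds on page

# Ground truth: slower processors need longer dwell for the same relevance
logit = 0.04 * dwell_time * (1 - 0.3 * processing_speed) + 0.5 * working_memory - 2.0
relevant = rng.random(n) < 1 / (1 + np.exp(-logit))

X_dwell = dwell_time.reshape(-1, 1)
X_full = np.column_stack([dwell_time, working_memory, processing_speed])

clf = LogisticRegression(max_iter=1000)
print("dwell only:", cross_val_score(clf, X_dwell, relevant, cv=5).mean())
print("dwell + cognition:", cross_val_score(clf, X_full, relevant, cv=5).mean())
```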
  • Brasseur, Paul (2021)
    Plasmonics is an emerging field which has shown applications in photocatalysis. Here we investigate a gold/platinum bimetallic catalytic system and try to show how the catalytic properties of gold nanoparticles can be used to harvest visible light energy to increase the catalytic activity of platinum. Platinum being a rare and expensive metal, we also took the opportunity to find the optimal amount of catalyst to reduce platinum use. The catalyst is composed of a spherical gold nanoparticle core of around 15 nm diameter. The particles were synthesized in solution using an inverse Turkevich method based on trisodium citrate and a gold precursor salt. Various amounts of platinum were deposited on these nanoparticles using a seeded growth method. The amount of platinum varied from single atoms to an atomic monolayer. This suspension of nanoparticles was deposited on ultrafine silica powder to be used for certain reactions and characterization. The material was characterized via several techniques. UV-Visible and Diffuse Reflectance Spectroscopy were used to characterize its optical properties and showed an absorption peak around 524 nm, characteristic of gold nanoparticles of this size. Imaging was done using electron microscopy (SEM and TEM) to study the morphology and showed monodisperse and spherical particles. The exact compositions of the different catalysts were obtained using Atomic Emission Spectroscopy. The study was conducted by using reduction reactions as tests to investigate differences in conversion and selectivity under dark and monochromatic 525 nm and 427 nm light conditions. We chose to work on the reduction of 4-nitrophenol, phenylacetylene and nitrobenzene, because they are widely used both in research and industry, and are easy to set up. Some catalysts showed good enhancement under 525 nm light, especially the one with the least amount of platinum. Different selectivities were also observed, indicating the presence of different reaction pathways under light conditions.