
Browsing by Title

  • Nissinen, Ulla (University of Helsinki, 2003)
    Hematology analyzers are instruments that count and classify blood cells. From a given volume of blood they count the red cells, white cells and platelets, and they additionally report the haematocrit, haemoglobin and red cell indices. The blood cells of different animal species and humans vary considerably in size, so an instrument developed for counting human blood cells is not directly suitable for analysing animal samples. In addition, the instruments use different diluents and reagents whose effects on the cells vary between species. The literature review presents the instruments currently on the market for veterinary laboratories and outlines their operating principles. In the experimental part, the white blood cell differential count produced by the Cell-Dyn 3700 analyser was first compared with a manual differential count made from a blood smear. The material consisted of 65 blood samples taken from cats, from which both an automated and a manual white cell differential count were made. The results obtained by the two methods were similar for neutrophils and lymphocytes. For monocytes the Cell-Dyn 3700 results were slightly higher, and for eosinophils slightly lower, than the manual counts. Based on the results, the Cell-Dyn 3700 can be used with its current settings for white cell differential counts of feline blood samples. In the second part, feline haematological reference values were determined with the Cell-Dyn 3700. For this purpose, blood samples were collected from 43 healthy cats, all of which were sedated at the time of sampling. The reference values were calculated using a non-parametric method. For the white cell values the reference intervals were approximately the same as in the literature. Of the red cell values, the reference intervals for haematocrit, haemoglobin and red cell count were higher than the literature values, while the red cell indices (MCV, MCH and MCHC) were approximately the same as in the sources.
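    A non-parametric reference interval of the kind used here is simply the central 95% of the observed values, read off from the 2.5th and 97.5th percentiles; a minimal sketch with invented haematocrit values (not the thesis data):

    ```python
    import numpy as np

    # Hypothetical haematocrit measurements from healthy, sedated cats (not the thesis data).
    hct = np.array([0.32, 0.35, 0.38, 0.41, 0.36, 0.44, 0.39, 0.37, 0.42, 0.33,
                    0.40, 0.45, 0.34, 0.38, 0.43, 0.36, 0.41, 0.39, 0.35, 0.46])

    # Non-parametric reference interval: central 95% of the observations,
    # i.e. the 2.5th and 97.5th percentiles, without any distributional assumption.
    low, high = np.percentile(hct, [2.5, 97.5])
    print(f"Reference interval: {low:.2f}-{high:.2f}")
    ```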
  • Lumme, Eero (2021)
    The thesis examines the views presented during legislative drafting on the need to regulate automated decision-making in public administration. Automated decision-making is a technology based on data and algorithms that can make decisions either fully autonomously or together with a human actor. Decision automation is already widely used in Finnish public administration. Critical data and algorithm studies have described several harms associated with the use of these technologies. In addition to the risks attributed to the technology, earlier research has also considered how the technology and its risks should be governed, for example through legislation. The theoretical framework of the thesis is the sociotechnical imaginary, a concept belonging to the tradition of science and technology studies and grounded in the sociology of expectations. It refers to a collective, institutionally established and publicly articulated vision of a desirable future. A sociotechnical imaginary is understood both to be shaped by and to shape sociotechnical change in a reciprocal interaction between technology and society. The legislative drafting examined here opens a window onto the process through which the sociotechnical imaginary of automated decision-making is being formed. The Constitutional Law Committee has issued several statements finding the legislation on automated decision-making deficient. In early 2020 the Ministry of Justice launched legislative drafting on the technology. After a preliminary study, the Ministry of Justice published an assessment memorandum, which was circulated to stakeholders for comment; the Ministry received a total of 65 statements. The assessment memorandum and the statements form the data of the thesis, which was analysed using qualitative, theory-guided content analysis. From the combination of documents, method and theory, a thematic analysis was constructed that assembled a picture of the elements of the sociotechnical imaginary of automated decision-making. Three central themes were identified: the societal role of the technology, views on its regulation, and the relation between the state and the rest of society. The technology's role was perceived as that of a tool producing common good and Finnish competitiveness. An enabling and technology-neutral law was seen as producing high-quality and efficient administration. On the other hand, strict statutory regulation was also widely supported on the basis of the risks attributed to the technology. The difficulty of precisely defining the concepts of discretion, learning AI, and human and machine agency complicated the weighing of risks and benefits in the data. Legislative drafting was perceived as a participatory dialogue between the state and civil society, and the role of citizens was seen more as membership in a democratic state governed by the rule of law than as consumer-citizenship choosing between services. The picture of the sociotechnical imaginary of automated decision-making obtained through the legislative drafting is not yet fully established and collectively shared; it continues to take shape as the legislative process proceeds.
  • Pietarinen, Julius (2023)
    Soil compaction has a major effect on soil fertility, and it is important to locate compacted regions in the field so that preventive action can be taken. With a soil penetrometer it is possible to find compacted layers in the soil, but hand-operated penetrometers are physically demanding to use. Automating the measurement with a machine removes the physical strain from the user and makes it possible to acquire larger amounts of data that are more accurate and consistent than manual measurements. Machine automation achieves a constant penetration speed and higher forces than manual operation, and the result does not depend on the user. The automated penetrometer (AP) built in this work can be attached to an ATV or another small off-road machine such as a field robot. It was built using an S-type force sensor and a ball screw driven by a stepper motor; the force sensor is mounted on a sled running on linear guide rails. System control is handled by an Arduino microcontroller and data processing by a Raspberry Pi minicomputer. Surface moisture is measured during operation with a Meter Teros 10 capacitive soil moisture sensor. Initial tests were carried out at the Viikki research farm, Helsinki, Finland, on a perennial grass ley field, and the same measurements were made in the same areas with a proven commercial Eijelkamp soil penetrologger. The AP was found to be a practical and useful measurement device. With material costs of 2000 euros it is significantly cheaper than the commercial device, and its data showed smaller variance than the Eijelkamp. The open-source code makes it easy to modify and change the machine, and the integrated computer makes real-time data collection and processing easy. The measurement device will later be integrated into an automated field robot.
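    As a rough illustration of the data-processing side, a minimal sketch of how the Raspberry Pi could turn logged force readings into penetration resistance (force divided by the cone base area); the cone size, the one-sample-per-centimetre assumption and the invented force log are illustrative, not taken from the thesis:

    ```python
    import math

    CONE_DIAMETER_M = 0.0127   # assumed 12.7 mm cone; the actual cone size is not given here
    CONE_AREA_M2 = math.pi * (CONE_DIAMETER_M / 2) ** 2

    def penetration_resistance(forces_newton):
        """Convert logged force readings (N) into penetration resistance (Pa)."""
        return [f / CONE_AREA_M2 for f in forces_newton]

    # Invented force log from one insertion, sampled at a constant penetration speed.
    forces = [55.0, 80.5, 120.0, 310.0, 290.0, 150.0]
    for depth_cm, resistance in enumerate(penetration_resistance(forces)):
        print(f"{depth_cm} cm: {resistance / 1e6:.2f} MPa")
    ```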
  • Rannisto, Henri (2022)
    Finland's aviation weather observations are undergoing a transition towards automation. Automated observations suffer from quality problems, which motivated a broader study of the topic. The research material was a verification table that the observers at Rovaniemi airport have filled in since 2011: at the moment a manual observation is made, the values offered by the automated system for the different weather quantities are written down. The compared parameters are visibility, cloud base and present weather. The values determined by the automated system and by the human observer were cross-tabulated separately for each of the three parameters. The results did not give a very good picture of the current quality of automated observations, as significant shortcomings in the accuracy and timeliness of the values were found for all three parameters. The differences from the human observations, assumed to be accurate, were so large that questions arose about flight safety and about the sensibility of using automated observations. Based on the results, significant improvements to the observing system are proposed, together with temporary manualisation of the observations for the duration of the improvement process. In addition to the actual research part, the thesis reviews the theory of Finnish aviation weather observations, examines the basic principles of the manual and automated observing methods in more depth, and outlines the history of aviation weather observations in Finland.
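    The core of the comparison is a cross-tabulation of automated against manual values for each parameter; a minimal pandas sketch with invented present-weather categories rather than the Rovaniemi data:

    ```python
    import pandas as pd

    # Invented example records: (manual observation, automated observation) for present weather.
    records = [
        ("snow", "snow"), ("snow", "no precipitation"), ("rain", "rain"),
        ("rain", "rain"), ("no precipitation", "no precipitation"),
        ("snow", "snow"), ("rain", "no precipitation"),
    ]
    df = pd.DataFrame(records, columns=["manual", "automatic"])

    # Cross-tabulate: rows are the human observation, columns the automated value.
    table = pd.crosstab(df["manual"], df["automatic"])
    print(table)

    # Simple agreement rate along the diagonal.
    agreement = (df["manual"] == df["automatic"]).mean()
    print(f"Agreement: {agreement:.0%}")
    ```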
  • Kinnunen, Samuli (2024)
    Chemical reaction optimization is an iterative process that aims to identify the reaction conditions that maximize reaction output, typically yield. The evolution of optimization techniques has progressed from intuitive approaches to simple heuristics and, more recently, to statistical methods such as the Design of Experiments approach. Bayesian optimization, which iteratively updates beliefs about a response surface and suggests parameters that both exploit conditions near the known optima and explore uncharted regions, has shown promising results by reducing the number of experiments needed to find the optimum in various optimization tasks. In chemical reaction optimization, the method makes it possible to minimize the number of experiments required to find the optimal reaction conditions. Automated tools such as pipetting robots hold potential to accelerate optimization by executing multiple reactions concurrently. Integrating Bayesian optimization with automation not only reduces the manual workload but also improves throughput and optimization efficiency. However, the adoption of these advanced techniques faces a barrier, as chemists often lack proficiency in machine learning and programming. To bridge this gap, the Automated Chemical Reaction Optimization Software (ACROS) is introduced. This tool orchestrates an optimization loop: Bayesian optimization suggests reaction candidates, the parameters are translated into commands for a pipetting robot, the robot executes the operations, a chemist interprets the results, and the data are fed back to the software, which suggests the next reaction candidates. The optimization tool was evaluated empirically on a numerical test function, on a direct arylation reaction dataset, and in real-time optimization of Sonogashira and Suzuki coupling reactions. The findings demonstrate that Bayesian optimization efficiently identifies optimal conditions, outperforming the Design of Experiments approach, particularly when optimizing discrete parameters in batch settings. Three acquisition functions (Expected Improvement, Log Expected Improvement and Upper Confidence Bound) were compared. It can be concluded that expected improvement-based methods are more robust, especially in batch settings with process constraints.
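    A minimal sketch of the kind of loop ACROS automates: Gaussian-process Bayesian optimization with an Expected Improvement acquisition function on a toy one-dimensional "yield" function; the toy function, kernel and candidate grid are illustrative assumptions, not the thesis implementation:

    ```python
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def yield_experiment(x):
        """Toy stand-in for running a reaction and measuring its yield."""
        return np.exp(-(x - 0.6) ** 2 / 0.05) + 0.05 * np.random.randn()

    candidates = np.linspace(0, 1, 201).reshape(-1, 1)   # discretized condition space
    X = np.array([[0.1], [0.9]])                          # two initial experiments
    y = np.array([yield_experiment(x[0]) for x in X])

    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

    for _ in range(10):
        gp.fit(X, y)
        mu, sigma = gp.predict(candidates, return_std=True)
        best = y.max()
        # Expected Improvement acquisition function (maximization form).
        z = (mu - best) / np.maximum(sigma, 1e-9)
        ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
        x_next = candidates[np.argmax(ei)]
        X = np.vstack([X, x_next])
        y = np.append(y, yield_experiment(x_next[0]))

    print("Best conditions found:", X[np.argmax(y)], "yield:", y.max())
    ```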
  • Thapa Magar, Purushottam (2021)
    Rapid growth and advancement of next generation sequencing (NGS) technologies have changed the landscape of genomic medicine. Today, clinical laboratories perform DNA sequencing on a regular basis, which is an error-prone process. Erroneous data affect downstream analysis and produce fallacious results. Therefore, external quality assessment (EQA) of laboratories working with NGS data is crucial. Validation of variations such as single nucleotide polymorphisms (SNPs) and indels (<50 bp) is fairly accurate these days. However, detection and quality assessment of large changes such as copy number variations (CNVs) continues to be a concern. In this work, we aimed to study the feasibility of automated CNV concordance analysis for laboratory EQA services. We benchmarked variants reported by 25 laboratories against the highly curated gold standard for the son (HG002/NA24385) of the Ashkenazim trio from the Personal Genome Project, published by the Genome in a Bottle Consortium (GIAB). We employed two methods to assess CNV concordance: sequence-based comparison with Truvari and an in-house exome-based comparison. For the deletion calls of two whole genome sequencing (WGS) submissions, Truvari achieved precision greater than 88% and recall greater than 68%. Conversely, the in-house method's precision and recall peaked at 39% and 7.9%, respectively, for one WGS submission covering both deletion and duplication calls. The results indicate that automated CNV concordance analysis of deletion calls for WGS-based callsets might be feasible with Truvari. On the other hand, for panel-based targeted sequencing the deletion calls showed precision and recall ranging from 0-80% and 0-5.6%, respectively, with Truvari. This suggests that automated concordance analysis of CNVs for targeted sequencing remains a challenge. In conclusion, CNV concordance analysis depends on how the sequence data are generated.
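    A minimal sketch of an interval-overlap concordance check in the spirit of the exome-based comparison, matching called deletions to truth-set deletions by 50% reciprocal overlap; the threshold and the toy intervals are assumptions, not the thesis pipeline or Truvari itself:

    ```python
    def reciprocal_overlap(a, b):
        """Fraction of reciprocal overlap between two (chrom, start, end) intervals."""
        if a[0] != b[0]:
            return 0.0
        overlap = max(0, min(a[2], b[2]) - max(a[1], b[1]))
        return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))

    def concordance(calls, truth, threshold=0.5):
        """Precision and recall of CNV calls against a truth set."""
        tp = sum(any(reciprocal_overlap(c, t) >= threshold for t in truth) for c in calls)
        found = sum(any(reciprocal_overlap(t, c) >= threshold for c in calls) for t in truth)
        precision = tp / len(calls) if calls else 0.0
        recall = found / len(truth) if truth else 0.0
        return precision, recall

    # Toy deletion calls (chrom, start, end) for illustration only.
    truth = [("chr1", 1000, 5000), ("chr2", 20000, 30000)]
    calls = [("chr1", 1200, 4800), ("chr3", 500, 900)]
    print(concordance(calls, truth))   # (0.5, 0.5)
    ```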
  • Ilse, Tse (2019)
    Background: Electroencephalography (EEG) depicts electrical activity in the brain and can be used in clinical practice to monitor brain function. In neonatal care, physicians can use continuous bedside EEG monitoring to determine the cerebral recovery of newborns who have suffered birth asphyxia, which creates a need for frequent, accurate interpretation of the signals over the monitoring period. An automated grading system can aid physicians in the Neonatal Intensive Care Unit by automatically distinguishing between different grades of abnormality in the neonatal EEG background activity patterns. Methods: This thesis describes using a support vector machine as a base classifier to classify seven grades of EEG background pattern abnormality in data provided by the BAby Brain Activity (BABA) Center in Helsinki. We are particularly interested in reconciling the manual grading of EEG signals by independent graders, and we analyze the inter-rater variability of EEG grading by comparing a classifier built from selected epochs graded in consensus with a classifier built from the full-duration recordings. Results: The inter-rater agreement between the two graders was κ=0.45, which indicates moderate agreement between the EEG grades. The most common grade of EEG abnormality was grade 0 (continuous), which made up 63% of the epochs graded in consensus. We first trained two baseline reference models using the full-duration recordings and the labels of the two graders, which achieved 71% and 57% accuracy. We achieved 82% overall accuracy in classifying selected patterns graded in consensus into seven grades using a multi-class classifier, though this model did not outperform the two baseline models when evaluated against the respective graders' labels. In addition, we achieved 67% accuracy in classifying all patterns from the full-duration recordings using a multilabel classifier.
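    A minimal sketch of the multi-class SVM step, assuming feature vectors have already been extracted from EEG epochs; the feature dimensionality and random placeholder data stand in for the BABA Center material:

    ```python
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(700, 40))          # placeholder epoch features (not real EEG)
    y = rng.integers(0, 7, size=700)        # seven abnormality grades, 0-6

    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    # RBF-kernel SVM; scikit-learn handles the multi-class case internally (one-vs-one).
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X_train, y_train)
    print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))
    ```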
  • Valta, Akseli Eero Juhana (2023)
    Puumala orthohantavirus (PUUV) is a single-stranded negative-sense RNA virus carried by the bank vole (Myodes glareolus). Like other orthohantaviruses, it does not cause visible symptoms in the host species, but when transmitted to humans it can cause a mild form of hemorrhagic fever with renal syndrome (HFRS) called nephropathia epidemica (NE). PUUV is the only pathogenic orthohantavirus endemic to Finland, where it has a relatively high incidence of approximately 35 per 100 000 inhabitants, or 1000 to 3000 diagnosed cases annually. Here we describe a miniaturized immunofluorescence assay (mini-IFA) for measuring the antibody response against PUUV from bank vole whole blood and heart samples as well as from patient serum samples. The method outline was based on the work done by Pietiäinen et al. (2022), but it was adapted for the detection of PUUV antibodies. Transfected cells expressing the PUUV structural proteins (N, GPC, Gn and Gc) were used instead of PUUV-infected cells, which allowed all steps to be performed outside biosafety level 3 (BSL3) conditions. The method also enables the simultaneous measurement of the IgM, IgA and IgG antibody responses from each sample in a more efficient and higher-throughput manner than traditional immunofluorescence methods. Our results show that the method is effective for testing large numbers of samples for PUUV antibodies and that it gives quick and convenient access to high-quality images that can be used both for detecting interesting targets for future studies and for producing a visual archive of the test results.
  • Aaltonen, Topi (2024)
    Positron annihilation lifetime spectroscopy (PALS) is a method used to analyse the properties of materials, namely their composition and the kinds of defects they may contain. PALS is based on the annihilation of positrons with the electrons of a studied material. The average lifetime of a positron introduced into a studied material depends on the density of electrons in the surroundings of the positron, with higher electron densities naturally resulting in faster annihilations on average. Introducing positrons into a material and recording the annihilation times results in a spectrum that is, in general, a noisy sum of exponential decays. These decay components have lifetimes that depend on the different density regions present in the material, and relative intensities that depend on the fractions of each region in the material. Thus, the problem in PALS is inverting the spectrum to obtain the lifetimes and intensities, a problem known more generally as exponential analysis. A convolutional neural network architecture was trained and tested on simulated PALS spectra. The aim was to test whether simulated data could be used to train a network that could predict the components of PALS spectra accurately enough to be usable on spectra gathered from real experiments. Reasons for testing the approach included making the analysis of PALS spectra more automated and decreasing user-induced bias compared to some other approaches. Additionally, the approach was designed to require few computational resources, ideally being trainable and usable on a single computer. Overall, testing showed that the approach has some potential, but the prediction performance of the network depends on the parameters of the components of the target spectra, with the likely issues being similar to those reported in previous literature. In turn, the approach was shown to be sufficiently automatable, particularly once training has been performed. Further, while some bias is introduced in specifying the variation of the training data used, this bias is not substantial. Finally, the network can be trained without considerable computational requirements within a sensible time frame.
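    A minimal sketch of the idea: simulate two-component spectra and train a small one-dimensional convolutional network to regress the lifetimes and the first component's intensity; the component ranges, network size and the choice of PyTorch are illustrative assumptions, not the thesis architecture:

    ```python
    import numpy as np
    import torch
    import torch.nn as nn

    def simulate_spectrum(bins=256, t_max=20.0):
        """Simulated two-component PALS spectrum: a noisy sum of exponential decays."""
        tau = np.random.uniform([0.1, 1.0], [1.0, 5.0])    # two lifetimes (ns)
        i1 = np.random.uniform(0.2, 0.8)                    # intensity of component 1
        t = np.linspace(0, t_max, bins)
        counts = i1 * np.exp(-t / tau[0]) + (1 - i1) * np.exp(-t / tau[1])
        counts = np.random.poisson(counts * 1e4) / 1e4      # counting noise
        return counts.astype(np.float32), np.array([tau[0], tau[1], i1], dtype=np.float32)

    X, Y = zip(*[simulate_spectrum() for _ in range(2000)])
    X = torch.tensor(np.stack(X)).unsqueeze(1)   # shape (N, 1, bins)
    Y = torch.tensor(np.stack(Y))

    model = nn.Sequential(
        nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
        nn.Conv1d(16, 16, kernel_size=7, padding=3), nn.ReLU(),
        nn.AdaptiveAvgPool1d(8), nn.Flatten(),
        nn.Linear(16 * 8, 3),                     # outputs: tau1, tau2, intensity1
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for epoch in range(20):                       # full-batch training, toy setting
        opt.zero_grad()
        loss = loss_fn(model(X), Y)
        loss.backward()
        opt.step()
    ```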
  • Kovanen, Veikko (2020)
    Real estate appraisal, or property valuation, requires strong expertise to be performed successfully, which makes it a costly process. However, with structured data on historical transactions, the use of machine learning (ML) enables automated, data-driven valuation that is instant, virtually costless and potentially more objective than traditional methods. Yet fully ML-based appraisal is not widely used in real business applications, as the existing solutions are not sufficiently accurate and reliable. In this study, we introduce an interpretable ML model for real estate appraisal using hierarchical linear modelling (HLM). The model is learned and tested with an empirical dataset of apartment transactions in the Helsinki area, collected during the past decade. As a result, we introduce a model that has competitive predictive performance while being simultaneously explainable and reliable. The main outcome of this study is the observation that hierarchical linear modelling is a very promising approach for automated real estate appraisal. The key advantage of HLM over alternative learning algorithms is its balance of performance and simplicity: the algorithm is complex enough to avoid underfitting but simple enough to be interpretable and easy to productize. In particular, the ability of these models to output complete probability distributions quantifying the uncertainty of the estimates makes them suitable for actual business use cases where high reliability is required.
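    A minimal sketch of a hierarchical (mixed-effects) price model with statsmodels, using an invented apartment table; the column names and the use of postal code as the grouping level are assumptions, not the thesis specification:

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    # Invented transaction data: price, floor area, age and a postal-code grouping.
    df = pd.DataFrame({
        "price":       [210000, 180000, 350000, 325000, 150000, 420000, 230000, 195000],
        "area_m2":     [45, 38, 72, 68, 33, 85, 50, 41],
        "age_years":   [30, 55, 10, 12, 60, 5, 25, 40],
        "postal_code": ["00100", "00100", "00200", "00200", "00300", "00200", "00300", "00100"],
    })

    # A random intercept per postal code captures neighbourhood-level price differences.
    model = smf.mixedlm("price ~ area_m2 + age_years", df, groups=df["postal_code"])
    result = model.fit()
    print(result.summary())
    ```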
  • Kallonen, Leo (2020)
    RPA (robotic process automation) is an emerging field in software engineering that is applied in a wide variety of industries to automate repetitive business processes. While the tools for creating RPA projects have evolved quickly, testing in these projects has not yet received much attention. The purpose of this thesis was to study how the regression testing of RPA projects created with UiPath could be automated while avoiding the most common pitfalls of test automation projects: unreliability, excessive cost, lack of reusable components and overly difficult implementation. An automated regression test suite was created as a case study with UiPath for an existing RPA project that is currently being tested manually. The results imply that UiPath can be used to create the regression test suite as well, not just the RPA project. The automated test suite could be used to run all the tests in the regression test suite that is currently run manually. The common test automation pitfalls were also mostly avoided: the structure of the project can be reused for other test projects, the project can recover from unexpected errors, and the implementation of the tests does not require a high level of programming knowledge. The main challenge proved to be the implementation cost, which was increased by the longer than expected test development time. Another finding was that the measures taken to address test automation pitfalls will likely work only with RPA projects that are no more complex than the sample RPA project. With more complex projects, there will also likely be more challenges with test data creation. As a result, for complex projects, manual regression testing could be a better option.
  • Vainio, Antero (2020)
    Nowadays the Internet is used as a platform for providing a wide variety of different services. This has created challenges related to scaling IT infrastructure management. Cloud computing is a popular solution for scaling infrastructure, either by building a self-hosted cloud or by using a cloud platform provided by an external organization. In this way, some of the challenges related to large scale can be transferred to the cloud administrators. OpenStack is a group of open-source software projects for running cloud platforms, and it is currently the most commonly used software for building private clouds. Since it was initially published by NASA and Rackspace, it has been used by various organizations such as Walmart, China Mobile and the CERN nuclear research institute. The largest production deployments of OpenStack clouds consist of thousands of physical server computers located in multiple datacenters. The OpenStack community has created many deployment methods that take advantage of automated software configuration management. The deployment methods are built with state-of-the-art software for automating different administrative tasks, and they take different approaches to automating infrastructure management for OpenStack. This thesis compares some of the automated deployment methods for OpenStack and examines the benefits of using automation for configuration management. We present comparisons based on technical documentation as well as reference literature. Additionally, we conducted a questionnaire for OpenStack administrators about the use of automation. Lastly, we tested one of the deployment methods in a virtualized environment.
  • Stenudd, Juho (2013)
    This Master's thesis describes one example of how to automatically generate tests for real-time protocol software. Automatic test generation is performed using model-based testing (MBT). In model-based testing, test cases are generated from a behaviour model of the system under test (SUT); this model expresses the requirements of the SUT. Many parameters can be varied and test sequences randomised. In this context, real-time protocol software means a system component of a Nokia Siemens Networks (NSN) Long Term Evolution (LTE) base station. This component, named MAC DATA, is the system under test in this study. 3GPP has standardised the protocol stack for the LTE eNodeB base station, and MAC DATA implements most of the functionality of the Medium Access Control (MAC) and Radio Link Control (RLC) protocols, two protocols of the LTE eNodeB. Because complex telecommunication software is discussed here, implementing MBT for MAC DATA system component testing is challenging. First, the expected behaviour of the system component has to be modelled. Because it is not sensible to model everything, the most relevant parts of the system component that need to be tested have to be identified, and the most important parameters have to be selected from the huge parameter space; these parameters have to be varied and randomised. With MBT, a vast number of different kinds of users can be created, which is not feasible in manual test design, and generating a very long test case takes only a short computing time. In addition to functional testing, MBT is used in performance and worst-case testing by executing a long test case based on traffic models, and it was found to be suitable for this challenging performance and worst-case testing. This study uses three traffic models: smartphone-dominant, laptop-dominant and mixed. MBT is integrated into a continuous integration (CI) system, which automatically runs MBT test case generation and execution overnight. The main advantage of the MBT implementation is the possibility to create different kinds of users and simulate real-life system behaviour. This way, hidden defects can be found in the test environment and the SUT.
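    A minimal, generic sketch of how model-based testing derives test sequences from a behaviour model, here as a random walk over a toy state machine; the states and actions are invented and are unrelated to the actual MAC/RLC model:

    ```python
    import random

    # Toy behaviour model: state -> {action: next_state}. Invented, not the MAC DATA model.
    MODEL = {
        "idle":      {"connect": "connected"},
        "connected": {"send_data": "connected", "disconnect": "idle", "timeout": "idle"},
    }

    def generate_test_sequence(length=10, seed=None):
        """Generate one test case as a random walk through the model."""
        rng = random.Random(seed)
        state, steps = "idle", []
        for _ in range(length):
            action = rng.choice(list(MODEL[state]))
            steps.append((state, action))
            state = MODEL[state][action]
        return steps

    for state, action in generate_test_sequence(length=8, seed=42):
        print(f"{state:>10} --{action}-->")
    ```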
  • Lehtimäki, Laura (2019)
    The assessment of nonverbal interaction is currently based on observations, interviews and questionnaires, and quantitative methods for assessing nonverbal interaction are few. Novel technology allows new ways to perform assessment, and new methods are constantly being developed. Many of them are based on movement tracking by sensors, cameras and computer vision. In this study the use of OpenPose, a pose estimation algorithm, was investigated for detecting nonverbal interactional events. The aim was to find out whether the same meaningful interactional events could be found from videos by the algorithm and by human annotators. Another purpose was to find out the best way to annotate the videos in a study like this. The research material consisted of four videos of a child and a parent blowing soap bubbles. The videos were first run through OpenPose to track the poses of the child and the parent frame by frame. The data obtained by the algorithm were further processed in Matlab to extract the activities of the child and the parent, the coupling of the activities, and the closeness of the child's and the parent's hands at each time point. The videos were manually annotated in two different ways: both the basic units, such as gaze directions and the handling of the soap bubble jar, and the interactional events, such as communication initiatives, turn-taking and joint attention, were annotated. The results obtained by the algorithm were visually compared to the annotations. The communication initiatives and turn-taking could be seen as peaks in hand closeness and as alternation in activities. However, interaction events were not the only causes of changes in hand closeness and in activities, so they could not be distinguished from other actions solely by these factors. There was also interaction that was not related to jar handling, which could not be seen from the hand closeness curves. With the current recording arrangements, gaze directions could not be detected by the algorithm, and therefore the moments of joint attention could not be determined either. To enable the detection of gaze directions in future studies, the faces of the subjects should be visible at all times. Distinguishing individual interaction events may not be the best way to assess interaction, and the focus of assessment should be on global units, such as synchrony between interaction partners. The best way to annotate the videos depends on the aim of the study.
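    A minimal sketch of one derived quantity, the closeness of the child's and the parent's hands, computed frame by frame as the distance between wrist keypoints; the simplified keypoint layout below is an assumption rather than the exact OpenPose output used in the thesis:

    ```python
    import numpy as np

    def hand_closeness(child_wrists, parent_wrists):
        """Per-frame minimum distance between any child wrist and any parent wrist.

        Both inputs are arrays of shape (frames, 2, 2): two wrists, (x, y) each.
        """
        # Pairwise distances between the 2 child and 2 parent wrists in every frame.
        diff = child_wrists[:, :, None, :] - parent_wrists[:, None, :, :]
        dist = np.linalg.norm(diff, axis=-1)          # shape (frames, 2, 2)
        return dist.min(axis=(1, 2))                  # closest hand pair per frame

    # Invented keypoint tracks for three frames.
    child = np.array([[[100, 200], [110, 210]],
                      [[105, 205], [112, 208]],
                      [[130, 220], [140, 230]]], dtype=float)
    parent = np.array([[[300, 200], [310, 190]],
                       [[150, 205], [160, 200]],
                       [[135, 222], [145, 228]]], dtype=float)
    print(hand_closeness(child, parent))   # distance drops as the hands approach each other
    ```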
  • Lintunen, Milla (2023)
    Fault management in mobile networks is required for detecting, analysing and fixing problems appearing in the mobile network. When a large problem appears in the network, multiple alarms are generated by the network elements. Traditionally, the Network Operations Center (NOC) processes the reported failures, creates trouble tickets for problems and performs a root cause analysis. However, alarms do not reveal the root cause of the failure, and the correlation of alarms is often complicated to determine. If the network operator can correlate alarms and manage clustered groups of alarms instead of separate ones, it saves costs, preserves the availability of the mobile network and improves the quality of service. Operators may have several electricity providers, and the network topology is not correlated with the electricity topology. Additionally, network sites and other network elements are not evenly distributed across the network. Hence, we investigate the suitability of density-based clustering methods for detecting mass outages and performing alarm correlation to reduce the number of created trouble tickets. This thesis focuses on assisting root cause analysis and detecting correlated power and transmission failures in the mobile network. We implement a Mass Outage Detection Service and formulate a custom density-based algorithm. Our service performs alarm correlation and creates clusters of possible power and transmission mass outage alarms. We have filed a patent application based on the work done in this thesis. Our results show that we are able to detect mass outages in real time from the data streams. The results also show that the detected clusters reduce the number of created trouble tickets and help reduce the costs of running the network. The number of trouble tickets decreases by 4.7-9.3% for the alarms we process in the service in the tested networks. When we consider only alarms included in the mass outage groups, the reduction is over 75%. Therefore, continuing to use, test and develop the implemented Mass Outage Detection Service is beneficial for operators and an automated NOC.
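    A minimal sketch of density-based alarm correlation, clustering alarms by site coordinates and alarm time with DBSCAN; the feature scaling, eps value and invented alarm records are assumptions, not the custom algorithm described in the thesis:

    ```python
    import numpy as np
    from sklearn.cluster import DBSCAN

    # Invented alarms: (x_km, y_km, minutes since midnight) for each alarming site.
    alarms = np.array([
        [1.0, 2.0, 600], [1.2, 2.1, 602], [0.9, 1.8, 605],   # likely one mass outage
        [40.0, 35.0, 610],                                     # isolated alarm elsewhere
        [41.0, 36.0, 900], [1.1, 2.0, 601],
    ])

    # Scale time so that ~10 minutes is comparable to ~1 km before clustering.
    features = alarms.copy()
    features[:, 2] = features[:, 2] / 10.0

    labels = DBSCAN(eps=2.0, min_samples=3).fit_predict(features)
    print(labels)   # alarms sharing a label form one candidate mass-outage group; -1 = noise
    ```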
  • Suomalainen, Lauri (2019)
    Hybrid clouds are one of the most notable trends in the current cloud computing paradigm, and bare-metal cloud computing is also gaining traction. This has created a demand for hybrid cloud management and abstraction tools. In this thesis I identify shortcomings in Cloudify's ability to handle generic bare-metal nodes. Cloudify is an open-source, vendor-agnostic hybrid cloud tool which allows using generic consumer-grade computers as cloud computing resources. It is not, however, capable of automatically managing hosts joining and leaving the cluster network, nor does it retrieve any hardware data from the hosts, making cluster management arduous and manual. I have designed and implemented a system which automates cluster creation and management and retrieves useful hardware data from hosts. I also perform experiments using the system which validate its correctness, usefulness and expandability.
  • Gafurova, Lina (2018)
    Automatic fall detection is a very important challenge in the public health care domain. The problem primarily concerns the growing population of the elderly, who are at considerably higher risk of falling, and whose falls may result in serious injuries or even death. In this work we propose a solution for fall detection based on machine learning, which can be integrated into a monitoring system as a detector of falls in image sequences. Our approach is solely camera-based and is intended for indoor environments. For successful detection of falls, we utilize a combination of the variation of the human shape, determined with the help of an approximating ellipse, and the motion history. The feature vectors that we build are computed for sliding time windows of the input images and are fed to a Support Vector Machine for accurate classification. The decision for the whole set of images is based on additional rules, which help restrict the sensitivity of the method. To evaluate our fall detector fairly, we conducted extensive experiments on a wide range of normal activities, which we contrasted with the falls. Reliable recognition rates suggest the effectiveness of our algorithm and motivate further improvement.
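    A minimal sketch of the shape-variation part, fitting an ellipse to a person silhouette with OpenCV and reading off its orientation and aspect ratio (a standing silhouette is tall and near-vertical, a fallen one flat and near-horizontal); the synthetic masks and the two features are illustrative assumptions, not the full feature vector of the thesis:

    ```python
    import cv2
    import numpy as np

    def ellipse_features(mask):
        """Orientation (degrees) and aspect ratio of the ellipse fitted to the largest blob."""
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        contour = max(contours, key=cv2.contourArea)
        (cx, cy), axes, angle = cv2.fitEllipse(contour)
        return angle, max(axes) / min(axes)

    # Synthetic silhouettes: a tall "standing" blob and a wide "fallen" blob.
    standing = np.zeros((240, 320), np.uint8)
    cv2.rectangle(standing, (150, 40), (170, 200), 255, -1)
    fallen = np.zeros((240, 320), np.uint8)
    cv2.rectangle(fallen, (60, 150), (260, 175), 255, -1)

    print(ellipse_features(standing))   # elongated, roughly vertical orientation
    print(ellipse_features(fallen))     # elongated, roughly horizontal orientation
    ```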
  • Sutinen, Marjo (2017)
    This Master's thesis deals with the automatic generation of multiple-choice cloze (gap-fill) exercises for practising Finnish word inflection. Cloze exercises are a popular format in language learning and language assessment. Because their form is fairly well controlled, automating their creation has been the goal of several academic and commercial projects over the past couple of decades, and the task has proved challenging. If a cloze exercise is generated simply by removing a word from a sentence and asking the learner to fill it in, the exercise easily fails to be meaningful: a gap produced this way often admits many alternative words or constructions. One of the biggest challenges in cloze generation is therefore so-called gap reliability: ensuring that answers that fit the gap can be distinguished from answers that do not. One way to ensure this is to restrict the set of possible answers by providing answer options that are known to be wrong. The challenge for automatic generation then becomes finding options that are known to be wrong, yet not wrong in too obvious a way: choosing the correct option must pose a meaningful challenge to the learner. The main aim of this thesis is to study the generation of reliable and potentially challenging multiple-choice cloze exercises for practising Finnish word inflection. The method tested in the experimental part has previously been applied successfully to a comparable purpose in the context of English prepositions. The method searches a large text corpus for prepositions that frequently occur as a collocate of one context word of the gap but never as a collocate of two context words at the same time. My aim is to show that the method can also be applied to generating Finnish inflection exercises. I also test the use of different types of corpora for the task, namely n-grams based on adjacency on the one hand and n-grams based on syntactic dependency structure on the other. In addition to the experimental work, the thesis comprehensively analyses different ways of constructing inflection cloze exercises and presents a cloze exercise model that I devised. The key finding is that the method can increase the reliability of cloze exercises considerably: in test cases where the available data is sufficient according to a few simple criteria, up to 80% of the initially unreliable gaps become reliable. Finally, I discuss the evaluation of exercise difficulty and questions of insufficient data. Regarding the latter, I argue that although resolving the data sufficiency issues that emerged would improve the results further, the method can already be considered suitable for the purpose as it is.
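    A minimal sketch of the collocation idea behind distractor selection: a candidate inflected form is accepted as a distractor if it collocates with one context word of the gap but never with both at once, so it is plausible yet known to be wrong; the tiny collocation counts are invented, not derived from a real corpus:

    ```python
    from collections import defaultdict

    # Invented collocation counts: (context word, candidate form) -> corpus frequency.
    colloc = defaultdict(int, {
        ("hän", "kauppaan"): 7,  ("juoksi", "kauppaan"): 12,
        ("hän", "kaupassa"): 15, ("juoksi", "kaupassa"): 9,
        ("juoksi", "kaupalle"): 3,
    })

    def reliable_distractors(context1, context2, candidates, correct):
        """Keep candidates that collocate with one context word but never with both."""
        chosen = []
        for cand in candidates:
            if cand == correct:
                continue
            with1 = colloc[(context1, cand)] > 0
            with2 = colloc[(context2, cand)] > 0
            if (with1 or with2) and not (with1 and with2):
                chosen.append(cand)
        return chosen

    # Gap: "Hän juoksi ____." with the correct inflected form "kauppaan" (illative).
    print(reliable_distractors("hän", "juoksi", ["kauppaan", "kaupassa", "kaupalle"], "kauppaan"))
    # -> ['kaupalle']: attested with one context word, but never with both at once.
    ```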
  • Huusari, Riikka (2016)
    This study is part of the TEKES-funded Electric Brain project of VTT and the University of Helsinki, whose goal is to develop novel techniques for automatic big data analysis. In this study we focus on potential methods for automated land cover type classification from time series satellite data. Developing techniques to identify different environments would be beneficial for monitoring the effects of natural phenomena, forest fires, urbanization and climate change. We tackle the resulting classification problem with two approaches: supervised and unsupervised machine learning methods. From the former category we use the support vector machine (SVM), while from the latter we consider the Gaussian mixture model clustering technique and its simpler variant, k-means. We introduce the techniques used in the study in chapter 1 and give the motivation for the work. A detailed discussion of the data available for this study and the methods used for analysis is presented in chapter 2, where we also present the simulated data created as a proof of concept for the methods. The results for both the simulated data and the satellite data are presented in chapter 3 and discussed in chapter 4, along with considerations for possible future work. The results suggest that support vector machines could be suitable for the task of automated land cover type identification. While the clustering methods were not as successful, we obtained accuracy as high as 93% with the supervised implementation on the data available for this study.
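    A minimal sketch of the unsupervised side, clustering per-pixel time series with a Gaussian mixture model; the synthetic reflectance series and the three invented land cover types stand in for the satellite data of the study:

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)

    # Synthetic per-pixel time series (12 time steps) for three land cover "types".
    forest = rng.normal(0.6, 0.05, size=(100, 12))
    water = rng.normal(0.1, 0.02, size=(100, 12))
    field = rng.normal(0.4, 0.08, size=(100, 12))
    X = np.vstack([forest, water, field])

    gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
    labels = gmm.predict(X)
    print(np.bincount(labels))   # roughly 100 pixels per cluster if the types separate well
    ```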
  • Vehomäki, Varpu (2022)
    Social media provides huge amounts of potential data for natural language processing, but using this data can be challenging. Finnish social media text differs greatly from standard Finnish, and models trained on standard data may not be able to adequately handle the differences. Text normalization is the process of transforming non-standard language into its standardized form. It provides a way both to process non-standard data with standard natural language processing tools and to obtain more data for training new tools for different tasks. In this thesis I experiment with bidirectional recurrent neural network (BRNN) models and models based on the ByT5 foundation model, as well as with the Murre normalizer, to see whether existing tools are suitable for normalizing Finnish social media text. I manually normalize a small set of data from the Ylilauta and Suomi24 corpora to use as a test set. For training the models I use the Samples of Spoken Finnish corpus and Wikipedia data with added synthetic noise. The results of this thesis show that there are no existing tools suitable for normalizing Finnish written on social media, and that there is a lack of suitable data for training models for this task. The ByT5-based models perform better than the BRNN models.
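    A minimal sketch of running a ByT5-style model as a byte-level sequence-to-sequence normalizer with Hugging Face transformers; the checkpoint is the public google/byt5-small base model, and a real normalizer would first be fine-tuned on pairs of noisy and normalized text, as done in the thesis:

    ```python
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    # Base checkpoint only; for actual normalization it would be fine-tuned on
    # (noisy text, normalized text) pairs such as the synthetically noised Wikipedia data.
    model_name = "google/byt5-small"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    text = "mää en kyl tiiä mist sä puhut"      # colloquial Finnish input
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```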