
Browsing by discipline "Applied Mathematics"


  • Parvio, Matti (2014)
    This thesis examines, on the basis of the fundamentals of extreme value theory, two well-known extreme value distributions and the tail probabilities they assign to a data set representing monthly losses of stock returns. The reader is assumed to have a basic knowledge of probability theory and calculus. Extreme value theory is concerned with the behaviour of the sample maxima of a data set and with their distribution. The interest thus lies in the probabilities of rarely occurring events, that is, the tail probabilities of the observations, and the aim is to analyse the probabilities of events larger than those observed up to a given moment. The thesis begins by reviewing some essential results of extreme value theory. The theory treated is specifically classical extreme value theory, in which the observations are assumed to be independent and identically distributed. A central result of extreme value theory is that if the sample maximum, normalised by suitable constants, converges in distribution to some non-degenerate distribution as the sample size grows without bound, then this limit must be of the type of one of the three standard extreme value distributions: Fréchet, Weibull or Gumbel. The distribution of the sample maximum is then said to belong to the maximum domain of attraction of that extreme value distribution. After the theoretical part, the two best-known extreme value distributions are presented. The first is the standard generalised extreme value distribution, the so-called GEV distribution, which encompasses the three standard extreme value distributions mentioned above. The second is the generalised Pareto distribution, the so-called GP distribution, whose family members likewise belong to the maximum domains of attraction of the extreme value distributions given by the GEV distribution. Both distributions make it possible, by slightly different methods, to study the occurrence of extreme values on the basis of a given data set and to extrapolate into the poorly known region of the data, the extreme value region.
To make the theory and the distributions concrete, the thesis works through an example of the kinds of results the extreme value distributions can deliver. The GEV distribution is fitted to the data with the so-called block maxima method: the data are divided into blocks of one year and the largest stock loss is picked from each block, after which the GEV distribution is fitted to these observations by maximum likelihood. The GP distribution is fitted with the so-called peaks-over-threshold method, in which the observations exceeding a certain high threshold are included; the GP distribution is then likewise fitted by maximum likelihood. The results show that both distributions appear to fit the observations reasonably well, although the GP distribution yields more versatile results. Finally, the material covered is reviewed and the presented theory is extended towards more general theory. Classical extreme value theory, with its independence assumptions, is as such not suited to real-world data; this issue is, however, largely set aside in the thesis to simplify the presentation.
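The block maxima fit described in the abstract can be sketched in a few lines. The sketch below uses `scipy.stats.genextreme` on simulated stand-in data (the thesis uses real monthly stock losses, which are not reproduced here); note that SciPy's shape parameter `c` equals −ξ in the usual GEV parametrisation.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
# Hypothetical stand-in for monthly stock losses: 30 "years" of 12 months.
losses = rng.gumbel(loc=2.0, scale=1.0, size=(30, 12))

# Block maxima method: one block per year, keep the largest loss per block.
block_maxima = losses.max(axis=1)

# Maximum likelihood fit of the GEV distribution to the block maxima.
c, loc, scale = genextreme.fit(block_maxima)

# Tail probability: chance that a yearly maximum loss exceeds a high level.
p_exceed = genextreme.sf(8.0, c, loc=loc, scale=scale)
```

The peaks-over-threshold fit would proceed analogously with `scipy.stats.genpareto` on the exceedances over a chosen high threshold.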
  • Xu, Yaya (2015)
    We show that the embedding of a matrix game in a mechanistic population dynamical model can have important consequences for the evolution of how the game is played. In particular, we show that because of this embedding, evolutionary branching and multiple evolutionary singular strategies can occur, which is not possible in the conventional theory of matrix games. We demonstrate this by means of the hawk-dove game.
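As a point of comparison, the classical (unstructured) hawk-dove game that the thesis departs from can be sketched directly: with resource value V and fight cost C > V, replicator dynamics converge to the unique interior equilibrium with a hawk fraction of V/C. The numbers below are illustrative, not from the thesis.

```python
import numpy as np

# Hawk-dove payoff matrix for resource value V and fight cost C (V < C).
V, C = 2.0, 6.0
A = np.array([[(V - C) / 2, V],       # hawk vs hawk, hawk vs dove
              [0.0,         V / 2]])  # dove vs hawk, dove vs dove

def replicator_step(x, A, dt=0.01):
    """One Euler step of the replicator dynamics x_i' = x_i((Ax)_i - x.Ax)."""
    f = A @ x
    return x + dt * x * (f - x @ f)

x = np.array([0.9, 0.1])              # start with 90% hawks
for _ in range(20000):
    x = replicator_step(x, A)

# In the classical game the unique interior equilibrium has hawk share V/C.
```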
  • Sirola, Johannes (2016)
    In this thesis we present an algorithm for mixture modeling of heterogeneous data collections. Our model supports both Gaussian and Bernoulli distributions, creating possibilities for the analysis of many different kinds of data. A major focus is placed on developing scalable inference for the proposed model, so that the algorithm can be used to analyze even large amounts of data relatively quickly. In the beginning of the thesis we review some required concepts from probability theory and then proceed to present the basic theory of an approximate inference framework called variational inference. We then move on to present the mixture modeling framework with examples of the Gaussian and Bernoulli mixture models. These models are then combined into a joint model, which we call GBMM for Gaussian and Bernoulli Mixture Model. We develop scalable and efficient variational inference for the proposed model using state-of-the-art results in Bayesian inference. More specifically, we use a novel data augmentation scheme for the Bernoulli part of the model coupled with overall algorithmic improvements such as incremental variational inference and a multicore implementation. The efficiency of the proposed algorithm over standard variational inference is highlighted in a simple toy data experiment. Additionally, we demonstrate a scalable initialization for the main inference algorithm using a state-of-the-art random projection algorithm coupled with k-means++ clustering. The quality of the initialization is studied in an experiment with two separate datasets. As an extension to the GBMM model, we also develop inference for categorical features. This proves to be rather difficult, and our presentation covers only the derivation of the required inference algorithm without a concrete implementation. We apply the developed mixture model to analyze a dataset consisting of electronic patient records collected in a major Finnish hospital.
We cluster the patients based on their usage of the hospital's services over 28-day time intervals over 7 years to find patterns that help in understanding the data better. This is done by running the GBMM algorithm on a big feature matrix with 269 columns and more than 1.7 million rows. We show that the proposed model is able to extract useful insights from the complex data, and that the results can be used as a guideline and/or preprocessing step for possible further, more detailed analysis that is left for future work.
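For readers unfamiliar with mixture models of binary data, a plain EM fit of a Bernoulli mixture (a simpler relative of the variational GBMM inference developed in the thesis) can be sketched as follows; the data are synthetic stand-ins, not the patient records.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy binary data from two Bernoulli profiles (hypothetical stand-in data).
true_p = np.array([[0.9, 0.9, 0.1, 0.1],
                   [0.1, 0.1, 0.9, 0.9]])
z = rng.integers(0, 2, size=400)
X = (rng.random((400, 4)) < true_p[z]).astype(float)

K, eps = 2, 1e-6
pi = np.full(K, 1.0 / K)                   # mixing weights
p = rng.uniform(0.3, 0.7, size=(K, 4))     # Bernoulli parameters

for _ in range(100):
    # E-step: responsibilities via component log-likelihoods.
    logr = np.log(pi) + X @ np.log(p).T + (1 - X) @ np.log(1 - p).T
    logr -= logr.max(axis=1, keepdims=True)
    r = np.exp(logr)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights and success probabilities.
    nk = r.sum(axis=0)
    pi = nk / len(X)
    p = np.clip((r.T @ X) / nk[:, None], eps, 1 - eps)
```

Variational inference replaces these point estimates with (approximate) posterior distributions over `pi` and `p`, which is what makes the thesis's incremental, large-scale version possible.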
  • Kuukka, Antti Oskari (2013)
    The subject of this Master's Thesis is the Shannon-McMillan-Breiman theorem, a famous and important result in information theory. Since the theorem is a statement about ergodic stochastic processes and its proof utilises Birkhoff's ergodic theorem, a whole chapter has been devoted to ergodic theory. Ergodic theory has developed into a large branch of mathematics, and so Chapter 1 is only a brief glance at the subject. Nevertheless, we will prove one of the most important theorems in ergodic theory, the aforementioned Birkhoff's ergodic theorem. This theorem is a strong statement about the average behaviour of certain stochastic processes (or dynamical systems), and it can be seen as a generalisation of the Strong Law of Large Numbers. Chapter 2 discusses information theory and the Shannon-McMillan-Breiman theorem. No previous knowledge of information theory is assumed, and therefore the chapter starts with an introduction to information theory. All fundamental definitions and theorems concerning the entropy of discrete random variables are provided. After this introduction, we study the entropy of stochastic processes, which in turn leads us to the Asymptotic Equipartition Property (the AEP). Informally, a stochastic process has the AEP if almost all sample paths belong to a rather thin set, called the set of typical sequences, which despite having few elements contains most of the probability mass. Then we prove that independent and identically distributed processes have the AEP, and consider its consequences and applications such as data compression. After this, we present the Shannon-McMillan-Breiman theorem, which states that stationary, ergodic processes with finite state space have the AEP. The rest of the thesis is then devoted to the rather long, but interesting, proof of the theorem.
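The AEP for i.i.d. processes is easy to check empirically: by the theorem, the per-symbol quantity -(1/n) log2 p(X_1, ..., X_n) converges to the entropy H. A minimal sketch for a biased coin:

```python
import numpy as np

rng = np.random.default_rng(2)
q = 0.3                                               # P(X = 1) for a biased coin
H = -(q * np.log2(q) + (1 - q) * np.log2(1 - q))      # entropy in bits

n = 10_000
x = (rng.random(n) < q).astype(int)
# Log-probability of the observed sample path under the true model.
logp = np.where(x == 1, np.log2(q), np.log2(1 - q)).sum()
rate = -logp / n    # the AEP says this converges to H almost surely
```

A path whose `rate` is close to `H` is exactly a "typical sequence" in the sense described above.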
  • Nikkilä, Mikko (2015)
    Often it would be useful to be able to calculate the 'shape' of a set of points even though it is not clear what is formally meant by the word 'shape.' The definitions of shapes presented in this thesis are generalizations of the convex hull of a set of points P, which is the smallest convex set that contains all points of P. The k-hull of a point set P is a generalization of the convex hull. It is the complement of the union of all such halfplanes that contain at most k points of P. One application of the k-hull is measuring the data depth of an arbitrary point of the plane with respect to P. Another generalization of the convex hull is the α-hull of a point set P. It is the complement of the union of all α-disks (disks of radius α) that contain no points of the set P in their interior. The value of α controls the detail of the α-hull: the 0-hull is P and the ∞-hull is the convex hull of P. The α-shape is the 'straight-line' version of the α-hull. The α-hull and the α-shape reconstruct the shape in a more intuitive manner than the convex hull, recognizing that the set might have multiple distinct data clusters. However, the α-hull and the α-shape are very sensitive to the outliers that are often present in real-life datasets. A single outlier can drastically change the output shape. The k-order α-hull is a generalization of both the k-hull and the α-hull, and as such it is a link between statistical data depth and shape reconstruction. The k-order α-hull of a point set P is the complement of the union of all such α-disks that contain at most k points of P. The k-order α-shape is the α-shape of those points of P that are not included in any of the α-disks. The k-order α-hull and the k-order α-shape can ignore a certain amount of the outlier data, which the α-hull and the α-shape cannot. The detail of the shape can be controlled with the parameter α and the amount of outliers ignored with the parameter k.
For example, the 0-order α-hull is the α-hull and the k-order ∞-hull is the k-hull. One possible application of the k-order α-shape is a visual representation of spatial data. Multiple k-order α-shapes can be drawn on a map so that shapes that are deeper in the dataset (larger values of k) are drawn with more intensive colors. Example datasets for geospatial visualization in this thesis include motor vehicle collisions in New York and unplanned stops of public transportation vehicles in Helsinki. Another application presented in this thesis is noise reduction from seismic time series using k-order α-disks. For each time tick, two disks of radius α are put above and below the data points. The upper disk is pulled downwards and the lower disk upwards until they contain exactly k data points inside. The algorithm's output for each time tick is the average of the centres of these two α-disks. With a good choice of parameters α and k, the algorithm successfully restores a good estimate of the original noiseless time series.
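A naive membership test for the k-hull follows directly from the definition above: a query point q lies in the k-hull iff every closed halfplane containing q contains more than k points of P. The sketch below checks a finite grid of directions and is far less efficient than the algorithms in the thesis; the point sets are hypothetical.

```python
import numpy as np

def in_k_hull(q, P, k, n_dirs=360):
    """Approximate membership test for the k-hull of point set P.

    q is in the k-hull iff every closed halfplane containing q holds
    more than k points of P; we check a grid of halfplane normals.
    """
    angles = np.linspace(0, 2 * np.pi, n_dirs, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # For each direction, count points of P on q's side of the halfplane
    # whose boundary passes through q (the worst case for that direction).
    counts = (dirs @ P.T >= (dirs @ q)[:, None]).sum(axis=1)
    return counts.min() > k

rng = np.random.default_rng(3)
P = rng.normal(size=(200, 2))                        # hypothetical point cloud
center_deep = in_k_hull(np.zeros(2), P, k=10)        # deep point
outlier_in = in_k_hull(np.array([5.0, 5.0]), P, k=0) # far outside even the hull
```

Larger k shrinks the k-hull toward the "deepest" points, which is exactly the data-depth reading given above.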
  • Tapanainen, Niko (2018)
    The past decade has brought about two key changes to the pricing of interest rate products in the European fixed income markets. In 2007, the Euribor-EONIA spread widened, which called into question the use of Euribor rates as a proxy for the risk-free rate. Nine years later, all of the main Euribor rates had fallen below zero. These changes forced market practitioners to reassess their assumptions, which resulted in the development of new models. The Heath-Jarrow-Morton (HJM) and Brace-Gatarek-Musiela (BGM) frameworks, which had gained popularity before the crisis, have served as a foundation for these models. In this thesis, we present two applications of Malliavin calculus to the pricing of interest rate derivatives within a multicurve BGM framework. Although the framework simplifies the pricing of interest rate derivatives, as it takes market rates as primitives, the pricing of exotic interest rate derivatives can pose a number of problems. The complexity of these products can lead to situations that require the use of computational techniques such as Monte Carlo simulation. This, in turn, provides an opening for the application of Malliavin calculus. We end this thesis by presenting a Malliavin-based formula for the price and the delta-sensitivity of a snowball, and discuss the merits of these representations in the context of Monte Carlo simulation. With reference to advances within the field during the last 5 years, we discuss the possible developments within the field that might garner further interest towards Malliavin calculus in the near future.
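The flavour of a Malliavin-weight estimator is easiest to see in a toy setting rather than in the thesis's multicurve BGM model: for a European call under Black-Scholes, the delta can be written as an expectation of the (non-smooth) payoff times the weight W_T/(S0*sigma*T), so Monte Carlo never needs to differentiate the payoff. All parameters below are illustrative.

```python
import numpy as np

# Toy Black-Scholes setup (not the thesis's multicurve BGM model).
S0, K, r, sigma, T = 100.0, 100.0, 0.02, 0.2, 1.0
rng = np.random.default_rng(4)
n = 400_000

W = rng.normal(0.0, np.sqrt(T), size=n)             # Brownian motion at T
ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W)
payoff = np.maximum(ST - K, 0.0)

price = np.exp(-r * T) * payoff.mean()
# Malliavin (likelihood-ratio) delta: differentiation is moved off the
# non-smooth payoff onto the density via the weight W_T / (S0 sigma T).
delta = np.exp(-r * T) * (payoff * W / (S0 * sigma * T)).mean()
```

For path-dependent products such as the snowball, the weight is more involved, but the same principle of avoiding pathwise differentiation of an irregular payoff applies.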
  • Pulkkinen, Seppo (University of Helsinki, 2008)
  • Lindberg, Jannica (2013)
    An insurance company can take out reinsurance with another insurance company, a so-called reinsurer, to prevent the insured losses from exceeding the insurance company's capacity. In this way both the insurance company and the reinsurer are liable for the losses. The cover can be divided either proportionally or non-proportionally. In proportional reinsurance the reinsurer is liable for a certain share of the policy, whereas in non-proportional reinsurance the reinsurer is liable for the loss amounts that exceed a predetermined limit. The insurance company must pay the reinsurer a reinsurance premium. The retention is the largest sum insured or loss amount that an insurance company keeps for its own account, i.e. without reinsurance. We determine the optimal retention level for the insurer under a quota share and an excess of loss reinsurance. The adjustment coefficient R(a,M), a function of the retention limits a and M, is used to measure the efficiency of the reinsurance. We maximise this coefficient over different limits.
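The two treaty types can be illustrated in a few lines; the claim data and retention values below are hypothetical, and finding the optimal a and M via the coefficient R(a,M) is the actual subject of the thesis.

```python
import numpy as np

rng = np.random.default_rng(5)
claims = rng.exponential(scale=10.0, size=8)   # hypothetical claim amounts

# Quota share with retention fraction a: the insurer keeps the share a
# of every claim, the reinsurer covers the rest.
a = 0.6
retained_qs = a * claims
ceded_qs = (1 - a) * claims

# Excess of loss with retention M: the insurer pays each claim up to M,
# the reinsurer pays the part exceeding M.
M = 15.0
retained_xl = np.minimum(claims, M)
ceded_xl = np.maximum(claims - M, 0.0)
```

In both treaties the retained and ceded parts add back up to the full claim; they differ in how the tail risk is shared.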
  • Nousiainen, Jalo (2018)
    This thesis introduces mathematical problems related to adaptive optics and atmospheric tomography. Adaptive optics is a method that aims to reduce the phase distortions of light caused by atmospheric turbulence and thereby to improve the performance of large optical telescopes; it is a field that combines mathematics, physics and engineering. By atmospheric tomography we mean a particular mathematical imaging problem that arises in the adaptive optics systems of next-generation telescopes. We present a way to model atmospheric tomography mathematically, treat atmospheric turbulence by means of random functions, and present an algorithm that covers both the atmospheric tomography and the control of the optical system. We also study the behaviour of this algorithm in one of the most accurate simulation environments in the field. In particular, we tested the stability of the algorithm under conditions where the noise level in the data is high. The results show that the regularised solution methods derived in the thesis improve the performance of the system compared with a simpler non-regularised method.
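The benefit of regularisation reported above can be illustrated with a generic Tikhonov-regularised least-squares problem; the operator below is a random ill-conditioned stand-in, not the thesis's tomography operator.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical ill-conditioned forward operator standing in for the
# tomography operator; x_true plays the role of the unknown turbulence.
n = 40
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
A = U @ np.diag(np.logspace(0, -6, n)) @ V.T
x_true = np.sin(np.linspace(0, 3, n))
y = A @ x_true + 0.01 * rng.normal(size=n)        # noisy measurements

# Non-regularised inversion vs. Tikhonov regularisation:
# minimise ||Ax - y||^2 + lam ||x||^2  <=>  (A^T A + lam I) x = A^T y.
x_plain = np.linalg.solve(A, y)
lam = 1e-4
x_tik = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

err_plain = np.linalg.norm(x_plain - x_true)
err_tik = np.linalg.norm(x_tik - x_true)
```

The non-regularised solve amplifies the noise along the small singular directions, while the Tikhonov term damps them, which is the same qualitative effect the thesis observes at high noise levels.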
  • Huusari, Riikka (2016)
    This study is part of the TEKES-funded Electric Brain project of VTT and the University of Helsinki, whose goal is to develop novel techniques for automatic big data analysis. In this study we focus on potential methods for automated land cover type classification from time series satellite data. Developing techniques to identify different environments would be beneficial in monitoring the effects of natural phenomena, forest fires, the development of urbanization, or climate change. We tackle the resulting classification problem with two approaches: supervised and unsupervised machine learning. From the former category we use a technique called the support vector machine (SVM), while from the latter we consider the Gaussian mixture model clustering technique and its simpler variant, k-means. We introduce the techniques used in the study in chapter 1, where we also give the motivation for the work. A detailed discussion of the data available for this study and of the methods used for analysis is presented in chapter 2. In that chapter we also present the simulated data that was created as a proof of concept for the methods. The obtained results for both the simulated data and the satellite data are presented in chapter 3 and discussed in chapter 4, along with considerations for possible future work. The obtained results suggest that support vector machines could be suitable for the task of automated land cover type identification. While the clustering methods were not as successful, we were able to obtain accuracy as high as 93% with the data available for this study using the supervised implementation.
  • Havukainen, Joonas (2018)
    The thesis introduces the reader to Stein's method for normal approximation and uses it to prove the Berry-Esseen theorem. Stein's method is a modern and powerful technique in probability theory for producing upper bounds on the distance between two probability distributions. The thesis presents the three most commonly used metrics for the distance between probability distributions: the total variation, Kolmogorov and Wasserstein metrics. Stein's method is then developed, starting from Stein's lemma, which characterises the normal distribution by means of the Stein operator: when the operator takes the value zero, the distribution under consideration is normal. Next, the Stein equations are presented; from their solutions one obtains the Stein bounds for each of the three metrics. With these bounds, asymptotic normality can be deduced also when the value of the Stein operator is close to zero. The Berry-Esseen theorem is a central limit theorem augmented with a rate of convergence in the Kolmogorov distance, and this rate of convergence is proved in the thesis using Stein's method. Finally, Stein's method for multivariate distributions is discussed briefly; it turns out to be very similar to the one-dimensional case.
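Stein's characterisation can be probed numerically: for a standard normal W we have E[f'(W) - W f(W)] = 0 for suitable test functions f, and the expectation stays near zero for approximately normal variables. A Monte Carlo sketch, with tanh as one arbitrary bounded test function:

```python
import numpy as np

rng = np.random.default_rng(7)

def stein_discrepancy(w, f, fprime):
    """Monte Carlo estimate of E[f'(W) - W f(W)], the Stein operator.

    For a standard normal W this expectation is exactly zero.
    """
    return np.mean(fprime(w) - w * f(w))

f = np.tanh                            # a smooth, bounded test function
fp = lambda x: 1.0 / np.cosh(x) ** 2   # its derivative

# Standardized sum of i.i.d. uniforms: close to N(0,1) by the CLT.
n, m = 30, 200_000
u = rng.random((m, n))
w_clt = (u.sum(axis=1) - n / 2) / np.sqrt(n / 12)

# A clearly non-normal variable: a centred exponential.
w_exp = rng.exponential(size=m) - 1.0

d_clt = stein_discrepancy(w_clt, f, fp)
d_exp = stein_discrepancy(w_exp, f, fp)
```

A single test function only gives a lower bound on the distance; the bounds of the thesis control the supremum over a whole class of solutions of the Stein equation.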
  • Hyvärinen, Tommi (2015)
    My thesis treats the quasilinear partial differential equation known as Burgers' equation, and delves into the theory of partial differential equations more generally. The equation, named after the physicist Johannes Martinus Burgers, describes the propagation of disturbances in fluids, and it can also be applied, for instance, to the analysis of developing traffic jams. Burgers' equation is an example of a more general conservation law. In mathematical physics, conservation laws state that in an isolated system the total amounts of certain quantities remain unchanged under interactions. The best-known related result is Noether's theorem, according to which each conserved quantity corresponds to a symmetry property of the system. The Navier-Stokes equation of fluid dynamics, for example, is an inhomogeneous version of a conservation law. The thesis begins with the Hamilton-Jacobi equation and touches on the calculus of variations, the Legendre transform and the Euler-Lagrange equations. It is shown how a given partial differential equation can be identified with a characteristic system consisting of ordinary differential equations. The characteristic system is derived for a quasilinear partial differential equation and is given a geometric interpretation. At the end of the first chapter the Hamilton-Jacobi equation is solved with the Hopf-Lax formula. In the second and third chapters Burgers' equation is introduced and solved by means of the characteristic system. The solution obtained does not, however, hold everywhere: in cases where the characteristic curves meet (a 'shock curve'), solving Burgers' equation requires weakening the conditions imposed on the solution and defining an integral solution. The Rankine-Hugoniot condition is derived, and with it solutions can be found in the situation where the characteristic curves intersect.
I also introduce an entropy condition, with which the 'non-physical' solutions are eliminated, so that a unique and general solution of Burgers' equation is obtained. Finally, I prove the Lax-Oleinik formula, which gives a solution to a more general problem, and an entropy condition is tailored to this solution to make it unique.
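The shock formation discussed above can also be observed numerically. The sketch below applies the Lax-Friedrichs scheme (a standard monotone method, not part of the thesis's analytic treatment) to the inviscid Burgers equation with smooth initial data; a steep front develops while the spatial mean is conserved.

```python
import numpy as np

# Lax-Friedrichs scheme for the inviscid Burgers equation u_t + (u^2/2)_x = 0.
nx, L = 400, 2 * np.pi
dx = L / nx
x = np.arange(nx) * dx
u = np.sin(x) + 1.5            # smooth initial data, all wave speeds positive

t, T = 0.0, 1.5                # characteristics first cross at t = 1
while t < T:
    dt = 0.4 * dx / np.abs(u).max()        # CFL condition
    f = 0.5 * u ** 2
    # periodic Lax-Friedrichs update
    u = 0.5 * (np.roll(u, 1) + np.roll(u, -1)) \
        - dt / (2 * dx) * (np.roll(f, -1) - np.roll(f, 1))
    t += dt

# A steep front (numerical shock) has formed from smooth data.
steepest = np.abs(np.diff(u)).max() / dx
```

Because the scheme is conservative and monotone, it conserves the mean of u and picks out the entropy solution automatically, mirroring the role of the entropy condition in the analysis.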
  • Tuominen, Pekko (2016)
    Forecasting of solar power energy production would benefit from accurate sky condition predictions, since the presence of clouds is a primary variable affecting the amount of radiation reaching the ground. Unfortunately, the spatial and temporal resolution of the commonly used satellite images and numerical weather prediction models can be too small for local, intra-hour estimates. Instead, digital sky images taken from the ground are used as data in this thesis. The two main building blocks needed to make sky condition forecasts are reliable cloud segmentation and cloud movement detection. The cloud segmentation problem is solved using neural networks, a double exposure imaging scheme, automatic sun locationing and a novel method to study the circumsolar region directly without the use of a sun occluder. Two different methods are studied for motion detection: a block matching method using cross-correlation as the similarity measure, and the Lucas-Kanade method. The results chapter shows how neural networks overcome many of the situations labelled as difficult for other methods in the literature. Results by the two motion detection methods are also presented and analysed. The use of neural networks and the Lucas-Kanade method shows much promise for forming the cornerstone of local, intra-hour sky condition nowcasting and prediction.
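The block matching step can be sketched as an exhaustive search maximising a zero-mean cross-correlation score; the frames below are synthetic stand-ins for sky images, and the Lucas-Kanade method is not reproduced here.

```python
import numpy as np

def match_block(prev, curr, top, left, size=16, search=5):
    """Find the displacement of a block between two frames.

    Exhaustive search over +-search pixels; the similarity measure is a
    zero-mean cross-correlation between the block and each candidate.
    """
    block = prev[top:top + size, left:left + size]
    block = block - block.mean()
    best, best_score = (0, 0), -np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = curr[top + dy:top + dy + size, left + dx:left + dx + size]
            cand = cand - cand.mean()
            score = (block * cand).sum()
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

rng = np.random.default_rng(8)
frame1 = rng.random((64, 64))
frame2 = np.roll(frame1, shift=(2, 3), axis=(0, 1))   # "cloud" moved by (2, 3)
dy, dx = match_block(frame1, frame2, top=24, left=24)
```

Repeating the search over a grid of blocks yields a coarse motion field for the whole sky image.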
  • Knuutinen, Janne (2017)
    Copulas have become a common tool in finance. The aim of this thesis is to present the theory of copulas and its application to modelling financial risks. Copulas are defined and the central theory related to them is reviewed. The most important correlation concepts are presented, among them the statistics Kendall's tau and Spearman's rho. In addition, the copula families containing the most widely used copulas are defined. The parameters of copulas can be estimated with different methods; the three most important ones, based on the log-likelihood function, are covered, as is the Monte Carlo method for simulating variables from copulas. Tail dependence, a useful concept when modelling extreme phenomena, is introduced. Value at Risk (VaR) is one of the most important risk measures for investment risks. With a method based on the rearrangement algorithm, the worst and best values of VaR can be computed. The operation of the method is illustrated by rearranging a matrix with the algorithm so that an upper bound for the worst VaR is seen. The method is then applied to the rearrangement of a matrix composed of several years of daily losses of three stocks: Nokia, Samsung and Danske Bank. The upper bound for the worst VaR obtained in this way is larger than the value of the historical VaR, which was computed with another method. The practical application of the theory is continued by computing correlations between the stocks' losses. The correlation coefficient between the losses of Nokia and Deutsche Bank turns out to be the largest, and it is concluded that their dependence structure is best described by a t-copula.
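The core step of the rearrangement algorithm is simple to sketch: each column of the loss matrix is repeatedly sorted in opposite order to the sum of the remaining columns, which flattens the row sums (applied to a tail block of the matrix, this is what yields the worst-VaR bounds mentioned above). The loss matrix below is simulated, not the stock data of the thesis.

```python
import numpy as np

def rearrange(X, n_sweeps=50):
    """Oppositely order each column against the sum of the others.

    Each sweep weakly decreases the variance of the row sums; this is
    the basic building block of the rearrangement algorithm.
    """
    X = X.copy()
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            others = X.sum(axis=1) - X[:, j]
            ranks = np.argsort(np.argsort(-others))   # rank 0 = largest others
            X[:, j] = np.sort(X[:, j])[ranks]         # largest faces smallest
    return X

rng = np.random.default_rng(9)
losses = rng.lognormal(size=(50, 3))       # simulated daily loss matrix
Y = rearrange(losses)

var_before = losses.sum(axis=1).var()
var_after = Y.sum(axis=1).var()
```

Each column keeps exactly its original values (only their order within the column changes), so the marginal loss distributions are preserved while the dependence structure is pushed to the extreme case.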
  • Berg, Jeremias (2014)
    Clustering is one of the core problems of unsupervised machine learning. In a clustering problem we are given a set of data points and asked to partition them into smaller subgroups, known as clusters, such that each point is assigned to exactly one cluster. The quality of the obtained partitioning (clustering) is then evaluated according to some objective measure dependent on the specific clustering paradigm. A traditional approach within the machine learning community to solving clustering problems has focused on approximative, local search algorithms that in general cannot provide optimality guarantees for the clusterings produced. However, recent advances in the field of constraint optimization have allowed for an alternative view on clustering, and on many other data analysis problems. The alternative view is based on stating the problem at hand in some declarative language and then using generic solvers for that language in order to solve the problem optimally. This thesis contributes to this approach to clustering by providing a first study on the applicability of state-of-the-art Boolean optimization procedures to cost-optimal correlation clustering under constraints in a general similarity-based setting. The correlation clustering paradigm is geared towards classifying data based on qualitative, as opposed to quantitative, similarity information about pairs of data points. Furthermore, correlation clustering does not require the number of clusters as input. This makes it especially well suited to problem domains in which the true number of clusters is unknown. In this thesis we formulate correlation clustering within the language of propositional logic. As is often done within computational logic, we focus only on formulas in conjunctive normal form (CNF), a restriction that entails no loss of generality.
When encoded as a CNF formula, the correlation clustering problem becomes an instance of partial Maximum Satisfiability (MaxSAT), the optimization version of the Boolean satisfiability (SAT) problem. We present three different encodings of correlation clustering into CNF formulas and provide proofs of the correctness of each encoding. We also evaluate them experimentally by applying a state-of-the-art MaxSAT solver to the resulting MaxSAT instances. The experiments demonstrate both the scalability of our method and the quality of the clusterings obtained. As a more theoretical result we prove that the assumption of the input graph being undirected can be made without loss of generality, which justifies the applicability of our encodings to all variants of correlation clustering known to us. This thesis also addresses another clustering paradigm, namely constrained correlation clustering. In constrained correlation clustering, additional constraints are used to restrict the acceptable solutions to the correlation clustering problem, for example according to some domain-specific knowledge provided by an expert. We demonstrate how our MaxSAT-based approach to correlation clustering naturally extends to constrained correlation clustering. Furthermore, we show experimentally that added user knowledge allows clustering larger datasets, decreases the running time of our approach, and quickly steers the obtained clusterings towards a predefined ground-truth clustering.
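The objective being optimised can be made concrete with a brute-force miniature (the thesis, of course, uses MaxSAT encodings and a solver instead of enumeration). W below is a hypothetical signed similarity matrix; note that the number of clusters is not fixed in advance, exactly as described above.

```python
import numpy as np
from itertools import product

def cc_cost(labels, W):
    """Correlation clustering cost for a signed similarity matrix W.

    W[i][j] > 0 means i and j are similar (pay W[i][j] if separated);
    W[i][j] < 0 means dissimilar (pay -W[i][j] if placed together).
    """
    n = len(labels)
    cost = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            if W[i][j] > 0 and not same:
                cost += W[i][j]
            elif W[i][j] < 0 and same:
                cost += -W[i][j]
    return cost

# Tiny toy instance: two similar pairs, all cross-pairs dissimilar.
W = np.array([[ 0,  2, -1, -1],
              [ 2,  0, -1, -1],
              [-1, -1,  0,  2],
              [-1, -1,  2,  0]])

# Brute force over all label vectors for 4 points.
best = min(product(range(4), repeat=4), key=lambda lab: cc_cost(lab, W))
```

A MaxSAT encoding expresses the same cost as soft clauses over "i and j are in the same cluster" variables, letting an exact solver scale far beyond what enumeration can handle.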
  • Nieminen, Arttu (2017)
    Differential privacy is a mathematically defined concept of data privacy based on the idea that a person should not face any additional harm by opting to give their data to a data collector. Data release mechanisms that satisfy the definition are said to be differentially private, and they guarantee the privacy of the data on a specified privacy level by utilising carefully designed randomness that sufficiently masks the participation of each individual in the data set. The introduced randomness decreases the accuracy of the data analysis, but this effect can be diminished by clever algorithmic design. The robust private linear regression algorithm is a differentially private mechanism originally introduced by A. Honkela, M. Das, O. Dikmen, and S. Kaski in 2016. The algorithm is based on projecting the studied data inside known bounds and applying the differentially private Laplace mechanism to perturb the sufficient statistics of the Bayesian linear regression model, which is then fitted to the data using the privatised statistics. In this thesis, the idea, definitions and the most important theorems and properties of differential privacy are presented and discussed. The robust private linear regression algorithm is then presented in detail, including improvements that are related to determining and handling the parameters of the mechanism and were developed during my work as a research assistant in the Probabilistic Inference and Computational Biology research group (Department of Computer Science at the University of Helsinki and Helsinki Institute for Information Technology) in 2016-2017. The performance of the algorithm is evaluated experimentally on both synthetic and real-life data. The latter data are from the Genomics of Drug Sensitivity in Cancer (GDSC) project and consist of the gene expression data of 985 cancer cell lines and their responses to 265 different anti-cancer drugs.
The studied algorithm is applied to the GDSC data with the goal of predicting which cancer cell lines are sensitive to each drug and which are not. The application of a differentially private mechanism to the gene expression data is justifiable because genomic data are identifying and carry highly sensitive information about, for example, an individual's phenotype, health, and risk of various diseases. The results presented in the thesis show that the studied algorithm works as intended and is able to benefit from having more data: in terms of prediction accuracy, it approaches the non-private version of the same algorithm as the size of the available data set increases. It also reaches considerably better accuracy than the three algorithms it is compared with, which are based on different differentially private mechanisms: private linear regression with no projection, output-perturbed linear regression, and functional mechanism linear regression.
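The heart of the studied approach, perturbing clipped sufficient statistics with Laplace noise and fitting on the privatised statistics, can be sketched as follows. The sensitivity bound and the data below are simplified illustrations, not the thesis's exact calibration.

```python
import numpy as np

rng = np.random.default_rng(10)

# Toy data, projected (clipped) into known bounds so that the sensitivity
# of the sufficient statistics is bounded, as in the studied approach.
n, d = 2000, 3
X = np.clip(rng.normal(size=(n, d)), -1, 1)
beta_true = np.array([1.0, -0.5, 0.25])
y = np.clip(X @ beta_true + 0.1 * rng.normal(size=n), -1, 1)

epsilon = 1.0
# Sufficient statistics of linear regression.
XX = X.T @ X
Xy = X.T @ y
# With all entries in [-1, 1], one individual changes each entry of the
# statistics by at most 1; a crude L1 sensitivity bound for illustration.
sens = d * (d + 1) / 2 + d
scale = sens / epsilon

# Laplace mechanism: perturb the statistics, keeping X^T X symmetric.
noise = rng.laplace(scale=scale, size=(d, d))
XX_priv = XX + np.triu(noise) + np.triu(noise, 1).T
Xy_priv = Xy + rng.laplace(scale=scale, size=d)

# Fit using only the privatised statistics (small jitter for stability).
beta_dp = np.linalg.solve(XX_priv + 1e-3 * np.eye(d), Xy_priv)
```

Because the noise scale is fixed while the statistics grow with n, the private estimate approaches the non-private one as the data set grows, which is the behaviour reported in the experiments above.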
  • Lybeck, Lasse (2015)
    Speech is the most common form of human communication. An understanding of the speech production mechanism and the perception of speech is therefore an important topic when studying human communication. This understanding is also of great importance both in medical treatment regarding a patient's voice and in human-computer interaction via speech. In this thesis we will present a model for digital speech called the source-filter model. In this model speech is represented with two independent components, the glottal excitation signal and the vocal tract filter. The glottal excitation signal models the airflow created at the vocal folds, which works as the source for the created speech sound. The vocal tract filter describes how the airflow is filtered as it travels through the vocal tract, creating the sound radiated to the surrounding space from the lips, which we recognize as speech. We will also present two different parametrized models for the glottal excitation signal, the Rosenberg-Klatt model (RK-model) and the Liljencrants-Fant model (LF-model). The RK-model is quite simple, being parametrized with only one parameter in addition to the fundamental frequency of the signal, while the LF-model is more complex, taking in four parameters to define the shape of the signal. A transfer function for vocal tract filter is also derived from a simplified model of the vocal tract. Additionally, relevant parts of the theory of signal processing are presented before the presentation of the source-filter model. A relatively new model for glottal inverse filtering (GIF), called the Markov chain Monte Carlo method for glottal inverse filtering (MCMC-GIF) is also presented in this thesis. Glottal inverse filtering is a technique for estimating the glottal excitation signal from a recorded speech sample. It is a widely used technique for example in phoniatrics, when inspecting the condition of a patient's vocal folds. 
In practice the aim is to separate the measured signal into the glottal excitation signal and the vocal tract filter. The first method for solving glottal inverse filtering was proposed in the 1950s and since then many different methods have been proposed, but so far none of them has been able to yield robust estimates for the glottal excitation signal from recordings with a high fundamental frequency, such as women's and children's voices. Recently, using synthetic vowels, MCMC-GIF has been shown to produce better estimates for these kinds of signals than other state-of-the-art methods. The MCMC-GIF method requires an initial estimate for the vocal tract filter. This is obtained from the measurements with the iterative adaptive inverse filtering (IAIF) method. A synthetic vowel is then created with the RK-model and the vocal tract filter, and compared to the measurements. The MCMC method is then used to adjust the RK excitation parameter and the parameters of the vocal tract filter to minimize the error between the synthetic vowel and the measurements, and ultimately to obtain a new estimate for the vocal tract filter. The filter can then be used to calculate the glottal excitation signal from the measurements. We will explain this process in detail, and give numerical examples of the results of the MCMC-GIF method compared against the IAIF method.
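A bare-bones version of the source-filter model is easy to synthesise: an impulse train stands in for the glottal excitation (far cruder than the RK or LF pulses above), and an all-pole filter with two resonances stands in for the vocal tract. The formant frequencies and bandwidths below are hypothetical values for an open vowel.

```python
import numpy as np
from scipy.signal import lfilter

fs = 16_000                     # sampling rate (Hz)
f0 = 120                        # fundamental frequency of the glottal source
n = fs // 2                     # half a second of "speech"

# Crude glottal excitation: an impulse train at the fundamental frequency.
excitation = np.zeros(n)
excitation[::fs // f0] = 1.0

# Vocal tract as an all-pole filter with resonances (formants) at roughly
# 700 Hz and 1200 Hz; each pole pair is placed by frequency and bandwidth.
poles = []
for freq, bw in [(700, 100), (1200, 120)]:
    r = np.exp(-np.pi * bw / fs)
    poles += [r * np.exp(2j * np.pi * freq / fs),
              r * np.exp(-2j * np.pi * freq / fs)]
a = np.poly(poles).real          # denominator of the all-pole transfer function
speech = lfilter([1.0], a, excitation)
```

Glottal inverse filtering runs this model backwards: given only `speech`, estimate `a` and recover the excitation by filtering with the inverse of the vocal tract filter.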
  • Linnoinen, Krista (2013)
    Mathematics teaching has been an active field of research and development at the Department of Mathematics and Systems Analysis at Aalto University. This research has been motivated by a desire to increase the number of students that pass compulsory basic mathematics courses without compromising on standards. The courses aim to provide the engineering students with the mathematical skills needed in their degree programmes, so it is essential that a proper foundation is laid. Since 2006, a web-based automated assessment system called STACK has been used on the basic mathematics courses at Aalto University for supplementary exercises to aid learning. In this thesis, computer-aided mathematics teaching and, in particular, automated assessment are studied to investigate what effect attempting to solve online exercises has on mathematical proficiency. This is done using a Granger causality test. For this, the first two of the three basic courses are examined. The concepts relating to learning and computer-aided mathematics teaching, as well as the developments made at Aalto University, including Mumie, are first presented. Then, the statistical methodology, the theoretical framework and the test procedure for Granger causality are described. The courses, and the data that were collected from STACK and used to quantify mathematical proficiency for the Granger causality test, are then reviewed. Finally, the results and implications are presented. The Granger causality tests show that there exists a Granger-causal relationship such that mathematical proficiency affects the desire to attempt to solve exercises. This holds for both of the interpretations used for quantifying mathematical proficiency and for all variations of the penalty deducted for incorrect attempts. The results imply that the exercises are too difficult for the students and that students tend to give up quickly.
Thus, the Granger causality tests produced statistically significant results to back up what teachers have always known: students are discouraged by failure, but encouraged by success. The results provide teachers with valuable information about the students' abilities and enable teachers to alter the teaching accordingly to better support the students' learning.
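The test described above can be sketched as a plain OLS F-test: does adding lagged values of one series improve the prediction of another beyond its own lags? This is a minimal one-lag sketch on synthetic data standing in for the STACK exercise logs; it is not the software or data used in the thesis.

```python
import numpy as np

def granger_f(x, y, p=1):
    """F-statistic for 'x Granger-causes y' with p lags: compare y regressed
    on its own lags (restricted) vs. own lags plus lags of x (unrestricted)."""
    n = len(y) - p
    Y = y[p:]
    ylags = np.column_stack([y[p - k:-k or None] for k in range(1, p + 1)])
    xlags = np.column_stack([x[p - k:-k or None] for k in range(1, p + 1)])
    ones = np.ones((n, 1))
    Xr = np.hstack([ones, ylags])            # restricted design matrix
    Xu = np.hstack([ones, ylags, xlags])     # unrestricted design matrix
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(Xr), rss(Xu)
    return ((rss_r - rss_u) / p) / (rss_u / (n - Xu.shape[1]))

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                       # driving series
y = np.zeros(n)
for t in range(1, n):                        # y is driven by lagged x
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

f_xy = granger_f(x, y)   # large: lagged x improves prediction of y
f_yx = granger_f(y, x)   # small: lagged y does not improve prediction of x
```

A large F-statistic relative to the F(p, n - k) critical value indicates a Granger-causal relationship in that direction; in the thesis the roles of the two series are played by measures of proficiency and of exercise-attempt activity.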
  • Kainulainen, Henna (2015)
    In this thesis we consider dynamic X-ray computed tomography (CT) in the two-dimensional case. In X-ray CT we take X-ray projection images from many different directions and compute a reconstruction from those measurements. Sometimes the change over time in the imaged object needs to be taken into account, for example in cardiac imaging or in angiography. This is why we look at the dynamic case, in which the imaged object changes while the measurements are being taken. At the beginning of the thesis, in chapter 2, we present some necessary theory on the subject. We first go through some general theory about inverse problems and then concentrate on X-ray CT. We discuss the ill-posedness of inverse problems, regularization, and the measurement process in CT. Different measurement settings and the discretization of the continuous case are introduced. In chapter 3 we introduce a solution method for the problem: total variation regularization with the Barzilai-Borwein minimization method. The Barzilai-Borwein method is iterative and well suited for large-scale problems. We also explain two different methods for choosing the regularization parameter needed in the minimization process: the multi-resolution parameter choice method and the S-curve method. Chapter 4 presents the materials used in the thesis. We have both simulated and real measured data. The simulated data were created using rendering software, and for the real data we took X-ray projection images of a Lego robot. The results of the tests done on the data are shown in chapter 5. We ran tests on both the simulated and the measured data with two different measurement settings: first assuming we have 9 fixed source-detector pairs, and then assuming we have only one source-detector pair. For the case with only one pair, we tested the implemented regularization method by first considering the change in the imaged object to be periodic. Then we assume we can use only a certain number of consecutive moments, determined by the rate at which the object is changing, to collect the data. Here we get only one X-ray projection image at each moment, and we combine measurements from multiple different moments. In the last chapter, chapter 6, we discuss the results. We noticed that the regularization method is quite slow, at least partly because of the functions used in the implementation. The results obtained were quite good, especially for the simulated data. The simulated data had fewer details than the measured data, so it makes sense that we got better results with it. Already with only four angles we could see some details in the simulated data, and for the measured data details were also visible with 8 and 16 angles.
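The minimization scheme named in the abstract can be illustrated on a toy one-dimensional deblurring problem. This is a minimal sketch, not the thesis's reconstruction code: it assumes a smoothed total variation term (smoothing parameter `beta`, so the gradient is defined everywhere), a Gaussian blur operator in place of the CT projection matrix, and made-up parameter values throughout.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
# piecewise-constant ground truth and a simple Gaussian blur matrix A
x_true = np.where(np.arange(n) < n // 2, 1.0, 0.0)
A = np.array([[np.exp(-0.5 * ((i - j) / 2.0) ** 2) for j in range(n)]
              for i in range(n)])
A /= A.sum(axis=1, keepdims=True)
b = A @ x_true + 0.01 * rng.normal(size=n)

alpha, beta = 1e-3, 1e-6                     # TV weight, TV smoothing
D = np.eye(n, k=1)[:-1] - np.eye(n)[:-1]     # forward-difference operator

def objective(x):
    return np.sum((A @ x - b) ** 2) + alpha * np.sum(np.sqrt((D @ x) ** 2 + beta))

def gradient(x):
    tv_grad = D.T @ ((D @ x) / np.sqrt((D @ x) ** 2 + beta))
    return 2 * A.T @ (A @ x - b) + alpha * tv_grad

x = np.zeros(n)
g = gradient(x)
step = 1e-2                                  # initial step before BB is available
for _ in range(200):
    x_new = x - step * g
    g_new = gradient(x_new)
    s, r = x_new - x, g_new - g
    # Barzilai-Borwein (BB1) step length (s.s)/(s.r), lightly safeguarded
    step = float(np.clip((s @ s) / (s @ r), 1e-4, 10.0)) if s @ r > 1e-12 else 1e-2
    x, g = x_new, g_new
```

The appeal of the Barzilai-Borwein step is that it approximates second-order curvature information from two gradient evaluations per iteration, which is what makes the method attractive for large-scale problems such as CT reconstruction.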
  • Angervuori, Ilari (2018)
    The purpose of the work is to give an elementary introduction to the Finite Element Method. We give an abstract mathematical formalization of the Finite Element Problem and show why the method is suitable for approximating solutions of Partial Differential Equations. In Chapter 1 we give concrete example code for a Finite Element Method implementation of a relatively simple problem in Matlab. In Chapter 2 we give an abstract formulation of the problem. We introduce the necessary concepts in Functional Analysis. When the Finite Element Method is interpreted in a suitable fashion, we can apply results of Functional Analysis to examine the properties of the solutions. We introduce two equivalent weak formulations of differential equations: Galerkin's formulation and the minimization problem. In addition we define the necessary concepts regarding certain function spaces. For example, we define one crucial complete inner product space, namely the Sobolev space. In Chapter 3 we define the building blocks of the element space: the meshing and the elements. An element consists of its geometric shape, basis functions, and functionals on the basis functions. We also introduce the concept of interpolation and construct basis functions in one, two and three dimensions. In Chapter 4 we introduce implementation techniques in a rather broad sense. We introduce the crucial concepts of the stiffness matrix and the load vector, and present a procedure for implementing Poisson's equation and Helmholtz's equation. We introduce one way of doing numerical integration, using Gaussian quadrature points and weights. We define the reference element and the mathematical concepts relating to it. The reference element and Gaussian quadrature points are widely used techniques when implementing the Finite Element Method on a computer. In Chapter 5 we give a rigorous analysis of the convergence properties of the Finite Element Method solution. We show that an arbitrary function in a Sobolev space can be approximated arbitrarily closely by a certain polynomial, namely the Sobolev polynomial. The accuracy of the approximation depends on the size of the mesh and the degree of the polynomial. Polynomial approximation theory in Sobolev spaces has a connection to the Finite Element Method's approximation properties through Céa's lemma. In Chapter 6 we give some examples of a posteriori convergence properties. We compare the Finite Element Method solution computed with a computer to the exact solution. Interesting convergence properties are found using linear and cubic basis functions. The results seem to verify the properties derived in Chapter 5.
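The pipeline the abstract walks through (mesh, stiffness matrix, load vector, solve, compare with the exact solution) can be illustrated on the one-dimensional Poisson problem -u'' = f on (0,1) with u(0) = u(1) = 0. This is a minimal sketch with linear "hat" basis functions and a trapezoid-rule load quadrature, written in Python rather than the Matlab of the thesis's Chapter 1.

```python
import numpy as np

n = 50                      # number of elements
h = 1.0 / n                 # uniform mesh width
nodes = np.linspace(0.0, 1.0, n + 1)

f = lambda x: np.pi**2 * np.sin(np.pi * x)   # source term
u_exact = lambda x: np.sin(np.pi * x)        # exact solution of -u'' = f

# Assemble the global stiffness matrix and load vector element by element:
# each element contributes the local stiffness matrix (1/h) * [[1, -1], [-1, 1]],
# and the load integral over an element is approximated by the trapezoid rule.
K = np.zeros((n + 1, n + 1))
F = np.zeros(n + 1)
for e in range(n):
    K[e:e + 2, e:e + 2] += (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    F[e:e + 2] += 0.5 * h * f(nodes[e:e + 2])

# Impose the Dirichlet conditions u(0) = u(1) = 0 by solving on interior nodes.
u = np.zeros(n + 1)
u[1:-1] = np.linalg.solve(K[1:-1, 1:-1], F[1:-1])

err = np.max(np.abs(u - u_exact(nodes)))     # nodal error vs. the exact solution
```

Halving `h` should roughly quarter `err`, the second-order convergence that Céa's lemma combined with the polynomial approximation results of Chapter 5 predicts for linear elements.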