
Browsing by master's degree program "Magisterprogrammet i matematik och statistik"


  • Huggins, Robert (2023)
    In this thesis, we develop a Bayesian approach to the inverse problem of inferring the shape of an asteroid from time-series measurements of its brightness. We define a probabilistic model over possibly non-convex asteroid shapes, choosing parameters carefully to avoid potential identifiability issues. Applying this probabilistic model to synthetic observations and sampling from the posterior via Markov Chain Monte Carlo, we show that the model is able to recover the asteroid shape well in the limit of many well-separated observations, and is able to capture posterior uncertainty in the case of limited observations. We greatly accelerate the computation of the forward problem (predicting the measured light curve given the asteroid’s shape parameters) by using a bounding volume hierarchy and by exploiting data parallelism on a graphics processing unit.
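The posterior-sampling step can be illustrated with a minimal random-walk Metropolis sketch; the one-dimensional "shape" parameter, the synthetic data and the Gaussian prior below are illustrative assumptions, not the thesis' actual asteroid model:

```python
import math
import random

def metropolis(log_post, x0, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis sampler (a toy stand-in for the thesis' MCMC)."""
    rng = random.Random(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        prop = x + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        if math.log(rng.random()) < lp_prop - lp:   # accept/reject step
            x, lp = prop, lp_prop
        samples.append(x)
    return samples

# Hypothetical 1-D example: a "shape" parameter theta with brightness
# data y ~ N(theta, 1) and prior theta ~ N(0, 10^2).
data = [2.1, 1.9, 2.3, 2.0]

def log_post(theta):
    return -0.5 * sum((y - theta) ** 2 for y in data) - theta ** 2 / 200.0

samples = metropolis(log_post, 0.0, 5000)
posterior_mean = sum(samples[1000:]) / 4000    # discard burn-in
```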
  • Siurua, Joel (2023)
    Contacts between individuals play a central part in infectious disease modelling. Social or physical contacts are often determined through surveys. These types of contacts may not accurately represent the truly infectious contacts due to demographic differences in susceptibility and infectivity. In addition, surveyed data is prone to statistical biases and errors. For these reasons, a transmission model based on surveyed contact data may make predictions that are in conflict with real-life observations. The surveyed contact structure must be adjusted to improve the model and produce reliable predictions. The adjustment can be done in multiple different ways. We present five adjustment methods and study how the choice of method impacts a model’s predictions about vaccine effectiveness. The population is stratified into n groups. All five adjustment methods transform the surveyed contact matrix such that its normalised leading eigenvector (the model-predicted stable distribution of infections) matches the observed distribution of infections. The eigenvector method directly adjusts the leading eigenvector. It changes contacts antisymmetrically: if contacts from group i to group j increase, then contacts from j to i decrease, and vice versa. The susceptibility method adjusts the group-specific susceptibility of individuals. The changes in the contact matrix occur row-wise. Analogously, the infectivity method adjusts the group-specific infectivity; changes occur column-wise. The symmetric method adjusts susceptibility and infectivity in equal measure. It changes contacts symmetrically with respect to the main diagonal of the contact matrix. The parametrised weighting method uses a parameter 0 ≤ p ≤ 1 to weight the adjustment between susceptibility and infectivity. It is a generalisation of the susceptibility, infectivity and symmetric methods, which correspond to p = 0, p = 1 and p = 0.5, respectively. 
For demonstrative purposes, the adjustment methods were applied to a surveyed contact matrix and infection data from the COVID-19 epidemic in Finland. To measure the impact of the method on vaccination effectiveness predictions, the relative reduction of the basic reproduction number was computed for each method using Finnish COVID-19 vaccination data. We found that the eigenvector method has no impact on the relative reduction (compared to the unadjusted baseline case). As for the other methods, the predicted effectiveness of vaccination increased the more infectivity was weighted in the adjustment (that is, the larger the value of the parameter p). In conclusion, our study shows that the choice of adjustment method has an impact on model predictions, namely those about vaccination effectiveness. Thus, the choice should be considered when building infectious disease models. The susceptibility and symmetric methods seem the most natural choices in terms of contact structure. Choosing the "optimal" method is a potential topic to explore in future research.
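The susceptibility method admits a simple closed form: scaling row i of the contact matrix by s_i = π_i/(Cπ)_i makes the observed infection distribution π the leading eigenvector of the adjusted matrix. A small sketch with hypothetical numbers for two groups:

```python
def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def leading_eigvec(M, iters=200):
    """Power iteration; returns the sum-normalised leading eigenvector."""
    v = [1.0] * len(M)
    for _ in range(iters):
        w = mat_vec(M, v)
        s = sum(w)
        v = [x / s for x in w]
    return v

# Surveyed contact matrix and observed infection distribution over two
# groups (hypothetical numbers).
C = [[3.0, 1.0],
     [1.0, 2.0]]
pi = [0.7, 0.3]

# Susceptibility method: scale row i by s_i = pi_i / (C pi)_i, which makes
# pi an eigenvector of diag(s) C with eigenvalue 1 -- and the Perron
# (leading) eigenvector, since the adjusted matrix stays positive.
Cpi = mat_vec(C, pi)
s = [p / cp for p, cp in zip(pi, Cpi)]
C_adj = [[s[i] * C[i][j] for j in range(len(C))] for i in range(len(C))]

v = leading_eigvec(C_adj)
```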
  • Parviainen, Katariina (2021)
This thesis studies proper holomorphic mappings of the space $\cc^n$. Their definition rests on the proper mapping theorem and on the holomorphy of mappings. Let $\Omega,D\subset\cc^n$, and let $n>1$. A mapping $F:\Omega\to D$ is proper if $F^{-1}(K)$ is a compact subset of $\Omega$ for every compact set $K\subset D$. Holomorphy means complex analyticity and complex differentiability of the mapping, and that the mapping satisfies the Cauchy-Riemann equations. A function $f$ is holomorphic in an open set $\Omega$ of $\cc^n$ if $f:\Omega\to\cc$, $f\in C^1(\Omega)$, and it satisfies the Cauchy-Riemann equations $\overline{\partial}_jf=\frac{\partial f}{\partial\overline{z_j}}=0$ for every $j=1,\ldots,n$. A mapping $F=(f_1,\ldots,f_m):\Omega\to\cc^m$ is holomorphic in $\Omega$ if the functions $f_k$ are holomorphic for every $k=1,\ldots,m$. If $\Omega$ and $D$ are domains in complex space and $F:\Omega\to D$ is a proper holomorphic mapping, then $F^{-1}(y_0)$ is a compact analytic subvariety of $\Omega$ for every point $y_0\in D$. A proper mapping can also be characterized as follows: $F:\Omega\to D$ is proper if and only if $F$ maps the boundary $\partial\Omega$ to the boundary $\partial D$ in the following sense: \[\text{if}\,\{z_j\}\subset\Omega\quad\text{is a sequence with}\,\lim_{j\to\infty}d(z_j,\partial\Omega)=0,\,\text{then}\,\lim_{j\to\infty}d(F(z_j),\partial D)=0.\] By this characterization, the study of mappings $F:\Omega\to D$ leads to the geometric function theory of mappings taking $\partial\Omega$ to $\partial D$. It turns out that proper holomorphic mappings extend continuously to the boundaries of their domains of definition. The study of holomorphic mappings is also connected with the solution of Dirichlet problems. In the classical Dirichlet problem, one is given a continuous function $f$ on $\partial\Omega\subset\mathbf{R}^m$ and seeks a real-valued function that is harmonic in $\Omega$, continuous on the closure $\overline{\Omega}$, and whose restriction to the boundary $\partial\Omega$ is the given function $f$. The thesis goes through the definitions and concepts on which proper holomorphic mappings are built, and explains the mathematical structure behind these concepts. The thesis proves the following properties of a proper holomorphic mapping $F:\Omega\to\Omega'$: $F$ is a closed mapping; $F$ is an open mapping; $F^{-1}(w)$ is finite for every $w\in\Omega'$; there is an integer $m$ such that the set $F^{-1}(w)$ has exactly $m$ points for every regular value of $F$, and fewer than $m$ points for every critical value of $F$; the critical set of $F$ is a null variety of $\Omega'$; $F(V)$ is a subvariety of $\Omega'$ whenever $V$ is a subvariety of $\Omega$; $F$ extends to a continuous mapping up to the boundary when its domains of definition are strictly pseudoconvex; $F$ maps a sequence in its strictly pseudoconvex source domain that converges nontangentially to the boundary to a sequence that converges admissibly to the boundary of the target domain; and a proper holomorphic mapping of the unit ball of $\cc^n$ to itself is an automorphism.
  • Nenonen, Veera (2022)
Social benefits have undergone many changes over the years, and the laws governing them are continually being developed. Significant measures have also been directed at basic social assistance, the last-resort form of financial support provided by the state, and these have affected the lives of many Finns. Of these measures, the transfer of basic social assistance to the responsibility of the Social Insurance Institution of Finland (Kela) in particular has demanded considerable adaptation from both those processing and those applying for the benefit. This may have provoked strong opinions, and discussion forums are a fertile platform for expressing them. Finland's largest discussion forum, Suomi24, contains many threads on society and politics, and mapping their content around topics of interest can, with the right methods, yield interesting and useful information. This thesis uses natural language processing methods, specifically topic modelling, to investigate whether the amendment to the social assistance act that came into force in 2017 is in some way visible in the discussions about social assistance on the Suomi24 forum. The study is carried out by illustrating the selected data with various visualisations and by applying the LDA algorithm, with the aim of detecting the central topics of the discussions and the concepts related to them. If the amendment to the act has sparked discussion, this could show up in the topics and in how the use of the words they contain is distributed between the periods before and after the change. The delimitation and extraction of the data from the database, as well as its preprocessing for topic modelling, also make up a significant part of the study. The data is analysed twice in total, as the first round reveals shortcomings in the preprocessing and in the fitting of the model. Iteration is not unusual in studies of this kind, since it is often only when interpreting the results that issues emerge which should have been taken into account at earlier stages. On the second round, some interesting observations emerged from the contents of the topics, but on their basis it is difficult to conclude whether the amendment to the social assistance act is visible in the messages on the discussion platform.
  • Tanskanen, Tomas (2022)
    Colorectal cancer (CRC) accounts for one in 10 new cancer cases worldwide. CRC risk is determined by a complex interplay of constitutional, behavioral, and environmental factors. Patients with ulcerative colitis (UC) are at increased risk of CRC, but effect estimates are heterogeneous, and many studies are limited by small numbers of events. Furthermore, it has been challenging to distinguish the effects of age at UC diagnosis and duration of UC. Multistate models provide a useful statistical framework for analyses of cancers and premalignant conditions. This thesis has three aims: to review the mathematical and statistical background of multistate models; to study maximum likelihood estimation in the illness-death model with piecewise constant hazards; and to apply the illness-death model to UC and CRC in a population-based cohort study in Finland in 2000–2017, considering UC as a premalignant state that may precede CRC. A likelihood function is derived for multistate models under noninformative censoring. The multistate process is considered as a multivariate counting process, and product integration is reviewed. The likelihood is constructed by partitioning the study time into subintervals and finding the limit as the number of subintervals tends to infinity. Two special cases of the illness-death model with piecewise constant hazards are studied: a simple Markov model and a non-Markov model with multiple time scales. In the latter case, the likelihood is factorized into terms proportional to Poisson likelihoods, which permits estimation with standard software for generalized linear models. The illness-death model was applied to study the relationship between UC and CRC in a population-based sample of 2.5 million individuals in Finland in 2000–2017. Dates of UC and CRC diagnoses were obtained from the Finnish Care Register for Health Care and the Finnish Cancer Registry, respectively. Individuals with prevalent CRC were excluded from the study cohort. 
Individuals in the study cohort were followed from January 1, 2000, to the date of first CRC diagnosis, death from other cause, emigration, or December 31, 2017, whichever came first. A total of 23,533 incident CRCs were diagnosed during 41 million person-years of follow-up. In addition to 8,630 patients with prevalent UC, there were 19,435 cases of incident UC. Of the 23,533 incident CRCs, 298 (1.3%) were diagnosed in patients with pre-existing UC. In the first year after UC diagnosis, the hazard ratio (HR) for incident CRC was 4.67 (95% CI: 3.07, 7.09) in females and 7.62 (95% CI: 5.65, 10.3) in males. In patients with UC diagnosed 1–3 or 4–9 years earlier, CRC incidence did not differ from persons without UC. When 10–19 years had passed from UC diagnosis, the HR for incident CRC was 1.63 (95% CI: 1.19, 2.24) in females and 1.29 (95% CI: 0.96, 1.75) in males, and after 20 years, the HR was 1.61 (95% CI: 1.13, 2.31) in females and 1.74 (95% CI: 1.31, 2.31) in males. Early-onset UC (age <40 years) was associated with a markedly increased long-term risk of CRC. The HR for CRC in early-onset UC was 4.13 (95% CI: 2.28, 7.47) 4–9 years from UC diagnosis, 4.88 (95% CI: 3.46, 6.88) 10–19 years from diagnosis, and 2.63 (95% CI: 2.01, 3.43) after 20 years. In this large population-based cohort study, we estimated CRC risk in persons with and without UC in Finland in 2000–2017, considering both the duration of UC and age at UC diagnosis. Patients with early-onset UC are at increased risk of CRC, but the risk is likely to depend on disease duration, extent of disease, attained age, and other risk factors. Increased CRC risk in the first year after UC diagnosis may be in part due to detection bias, whereas chronic inflammation may underlie the long-term excess risk of CRC in patients with UC.
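Under the piecewise-constant-hazard factorization, each rate is simply events divided by person-years, and a hazard ratio with a Wald interval follows directly. A minimal sketch with hypothetical counts (not the registry data):

```python
import math

def hazard_ratio(ev1, py1, ev0, py0):
    """Hazard ratio of group 1 vs group 0 under piecewise-constant hazards
    (each rate is events / person-years), with a Wald 95% CI computed on
    the log scale."""
    hr = (ev1 / py1) / (ev0 / py0)
    se = math.sqrt(1 / ev1 + 1 / ev0)        # SE of log HR for Poisson counts
    lo = math.exp(math.log(hr) - 1.96 * se)
    hi = math.exp(math.log(hr) + 1.96 * se)
    return hr, lo, hi

# Hypothetical counts: 40 CRC cases in 10,000 person-years with UC versus
# 2,000 cases in 2,000,000 person-years without UC.
hr, lo, hi = hazard_ratio(40, 10_000, 2_000, 2_000_000)
```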
  • Saarinen, Tapio (2019)
The purpose of this thesis is to lead the reader to the definitions and theory of the Ext functor and group cohomology, and thereby to introduce the reader to central concepts of homological algebra. The first chapter presents the background knowledge the thesis assumes, beyond the contents of basic courses in algebra and algebraic topology. The second chapter introduces the group extension problem and solves it in the case where the given subgroup is abelian. Group extensions are shown to be in one-to-one correspondence with the elements of a certain group, and particular attention is paid to those extensions that are semidirect products of the given groups. The formulas that arise are observed to correspond to certain formulas appearing in the definition of the singular cochain complex. The third chapter defines the bar resolution and the normalized bar resolution, and on their basis the cohomology of groups. First, as a technical aside, the notion of a G-module is defined, which allows group actions to be treated like modules. The central result of the chapter is that the bar resolution and the normalized bar resolution are homotopy equivalent -- a generalization of this result guarantees, among other things, that the Ext functor is well defined. The chapter closes with the computation of the cohomology groups of a cyclic group. The fourth chapter defines resolutions in full generality, as well as projective and injective modules and resolutions. Bar resolutions are shown to be projective, and the proof that they have the same homotopy type is observed to generalize to projective and injective resolutions. At the same time, the definition of group cohomology is extended, as the bar resolution can be replaced by any projective resolution. The chapter also defines exactness of functors, and in particular examines the connection between the exactness of the Hom functor and projective and injective modules. The fifth chapter defines the notion of a right derived functor, and as a special case the Ext functor, the right derived functor of the Hom functor. Since the Hom functor is a bifunctor, it has two right derived functors, and the main result of the chapter shows that they are isomorphic. The definition of group cohomology is extended once more when it is given a definition in terms of the Ext functor, which makes it possible to compute group cohomology also via injective resolutions. The final chapter collects related topics that are touched upon in the text but whose treatment was left outside the thesis for reasons of scope.
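For orientation, the cyclic-group computation referred to in the third chapter gives, with trivial integer coefficients for example, the familiar alternating pattern (a standard result, stated here without the thesis' derivation): \[H^n(\mathbb{Z}/m;\mathbb{Z})\cong\begin{cases}\mathbb{Z}, & n=0,\\ 0, & n\ \text{odd},\\ \mathbb{Z}/m, & n\ \text{even},\ n\geq 2.\end{cases}\]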
  • Suominen, Henri (2021)
Online hypothesis testing occurs in many branches of science. Most notably it is of use when there are too many hypotheses to test with traditional multiple hypothesis testing, or when the hypotheses are created one by one. When testing multiple hypotheses one by one, the order in which the hypotheses are tested often has a great influence on the power of the procedure. In this thesis we investigate the applicability of reinforcement learning tools to the exploration–exploitation problem that often arises in online hypothesis testing. We show that a common reinforcement learning tool, Thompson sampling, can be used to gain a modest amount of power with a method for online hypothesis testing called alpha-investing. Finally, we examine the size of this effect using both synthetic data and a practical case involving simulated data on urban pollution. We found that, by choosing the order of the tested hypotheses with Thompson sampling, the power of alpha-investing is improved. The level of improvement depends on the assumptions that the experimenter is willing to make and on their validity. In a practical situation the presented procedure rejected up to 6.8 percentage points more hypotheses than testing the hypotheses in a random order.
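Alpha-investing itself is easy to sketch. The spending rule and payout constants below are one common Foster-Stine-style formulation, an assumption rather than the thesis' exact procedure; the example shows why the testing order, which Thompson sampling chooses, matters:

```python
def alpha_investing(p_values, w0=0.05, omega=0.025):
    """Alpha-investing sketch (one common formulation; the spending rule
    is an illustrative assumption). Each test spends part of the current
    alpha-wealth; a rejection earns omega back, a non-rejection costs
    alpha / (1 - alpha)."""
    wealth = w0
    rejected = []
    for j, p in enumerate(p_values):
        if wealth <= 0:               # out of alpha-wealth: stop testing
            break
        alpha = wealth / 2            # simple spending rule (assumption)
        if p <= alpha:
            wealth += omega           # reward for a rejection
            rejected.append(j)
        else:
            wealth -= alpha / (1 - alpha)
    return rejected, wealth

# Ordering matters: testing promising hypotheses first keeps the wealth
# high, which is exactly what a Thompson-sampling ordering exploits.
good_first, _ = alpha_investing([1e-4, 1e-4, 0.03, 0.03])   # rejects all 4
good_last, _ = alpha_investing([0.03, 0.03, 1e-4, 1e-4])    # rejects only 2
```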
  • Mustonen, Aleksi (2021)
Electrical impedance tomography is a differential tomography method where current is injected into a domain and the interior distribution of its electrical properties is inferred from measurements of electric potential around the boundary of the domain. Within the context of this imaging method, the forward problem describes a situation where we are trying to deduce voltage measurements on the boundary of a domain given the conductivity distribution of the interior and the current injected into the domain through the boundary. Traditionally the problem has been solved either analytically or by using numerical methods like the finite element method. Analytical solutions have the benefit that they are efficient, but at the same time they have limited practical use, as solutions exist only for a small number of idealized geometries. In contrast, while numerical methods can represent arbitrary geometries, they are computationally more demanding. Many proposed applications of electrical impedance tomography rely on the method's ability to construct images quickly, which in turn requires efficient reconstruction algorithms. While existing methods can achieve near-real-time speeds, exploring and expanding ways of solving the problem even more efficiently, possibly overcoming weaknesses of previous methods, can allow more practical uses for the method. Graph neural networks provide a computationally efficient way of approximating partial differential equations that is accurate, mesh-invariant and applicable to arbitrary geometries. Due to these properties, neural network solutions show promise as alternative methods for solving problems related to electrical impedance tomography. In this thesis we discuss the mathematical foundation of graph neural network approximations of solutions to the electrical impedance tomography forward problem and demonstrate through experiments that these networks are indeed capable of such approximations.
We also highlight some beneficial properties of graph neural network solutions as our network is able to converge to an arguably general solution with only a relatively small training data set. Using only 200 samples with constant conductivity distributions, the network is able to approximate voltage distributions of meshes with spherical inclusions.
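The mesh-invariance of such networks comes from purely local neighbourhood aggregation. A minimal hand-written message-passing step (with illustrative weights, not the thesis' trained network):

```python
def message_passing_step(adj, features, w_self=0.5, w_nbr=0.5):
    """One message-passing step: every node mixes its own feature with the
    mean of its neighbours' features. The weights stand in for learned
    parameters (hypothetical values); the rule is purely local, which is
    what makes graph networks independent of the particular mesh."""
    out = []
    for i, fi in enumerate(features):
        nbrs = [features[j] for j in adj[i]]
        mean_nbr = sum(nbrs) / len(nbrs) if nbrs else 0.0
        out.append(w_self * fi + w_nbr * mean_nbr)
    return out

# A 4-node cycle graph, e.g. four boundary nodes of a tiny 2-D mesh.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
f0 = [1.0, 0.0, 0.0, 0.0]
f1 = message_passing_step(adj, f0)   # the unit feature diffuses outward
```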
  • Aholainen, Kusti (2022)
The purpose of this thesis is to examine the suitability of robust estimators, in particular the BMM estimator, for estimating the parameters of an ARMA(p, q) process. Robust estimators aim to control the influence of deviating observations, i.e. outliers, on the estimates: a robust estimator tolerates outliers in the sense that their presence in the data has no significant effect on the estimates. The protection against outliers, however, usually comes at the cost of efficiency relative to the maximum likelihood method. The BMM estimator is an extension of the MM estimator, introduced by Muler, Peña and Yohai in their article "Robust estimation for ARMA models" (2009). It is based on the BIP-ARMA model, an auxiliary model for the ARMA model in which the influence of the innovation term is restricted by a filter, the idea being to control the effect of outliers occurring in the innovations of the ARMA model. In the thesis, the BMM and MM estimators are compared with two classical methods, maximum likelihood (ML) and least squares (LS). The thesis begins by presenting the necessary concepts from probability theory, time series analysis and robust methods. The reader is introduced to robust estimators and the motivation behind robust methods. Time series containing outliers are treated as realizations of asymptotically contaminated ARMA processes, and the most important outlier processes known in the literature are defined. The computation of the BMM, MM, ML and LS estimators is described, along with the initial-value methods used to choose the starting values for the estimators' minimization algorithms. The theoretical part of the thesis presents theorems and proofs of the consistency and asymptotic normality of the MM estimator. No proof of the corresponding properties is known in the literature for the BMM estimator; the same properties are conjectured to hold for it as well. The results section presents simulations that replicate those of Muler et al. for more complex ARMA models. In the simulations, the BMM and MM estimators are compared with the ML and LS estimators in terms of mean squared error, while also comparing the different initial-value methods. The asymptotic robustness properties of the estimators are also discussed. The computations were implemented in R, with the BMM and MM estimators implemented mainly in C++. The appendix contains the source code needed to compute the BMM and MM estimators.
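As background for the robust-estimation idea (not the BMM estimator itself), a Huber-type M-estimate of location shows how clipping the residuals caps an outlier's influence; the data and the fixed unit scale below are illustrative simplifications:

```python
def huber_psi(r, k=1.345):
    """Huber psi function: identity near zero, clipped to [-k, k] beyond."""
    return max(-k, min(k, r))

def m_location(data, k=1.345, iters=100):
    """Fixed-point iteration for an M-estimate of location with Huber psi.
    Scale is fixed to 1 for simplicity (a deliberate simplification of
    the full ARMA setting)."""
    mu = sorted(data)[len(data) // 2]        # start from a median value
    for _ in range(iters):
        mu += sum(huber_psi(x - mu, k) for x in data) / len(data)
    return mu

data = [0.1, -0.2, 0.3, 0.0, -0.1, 100.0]    # one gross outlier
robust_est = m_location(data)                # stays near the bulk of the data
naive_mean = sum(data) / len(data)           # dragged to ~16.7 by the outlier
```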
  • Pyrylä, Atte (2020)
In this thesis we look at the asymptotic approach to modeling randomly weighted heavy-tailed random variables and their sums. Heavy-tailed distributions, named after the defining property of having more probability mass in the tail than any exponential distribution, are essentially a way to include a large tail risk in a model in a realistic manner. Weighted sums of random variables are a versatile basic structure that can be adapted to model anything from claims over time to the returns of a portfolio, and giving the primary random variables heavy tails is a natural way to integrate extremal events into such models. The methodology introduced in this thesis offers an alternative to some of the prevailing and traditional approaches in risk modeling. Our main result, which we cover in detail, originates from "Randomly weighted sums of subexponential random variables" by Tang and Yuan (2014). It draws an asymptotic connection between the tails of randomly weighted heavy-tailed random variables and the tails of their sums, explicitly stating how the various tail probabilities relate to each other; in effect, it extends the idea that for sums of heavy-tailed random variables large total claims originate from a single source instead of being accumulated from many smaller claims. A great merit of these results is that the random weights are allowed, for the most part, to lack an upper bound and to be arbitrarily dependent on each other. As for applications, we first look at an explicit estimation method for computing extreme quantiles of a loss distribution, yielding values for a common risk measure known as Value-at-Risk. The methodology can easily be adapted to a setting with similar preexisting knowledge, demonstrating a straightforward way of applying the results. We then move on to examine the ruin problem of an insurance company, developing a setting and conditions that can be imposed on the structures to permit an application of our main results, yielding an asymptotic estimate for the ruin probability. Additionally, to be more realistic, we introduce the approach of crude asymptotics, which requires a little less to be known about the primary random variables; we formulate a result similar in fashion to our main result and proceed to prove it.
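The single-big-jump heuristic behind such results can be checked numerically for i.i.d. Pareto summands. A Monte Carlo sketch with illustrative parameters:

```python
import random

def tail_prob_sum(alpha, x, n_terms=2, n_sim=100_000, seed=1):
    """Monte Carlo estimate of P(X_1 + ... + X_n > x) for i.i.d.
    Pareto(alpha) summands (a subexponential, heavy-tailed family)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        if sum(rng.paretovariate(alpha) for _ in range(n_terms)) > x:
            hits += 1
    return hits / n_sim

alpha, x = 1.0, 50.0
estimated = tail_prob_sum(alpha, x)
# "One big jump": for subexponential tails, P(S_n > x) ~ n * P(X_1 > x).
single_jump = 2 * x ** -alpha    # Pareto tail: P(X > x) = x**-alpha, x >= 1
```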
  • Häggblom, Matilda (2022)
Modal inclusion logic is modal logic extended with inclusion atoms. It is the modal variant of first-order inclusion logic, which was introduced by Galliani (2012). Inclusion logic is a main variant of dependence logic (Väänänen 2007). Dependence logic and its variants adopt team semantics, introduced by Hodges (1997). Under team semantics, a modal (inclusion) logic formula is evaluated in a set of states, called a team. The inclusion atom is a type of dependency atom, which describes that the possible values a sequence of formulas can obtain are values of another sequence of formulas. In this thesis, we introduce a sound and complete natural deduction system for modal inclusion logic, which is currently missing in the literature. The thesis consists of an introductory part, in which we recall the definitions and basic properties of modal logic and modal inclusion logic, followed by two main parts. The first part concerns the expressive power of modal inclusion logic. We review the result of Hella and Stumpf (2015) that modal inclusion logic is expressively complete: a class of Kripke models with teams is closed under unions, closed under k-bisimulation for some natural number k, and has the empty team property if and only if the class can be defined with a modal inclusion logic formula. Through the expressive completeness proof, we obtain characteristic formulas for classes with these three properties. This also provides a normal form for formulas of modal inclusion logic. The proof of this result is due to Hella and Stumpf, and we suggest a simplification to the normal form by making it similar to the normal form introduced by Kontinen et al. (2014). In the second part, we introduce a sound and complete natural deduction proof system for modal inclusion logic. Our proof system builds on the proof systems defined for modal dependence logic and propositional inclusion logic by Yang (2017, 2022). We show the completeness theorem using the normal form of modal inclusion logic.
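The team semantics of the inclusion atom can be made concrete in a propositional setting (a simplification to variables rather than arbitrary formulas): a ⊆ b holds in a team when every value taken by the tuple a also occurs as a value of the tuple b in some assignment of the team.

```python
def satisfies_inclusion(team, a, b):
    """Team semantics of the inclusion atom a ⊆ b: every value the tuple
    of variables a takes in the team must also occur as a value of the
    tuple b under some (possibly different) assignment of the team."""
    a_values = {tuple(s[x] for x in a) for s in team}
    b_values = {tuple(s[x] for x in b) for s in team}
    return a_values <= b_values

# A team is a set of assignments; here assignments are dicts.
team = [
    {"p": 0, "q": 0},
    {"p": 1, "q": 0},
]
ok = satisfies_inclusion(team, ["q"], ["p"])   # {0} is a subset of {0, 1}
no = satisfies_inclusion(team, ["p"], ["q"])   # {0, 1} is not a subset of {0}
```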
  • Kukkola, Johanna (2022)
Can a day be classified to the correct season on the basis of its hourly weather observations using a neural network model, and how accurately can this be done? This is the question this thesis aims to answer. The weather observation data was retrieved from the Finnish Meteorological Institute's website, and it includes the hourly weather observations from the Kumpula observation station from the years 2010–2020. The weather observations used for the classification were cloud amount, air pressure, precipitation amount, relative humidity, snow depth, air temperature, dew-point temperature, horizontal visibility, wind direction, gust speed and wind speed. There are four distinct seasons that can be experienced in Finland. In this thesis the seasons were defined as three-month periods, with winter consisting of December, January and February; spring of March, April and May; summer of June, July and August; and autumn of September, October and November. The days in the weather data were classified into these seasons with a convolutional neural network model. The model included a convolutional layer followed by a fully connected layer, with the width of both layers being 16 nodes. The accuracy of the classification with this model was 0.80. The model performed better than a multinomial logistic regression model, which had an accuracy of 0.75. It can be concluded that the classification task was satisfactorily successful. An interesting finding was that neither model ever confused summer and winter with each other.
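The reported numbers can be read off a confusion matrix; the counts below are hypothetical, chosen only to mirror the observation that winter and summer were never confused:

```python
def accuracy(confusion):
    """Overall accuracy from a confusion matrix
    (rows = true class, columns = predicted class)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical counts for (winter, spring, summer, autumn); note the
# zeros in the winter/summer cells.
conf = [
    [80, 10,  0, 10],   # true winter
    [ 8, 70, 10, 12],   # true spring
    [ 0, 10, 82,  8],   # true summer
    [ 6, 12,  8, 74],   # true autumn
]
acc = accuracy(conf)    # 306 correct out of 400
```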
  • Virtanen, Jussi (2022)
In this thesis we assess the ability of two different models to predict cash flows in private credit investment funds. One model is stochastic and the other deterministic, which makes them quite different. The data obtained for the analysis is divided into three subsamples: mature funds, liquidated funds and all funds. The full data set consists of 62 funds, the subsample of mature funds of 36 funds, and the subsample of liquidated funds of 17 funds. Both models are fitted to all subsamples. The parameters of the models are estimated with different techniques: those of the Stochastic model with the conditional least squares method, and those of the Yale model with numerical methods. After the estimation, the parameter values are explained in detail and their effect on the cash flows is investigated. This helps to understand which properties of the cash flows the models are able to capture. In addition, we assess both models' ability to predict future cash flows. This is done by using the coefficient of determination, QQ-plots and a comparison of predicted and observed cumulative cash flows. With the coefficient of determination we measure how well the models explain the variation between the observed and predicted values. With QQ-plots we examine whether the values produced by the process follow the normal distribution. Finally, with the cumulative cash flows of contributions and distributions we determine whether the models are able to predict the cumulative committed capital and the returns of the fund in the form of distributions. The results show that the Stochastic model performs better in its predictions of contributions and distributions, although not for every subsample: the Yale model does better on the cumulative contributions of the subsample of mature funds. However, the flexibility of the Stochastic model makes it better suited to different types of cash flows and subsamples. It is therefore suggested that the Stochastic model be used in the prediction and modelling of private credit funds. It is harder to implement than the Yale model, but it provides more accurate predictions.
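The deterministic model can be sketched along the lines of the Takahashi-Alexander ("Yale") framework; the update rules and parameter values below are illustrative assumptions, not the thesis' fitted specification:

```python
def yale_model(commitment, rc, bow, life, growth):
    """Takahashi-Alexander ("Yale") style cash-flow sketch.
    rc: contribution rate on unfunded commitment; bow: shape of the
    distribution-rate curve; growth: NAV growth rate per period.
    All values are illustrative assumptions."""
    nav, unfunded = 0.0, commitment
    contrib, distrib = [], []
    for t in range(1, life + 1):
        c = rc * unfunded                 # call a fraction of the unfunded
        unfunded -= c
        rd = (t / life) ** bow            # distribution rate rises to 1
        grown = nav * (1 + growth) + c    # NAV grows, then receives the call
        d = rd * grown                    # distribute a rising share of NAV
        nav = grown - d
        contrib.append(c)
        distrib.append(d)
    return contrib, distrib, nav

contrib, distrib, nav_end = yale_model(
    100.0, rc=0.25, bow=2.5, life=12, growth=0.08)
```

At the final period the distribution rate reaches 1, so the fund is fully liquidated and the ending NAV is zero by construction.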
  • Lundström, Teemu (2022)
Spatial graphs are graphs that are embedded in three-dimensional space. The study of such graphs is closely related to knot theory, but it is also motivated by practical applications, such as the linking of DNA and the study of chemical compounds. The Yamada polynomial is one of the most commonly used invariants of spatial graphs, as it gives a lot of information about how a graph sits in the space. However, computing the polynomial from a given graph can be computationally demanding. In this thesis, we study the Yamada polynomial of symmetrical spatial graphs. In addition to being symmetrical, the graphs we study have a layer-like structure which allows certain transfer-matrix methods to be applied. The idea is to express the polynomial of a graph with n layers in terms of graphs with n − 1 layers, which then allows one to obtain the polynomial of the original graph by computing powers of the so-called transfer matrix. We introduce the Yamada polynomial and prove various properties related to it. We study two families of graphs and compute their Yamada polynomials. In addition to this, we introduce a new notational technique which allows one to ignore the crossings of certain spatial graphs and turn them into normal plane graphs with labelled edges. We prove various results related to this notation and show how it can be used to obtain the Yamada polynomial of these kinds of graphs. We also give a sketch of an algorithm with which one could, at least in principle, obtain the Yamada polynomials of larger families of graphs.
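The transfer-matrix mechanism itself can be shown with polynomial-valued matrix entries; the actual Yamada bookkeeping is far more involved, and the 2×2 matrix below is a hypothetical single-layer contribution:

```python
def poly_mul(p, q):
    """Multiply sparse polynomials represented as {exponent: coefficient}."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out

def poly_add(p, q):
    out = dict(p)
    for i, b in q.items():
        out[i] = out.get(i, 0) + b
        if out[i] == 0:
            del out[i]
    return out

def mat_mul(M, N):
    """Matrix product with polynomial entries: stacking one more layer."""
    n = len(M)
    def dot(i, j):
        acc = {}
        for k in range(n):
            acc = poly_add(acc, poly_mul(M[i][k], N[k][j]))
        return acc
    return [[dot(i, j) for j in range(n)] for i in range(n)]

# Hypothetical 2-state transfer matrix in one variable A:
# entries 1 and A encode two local possibilities within a layer.
T = [[{0: 1}, {1: 1}],    # [ 1  A ]
     [{1: 1}, {0: 1}]]    # [ A  1 ]

T2 = mat_mul(T, T)        # combined contribution of two stacked layers
```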
  • Rautio, Siiri (2019)
    Improving the quality of medical computed tomography reconstructions is an important research topic today, as low-dose imaging is pursued to minimize the X-ray radiation inflicted on patients. Using lower radiation doses for imaging leads to noisier reconstructions, which then require postprocessing, such as denoising, to make the data up to par for diagnostic purposes. Reconstructing the data using iterative algorithms produces higher-quality results, but they are computationally costly and not quite powerful enough to be used as such for medical analysis. Recent advances in deep learning have demonstrated the great potential of convolutional neural networks in various image processing tasks. Performing image denoising with deep neural networks can produce high-quality, virtually noise-free predictions from images originally corrupted with noise, in a computationally efficient manner. In this thesis, we survey the topics of computed tomography and deep learning for the purpose of applying a state-of-the-art convolutional neural network to denoising dental cone-beam computed tomography reconstructions. We investigate how the denoising results of a deep neural network are affected if iteratively reconstructed images are used in training the network, as opposed to traditionally reconstructed images. The results show that training data reconstructed using iterative methods notably improves the denoising results of the network. We also believe these results can be further improved and extended beyond the case of cone-beam computed tomography and the field of medical imaging.
  • Ronkainen, Arttu (2023)
    Gaussian processes are stochastic processes whose finite subsets follow a multivariate normal distribution. Models based on them are popular in Bayesian statistics, since they allow flexible modelling of complex temporal or spatial dependencies. In Gaussian latent variable models, the observations are assumed to follow a conditional distribution that depends on the values of a latent process with a Gaussian prior. When the data consists of categorical values, Gaussian latent variable models are computationally difficult, because the posterior distribution of the latent variables can usually not be handled analytically. Posterior inference must then rely on analytical approximations or numerical methods. The computational difficulties are compounded when the parameters of the covariance function of the latent Gaussian variable are themselves given a prior distribution. This thesis discusses approximate methods that can be used for posterior inference in Gaussian latent variable models. The focus is mainly on the multi-class classification model with the softmax observation model, but most of the presented ideas also apply to other observation models. Three approximate methods are considered. The first is the sampling-based Markov chain Monte Carlo method, widely used in Bayesian statistics, which is asymptotically exact but computationally heavy. The second uses an analytical approximation of the latent-variable posterior known as the Laplace approximation, together with a Markov chain Monte Carlo method. The third combines the Laplace approximation with point estimation of the hyperparameters. The theory underlying these methods is presented in minimal form, after which the performance of the approximate methods is compared on simulated data in the multi-class classification model. The comparison shows the effect of the Laplace approximation on the posterior distributions of both the hyperparameters and the latent variable.
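    As a minimal one-dimensional illustration of the Laplace approximation used in the second and third methods, the sketch below approximates a posterior by a Gaussian centred at its mode; the log-posterior here is a made-up example (Gaussian prior times logistic likelihood), not a model from the thesis:

```python
import numpy as np

def f(x):
    """Hypothetical unnormalised log-posterior: N(0,1) prior + log-sigmoid."""
    return -0.5 * x**2 - np.log(1.0 + np.exp(-x))

def df(x):
    """First derivative of f."""
    return -x + 1.0 / (1.0 + np.exp(x))

def d2f(x):
    """Second derivative of f (always negative: f is concave)."""
    return -1.0 - np.exp(x) / (1.0 + np.exp(x)) ** 2

# Newton's method finds the posterior mode x*.
x = 0.0
for _ in range(50):
    x -= df(x) / d2f(x)

mode = x
var = -1.0 / d2f(mode)   # Laplace: curvature at the mode gives the variance
print(mode, var)          # Gaussian approximation N(mode, var)
```

In the latent-variable setting of the thesis, the same idea is applied with a multivariate mode and the negative inverse Hessian in place of the scalar curvature.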
  • Laiho, Aleksi (2022)
    In statistics, data can often be high-dimensional, with the number of variables very large, often larger than the number of samples. In such cases, a relevant configuration of significant variables often needs to be selected. One such case is in genetics, especially in genome-wide association studies (GWAS). Various statistical methods exist for selecting the relevant variables from high-dimensional data, many of them rooted in Bayesian statistics. This thesis reviews and compares two such methods, FINEMAP and the Sum of Single Effects (SuSiE). The methods are reviewed according to their accuracy in identifying the relevant configurations of variables and their computational efficiency, especially when there are high inter-variable correlations within the dataset. The methods are also compared to more conventional variable selection methods, such as LASSO. The results show that both FINEMAP and SuSiE outperform LASSO in terms of selection accuracy and efficiency, with FINEMAP producing slightly more accurate results at the expense of computation time compared to SuSiE. These results can serve as guidelines for selecting an appropriate variable selection method based on the study and the data.
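    LASSO, the conventional baseline against which the two Bayesian methods are compared, can be sketched with cyclic coordinate descent and soft-thresholding; the data below is synthetic and only illustrates how the L1 penalty zeroes out irrelevant coefficients:

```python
import numpy as np

def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrinks toward zero, clips at zero."""
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam, iters=200):
    """LASSO via cyclic coordinate descent on 0.5*||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ r
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j])
    return beta

# Synthetic data: only variables 0 and 3 truly matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=100)

beta = lasso_cd(X, y, lam=5.0)
print(beta)   # near-zero entries mark variables selected away
```

FINEMAP and SuSiE instead place explicit sparsity priors on the configuration of causal variables, which yields posterior probabilities rather than a single point estimate.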
  • Silomaa, Pyry (2024)
    This thesis is an empirical comparison of various methods of statistical matching applied to Finnish income and consumption data. The comparison is performed in order to map out possible matching strategies for Statistics Finland to use in this imputation task and to compare the applicability of the strategies within specific datasets. For Statistics Finland, the main point of performing these imputations is to assess consumption behaviour in years when consumption-related data is not explicitly collected. Within this thesis I compared the imputation of 12 consumption variables as well as their sum using the following matching methods: draws from the conditional distribution, distance hot deck, predictive mean matching, local residual draws and a gradient boosting approach. The donor dataset is a sample of households collected for the 2016 Finnish Household Budget Survey (HBS). The recipient dataset is a sample of households collected for the 2019 Finnish Survey of Income and Living Conditions (EU-SILC). To assess the quality of the imputations, I used numerical and visual assessments of the similarity of the weighted distributions of the consumption variables. The numerical assessments were the Kolmogorov-Smirnov (KS) test statistic and the Hellinger distance (HD), the latter calculated for a categorical transformation of the consumption variables. Additionally, the similarities of the correlation matrices were assessed using the correlation matrix distance. Generally, distance hot deck and predictive mean matching fared relatively well in the imputation tasks. For example, in the imputation of transport-related expenditure, both produced KS test statistics of approximately 0.01-0.02 and an HD of approximately 0.05, whereas the next best-performing method received scores of 0.04 and 0.09, representing slightly larger discrepancies.
Comparing the two methods, particularly in the imputation of semicontinuous consumption variables, distance hot deck fared notably better than predictive mean matching. As an example, in the consumption expenditure on alcoholic beverages and tobacco, distance hot deck produced a KS test statistic and HD of approximately 0.01 and 0.02 respectively, whereas the corresponding scores for predictive mean matching were 0.21 and 0.16. Ultimately, I would recommend considering both predictive mean matching and distance hot deck for further application, depending on the imputation task: predictive mean matching is more easily applied in different contexts, but in certain kinds of imputation tasks distance hot deck clearly outperforms it. Further assessment of this data should be done; in particular, the results should be validated with additional data.
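    The two numerical assessments can be sketched as follows; this minimal version ignores the survey weights used in the thesis and works on plain samples and category counts:

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical CDFs."""
    x, y = np.sort(x), np.sort(y)
    grid = np.concatenate([x, y])                       # all jump points
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return np.max(np.abs(cdf_x - cdf_y))

def hellinger(p, q):
    """Hellinger distance between two discrete distributions
    (given as unnormalised category counts)."""
    p = np.asarray(p, float); q = np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Hypothetical observed vs. imputed expenditure samples and categories.
print(ks_statistic(np.array([0.0, 1.0, 2.0]), np.array([0.5, 1.5, 2.5])))
print(hellinger([40, 30, 30], [35, 35, 30]))
```

Both statistics lie in [0, 1], with 0 meaning identical distributions, which is what makes them comparable across the 12 consumption variables.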
  • Mäkinen, Sofia (2023)
    In this thesis we consider the inverse problem for the one-dimensional wave equation. That is, we would like to recover the velocity function, the wave speed, from the equation given Neumann and Dirichlet boundary conditions, when the solution to the equation is known. It has been shown that an operator Λ corresponding to the boundary conditions determines the volumes of the domains of influence, the sets where the travel time for the wave is limited, and that these volumes in turn determine the velocity function. We present theorems and propositions about determining the wave speed and prove a few of them. Artificial neural networks are a form of machine learning widely used in various applications. It has previously been proven that a one-layer feedforward neural network with a non-polynomial activation function, under some additional constraints on the activation function, can approximate any continuous real-valued function. In this thesis we present a proof of this result for a continuous non-polynomial activation function. Furthermore, we apply two neural network architectures to the volume inversion problem, i.e. we train the networks to approximate a single volume when the operator Λ is given. The neural networks in question are the feedforward neural network and the operator recurrent neural network. Before the volume inversion problem, we consider the simpler problem of finding the inverse of a small invertible matrix. Finally, we compare the performances of the two neural networks on both the volume and matrix inversion problems.
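    The universal-approximation result mentioned above can be illustrated with a one-hidden-layer network and a non-polynomial activation; in this sketch the hidden weights are fixed at random and only the output layer is fitted by least squares, which is not the training setup used in the thesis:

```python
import numpy as np

# One hidden layer, tanh activation (non-polynomial), continuous target.
rng = np.random.default_rng(1)
x = np.linspace(-np.pi, np.pi, 200)[:, None]
target = np.sin(x).ravel()            # continuous function to approximate

W = rng.normal(scale=2.0, size=(1, 50))   # hidden-layer weights (fixed)
b = rng.normal(scale=2.0, size=50)        # hidden-layer biases (fixed)
H = np.tanh(x @ W + b)                    # hidden activations

# Solve only the output layer by least squares.
coef, *_ = np.linalg.lstsq(H, target, rcond=None)
max_err = np.max(np.abs(H @ coef - target))
print(max_err)   # uniform error of the network over the interval
```

Even with random hidden weights, 50 tanh units approximate the target closely, which is the qualitative content of the approximation theorem.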
  • Järvinen, Vili (2024)
    This thesis treats the connection between two well-known systems of equations in fluid dynamics, the Navier-Stokes equations and the Euler equations, and the conditions under which solutions of the former converge to solutions of the latter. The first chapter presents the background needed for the thesis, including the definitions of the weak derivative and Sobolev spaces, as well as several important function spaces and trace theorems. The second chapter treats the Navier-Stokes and Euler equations in more detail: the Navier-Stokes equations are first defined, then the notion of existence of a solution is presented, and the chapter closes with the definition of the Euler equations. The fourth chapter presents the main topic of the thesis, the connection between solutions of the Navier-Stokes and Euler equations as the viscosity term tends to zero. The chapter presents a result of Tosio Kato giving conditions equivalent to a weak solution of the Navier-Stokes equations converging, as the viscosity vanishes, to a solution of the Euler equations. This result is proved in detail in the thesis. Finally, the last chapter presents James P. Kelliher's additions to Kato's results, which show that the gradient ∇u of the Navier-Stokes solution u can be replaced by the vorticity ω(u). As in the previous chapter, this result too is presented in detail. The thesis requires broad understanding of several areas of mathematics. The second chapter largely uses methods of analysis, touching among other things on functional analysis and the theory of function spaces. The third and fourth chapters focus largely on the theory of partial differential equations, and topics from real analysis are used extensively throughout. The main sources are Lawrence C. Evans's "Partial Differential Equations", Tosio Kato's article "Remarks on Zero Viscosity Limit" and James P. Kelliher's article "On Kato's conditions for vanishing viscosity limit".
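    For reference, the incompressible equations in question and an informal sketch of Kato's vanishing-viscosity criterion (stated here without the precise function-space assumptions of the thesis):

```latex
% Incompressible Navier-Stokes (viscosity \nu > 0); Euler is the case \nu = 0:
\begin{aligned}
  \partial_t u + (u \cdot \nabla)u - \nu \Delta u + \nabla p &= 0, \\
  \nabla \cdot u &= 0.
\end{aligned}
% One of Kato's equivalent conditions: the Navier-Stokes solutions u^\nu
% converge to the Euler solution iff the energy dissipation in a boundary
% layer \Gamma_{c\nu} of width proportional to \nu vanishes:
\nu \int_0^T \int_{\Gamma_{c\nu}} |\nabla u^\nu|^2 \, dx \, dt
  \;\longrightarrow\; 0 \quad \text{as } \nu \to 0 .
```

Kelliher's refinement discussed in the last chapter replaces the gradient ∇u in this criterion by the vorticity ω(u).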