
Browsing by master's degree program "Magisterprogrammet i matematik och statistik"


  • Ronkainen, Arttu (2023)
    Gaussian processes are stochastic processes any finite collection of which follows a multivariate normal distribution. Models based on them are popular in Bayesian statistics, since they allow flexible modelling of complex temporal or spatial dependencies. In Gaussian latent variable models the observations are assumed to follow a conditional distribution that depends on the values of a latent process with a Gaussian prior. When the observations are categorical, Gaussian latent variable models are computationally demanding, because the posterior distribution of the latent variables is generally analytically intractable. Posterior inference must then rely on analytical approximations or numerical methods. The computational difficulties are compounded when the parameters of the covariance function of the latent Gaussian variable are themselves given a prior distribution. This thesis discusses approximate methods that can be used for posterior inference in Gaussian latent variable models. The focus is mainly on a multiclass classification model whose observation model is the softmax function, but many of the ideas presented also apply to other observation models. Three approximate methods are considered. The first is the Markov chain Monte Carlo method, a sampling-based method widely used in Bayesian statistics that is asymptotically exact but computationally heavy. The second combines an analytical approximation of the latent-variable posterior, known as the Laplace approximation, with the Markov chain Monte Carlo method. The third combines the Laplace approximation with point estimation of the hyperparameters. The theory underlying these methods is presented only briefly, after which the performance of the approximate methods is compared in the multiclass classification model on simulated data. The comparison reveals the effect of the Laplace approximation on the posterior distributions of both the hyperparameters and the latent variable.
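A minimal sketch of the Laplace approximation step discussed above, written for the binary logistic case rather than the thesis's softmax model (the multiclass version differs mainly in the likelihood terms). The Newton recursion below is the standard one for Gaussian-process classification; all names are illustrative:

```python
import numpy as np

def laplace_mode(K, y, n_iter=25):
    """Newton iteration for the mode of p(f | y) with GP prior N(0, K)
    and Bernoulli-logistic likelihood p(y=1 | f) = sigmoid(f)."""
    f = np.zeros(len(y))
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-f))        # predicted class probabilities
        W = pi * (1.0 - pi)                  # negative log-lik Hessian (diagonal)
        grad = y - pi                        # log-likelihood gradient
        # f_new = (K^{-1} + W)^{-1} (W f + grad), rewritten without inverting K
        B = np.eye(len(y)) + W[:, None] * K  # I + W K
        f = K @ np.linalg.solve(B, W * f + grad)
    return f, W  # Laplace posterior: N(f, (K^{-1} + diag(W))^{-1})
```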
  • Laiho, Aleksi (2022)
    In statistics, data can often be high-dimensional, with a very large number of variables, often larger than the number of samples. In such cases, selection of a relevant configuration of significant variables is often needed. One such case is in genetics, especially genome-wide association studies (GWAS). To select the relevant variables from high-dimensional data, there exist various statistical methods, many of them relating to Bayesian statistics. This thesis aims to review and compare two such methods, FINEMAP and Sum of Single Effects (SuSiE). The methods are reviewed according to their accuracy in identifying the relevant configurations of variables and their computational efficiency, especially in the case where there exist high inter-variable correlations within the dataset. The methods were also compared to more conventional variable selection methods, such as LASSO. The results show that both FINEMAP and SuSiE outperform LASSO in terms of selection accuracy and efficiency, with FINEMAP producing slightly more accurate results at the expense of computation time compared to SuSiE. These results can be used as guidelines in selecting an appropriate variable selection method based on the study and data.
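As a point of reference for the LASSO baseline mentioned above, here is a minimal variable-selection sketch on simulated data; the dimensions and the use of scikit-learn's LassoCV are illustrative assumptions, not the thesis's setup:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, k = 200, 1000, 5                    # n samples, p variables, k true effects
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[rng.choice(p, size=k, replace=False)] = 1.0
y = X @ beta + rng.standard_normal(n)

# Cross-validated LASSO; the selected configuration is the nonzero support.
lasso = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print("selected variables:", selected)
```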
  • Silomaa, Pyry (2024)
    This thesis is an empirical comparison of various methods of statistical matching applied to Finnish income and consumption data. The comparison is performed in order to map out some possible matching strategies for Statistics Finland to use in this imputation task and to compare the applicability of the strategies within specific datasets. For Statistics Finland, the main point of performing these imputations is in assessing consumption behaviour in years when consumption-related data is not explicitly collected. Within this thesis I compared the imputation of consumption data by imputing 12 consumption variables as well as their sum using the following matching methods: draws from the conditional distribution, distance hot deck, predictive mean matching, local residual draws and a gradient boosting approach. The donor dataset is a sample of households collected for the 2016 Finnish Household Budget Survey (HBS). The recipient dataset is a sample of households collected for the 2019 Finnish Survey of Income and Living Conditions (EU-SILC). In order to assess the quality of the imputations, I used numerical and visual assessments concerning the similarity of the weighted distributions of the consumption variables. The applied numerical assessments were the Kolmogorov-Smirnov (KS) test statistic as well as the Hellinger Distance (HD), the latter of which was calculated for a categorical transformation of the consumption variables. Additionally, the similarities of the correlation matrices were assessed using correlation matrix distance. Generally, distance hot deck and predictive mean matching fared relatively well in the imputation tasks. For example, in the imputation of transport-related expenditure, both produced KS test statistics of approximately 0.01-0.02 and HD of approximately 0.05, whereas the next best-performing method received scores of 0.04 and 0.09, thus representing slightly larger discrepancies. Comparing the two methods, particularly in the imputation of semicontinuous consumption variables, distance hot deck fared notably better than the predictive mean matching approach. As an example, in the consumption expenditure of alcoholic beverages and tobacco, distance hot deck produced values of the KS test statistic and HD of approximately 0.01 and 0.02 respectively, whereas the corresponding scores for predictive mean matching were 0.21 and 0.16. Ultimately, I would recommend considering both predictive mean matching and distance hot deck for further application, depending on the imputation task. This is because predictive mean matching can be applied more easily in different contexts, but in certain kinds of imputation tasks distance hot deck clearly outperforms predictive mean matching. Further assessment of this data should be done; in particular, the results should be validated with additional data.
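Of the matching methods compared above, predictive mean matching is the easiest to sketch compactly. The following illustrative implementation (linear working model, a single consumption variable, no survey weights) shows the core idea of matching on predicted means and copying an observed donor value:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def pmm_impute(X_donor, y_donor, X_recipient):
    """Predictive mean matching: regress y on X in the donor data,
    predict for both datasets, and give each recipient the observed y
    of the donor with the closest predicted mean."""
    model = LinearRegression().fit(X_donor, y_donor)
    pred_d = model.predict(X_donor)
    pred_r = model.predict(X_recipient)
    # nearest-donor match on the predicted values
    idx = np.abs(pred_r[:, None] - pred_d[None, :]).argmin(axis=1)
    return y_donor[idx]
```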
  • Mäkinen, Sofia (2023)
    In this thesis we consider the inverse problem for the one-dimensional wave equation. That is, we would like to recover the velocity function, the wave speed, from the equation given Neumann and Dirichlet boundary conditions, when the solution to the equation is known. It has been shown that an operator Λ corresponding to the boundary conditions determines the volumes of the domain of influence, which is the set where the travel time for the wave is limited. These volumes then in turn determine the velocity function. We present some theorems and propositions about determining the wave speed and present proofs for a few of them. Artificial neural networks are a form of machine learning widely used in various applications. It has previously been proven that a one-layer feedforward neural network with a non-polynomial activation function, under some additional constraints on the activation function, can approximate any continuous real-valued function. In this thesis we present a proof of this result for a continuous non-polynomial activation function. Furthermore, in this thesis we apply two neural network architectures to the volume inversion problem, which means that we train the networks to approximate a single volume when the operator Λ is given. The neural networks in question are the feedforward neural network and the operator recurrent neural network. Before the volume inversion problem, we consider a simpler problem of finding the inverse matrix of a small invertible matrix. Finally, we compare the performances of these two neural networks for both the volume and matrix inversion problems.
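The approximation result cited above can be stated as a single worked formula (informally following the non-polynomial condition of Leshno et al.; the thesis gives the precise hypotheses):

```latex
% One-hidden-layer universal approximation, \sigma continuous and non-polynomial:
\forall f \in C(K),\ K \subset \mathbb{R}^d \ \text{compact},\ \forall \varepsilon > 0\
\exists N,\ c_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^d :
\quad \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} c_i\, \sigma(w_i \cdot x + b_i) \Big| < \varepsilon .
```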
  • Järvinen, Vili (2024)
    This thesis concerns the connection between two well-known systems of equations in fluid dynamics, the Navier-Stokes equations and the Euler equations, and the conditions under which solutions of the former converge to solutions of the latter. The first chapter presents the prerequisites, including the definitions of the weak derivative and Sobolev spaces, several important function spaces, and trace theorems. The second chapter treats the Navier-Stokes equations and the Euler equations in more detail. It first presents the definition of the Navier-Stokes equations and then a notion of existence of solutions; the chapter closes with the definition of the Euler equations. The fourth chapter presents the main topic of the thesis, namely the connection between solutions of the Navier-Stokes and Euler equations as the viscosity term tends to zero. The chapter presents a result of Tosio Kato giving conditions equivalent to the weak solution of the Navier-Stokes equations converging, as the viscosity vanishes, to a solution of the Euler equations. This result is proved in detail in the thesis. Finally, the last chapter presents James P. Kelliher's additions to Kato's results, which show that the gradient ∇u of the Navier-Stokes solution u can be replaced by the vorticity ω(u) of the solution u. As in the previous chapter, this result too is presented in detail. The thesis requires a broad understanding of several areas of mathematics. The second chapter largely uses methods of analysis, touching among other things on functional analysis and the theory of function spaces. The third and fourth chapters focus largely on the theory of partial differential equations, and topics from real analysis are also treated extensively. The main sources are Lawrence C. Evans's "Partial Differential Equations", Tosio Kato's article "Remarks on Zero Viscosity Limit" and James P. Kelliher's article "On Kato's conditions for vanishing viscosity limit".
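For orientation, the Kato-type criterion discussed above has the following shape (stated informally; the precise function spaces and constants are in Kato's and Kelliher's papers):

```latex
% With u^\nu the Navier-Stokes solution, \bar{u} the Euler solution, and
% \Gamma_{c\nu} a boundary strip of width proportional to \nu:
u^\nu \to \bar{u} \ \text{in } L^\infty(0,T;L^2(\Omega))
\quad \Longleftrightarrow \quad
\nu \int_0^T \!\! \int_{\Gamma_{c\nu}} |\nabla u^\nu|^2 \, dx \, dt \;\to\; 0
\quad (\nu \to 0),
% and by Kelliher's refinement \nabla u^\nu may be replaced by the
% vorticity \omega(u^\nu).
```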
  • Shabani, Mirjeta (2024)
    A continuous-time Markov chain is a stochastic process which has the Markov property. The Markov property states that the transition to the next state of the process only depends on the current state, that is, it does not depend on the process’ preceding states. Continuous-time Markov chains are fundamental tools for modelling stochastic systems in finance and insurance, such as option pricing and modelling insurance claim processes. This thesis examines continuous-time Markov chains and their most important concepts and typical properties. For instance, we introduce and investigate the Kolmogorov forward and backward equations, which are essential for continuous-time systems. However, the main aim of the thesis is to present a method, with proof, for constructing a Markov process from a continuous transition intensity matrix. This is achieved by generating a transition probability matrix from a given transition intensity matrix. When the transition intensities are known, the challenge is to determine the transition probabilities, since the calculations can easily become difficult to solve analytically. Through the introduced theorem it becomes possible to simplify the calculations by approximations. In this thesis, we also present applications of the theory. We demonstrate how determining transition probabilities using Kolmogorov’s forward equations can become challenging even in a simple setup. Furthermore, we compare the approximations of transition probabilities derived from the main theorem to the actual transition probabilities. We make observations about the theorem’s transition probability function; the approximations derived from the main theorem provide quite satisfactory estimates of the actual transition probabilities.
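A small numerical sketch of the approximation idea described above, assuming a time-homogeneous chain so that P(t) = exp(tQ); the intensity matrix and step count are illustrative:

```python
import numpy as np
from scipy.linalg import expm

# A 3-state transition intensity matrix Q (rows sum to zero).
Q = np.array([[-0.30,  0.20,  0.10],
              [ 0.05, -0.15,  0.10],
              [ 0.00,  0.08, -0.08]])

t, n = 1.0, 1000
P_exact = expm(Q * t)                       # P(t) = exp(tQ) in the homogeneous case
# Euler-type product approximation from short intervals: P(t) ~ (I + (t/n) Q)^n
P_approx = np.linalg.matrix_power(np.eye(3) + (t / n) * Q, n)
print(np.abs(P_exact - P_approx).max())     # small for large n
```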
  • Takanen, Emilia (2023)
    This thesis proves the Deligne–Mumford compactification theorem. The theorem states that, under certain conditions, for a sequence of hyperbolic surfaces of the same signature there exists a limit surface, together with diffeomorphisms to each member of the sequence, such that the metrics pulled back by these diffeomorphisms converge on the limit surface to a limit metric. The theorem is proved by first establishing the corresponding result for the simple building blocks of hyperbolic surfaces. The thesis first proves the corresponding result for the canonical collars of a Y-piece and uses it to prove the corresponding result for Y-pieces themselves. It is then proved that every hyperbolic surface can be assembled from Y-pieces, which immediately yields the result for all hyperbolic surfaces.
  • Miinalainen, Lumi (2024)
    Smooth manifolds extend the tools of mathematical analysis from Euclidean spaces to more general topological spaces. De Rham's theorem adds to this a connection to algebraic topology by showing that certain topological invariants of manifolds can be characterized either analytically or topologically. In other words, the analytic properties of a manifold reveal something about its topological properties and vice versa. This thesis presents two proofs of de Rham's theorem. The first proves the theorem in its classical form, which requires only a basic understanding of manifolds and singular homology. The second proof is formulated very generally in terms of sheaves; the necessary sheaf theory is developed almost entirely in the text. This structure divides the text naturally in two. The first part briefly reviews the basics of de Rham cohomology and singular homology. Next, singular cohomology and the integration of chains on manifolds are introduced, leading to the proof of the classical de Rham theorem. The second part first introduces the theory of presheaves and sheaves. Sheaf cohomology theories and their connection to the cohomology groups of the first part are then presented. Finally it is shown that all sheaf cohomology theories are uniquely isomorphic to one another. In the case of de Rham cohomology and singular cohomology, a straightforward construction of this isomorphism is also given.
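The classical statement proved in the first part can be written as a single worked formula: integration of differential forms over smooth singular chains induces the map

```latex
I : H^k_{\mathrm{dR}}(M) \longrightarrow H^k(M; \mathbb{R}),
\qquad I[\omega][c] = \int_c \omega ,
% well-defined by Stokes' theorem, and an isomorphism for every smooth
% manifold M and every k \ge 0.
```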
  • Särkijärvi, Joona (2023)
    Both descriptive combinatorics and distributed algorithms are interested in solving graph problems with certain local constraints. This connection is not just superficial, as Bernshteyn showed in his seminal 2020 paper. This thesis focuses on that connection by restating the results of Bernshteyn. This work shows that a common theory of locality connects these fields. We also restate the results connecting these findings to continuous dynamics, where solving a colouring problem on the free part of the subshift 2^Γ turns out to be equivalent to the existence of a fast LOCAL algorithm solving this problem on finite sections of the Cayley graph of Γ. We also restate Bernshteyn's result on the continuous version of the Lovász Local Lemma (LLL). The LLL is a powerful probabilistic tool used throughout combinatorics and distributed computing. Bernshteyn proved a version of the lemma that, under certain topological constraints, produces continuous solutions.
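For reference, the symmetric finite form of the LLL mentioned above reads as follows (the continuous version proved by Bernshteyn adds topological hypotheses on top of this):

```latex
% Symmetric Lov\'asz Local Lemma: if each event A_i satisfies \Pr[A_i] \le p
% and each A_i depends on at most d of the others, then
e \, p \,(d+1) \le 1
\;\;\Longrightarrow\;\;
\Pr\Big[\,\bigcap_i \overline{A_i}\,\Big] > 0 .
```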
  • Kauppala, Tuuli (2021)
    Children’s height and weight development remains a subject of interest especially due to the increasing prevalence of overweight and obesity in children. With statistical modeling, height and weight development can be examined as separate or connected outcomes, aiding understanding of the phenomenon of growth. As a biological connection between height and weight development can be assumed, their joint modeling is expected to be beneficial. A further advantage of joint modeling is the convenience it offers for Body Mass Index (BMI) prediction. In the thesis, we modeled longitudinal data on children’s heights and weights from the Finlapset register of the Finnish Institute for Health and Welfare (THL). The research aims were to predict the modeled quantities together with the BMI, to interpret the obtained parameters in relation to the phenomenon of growth, and to investigate the impact of municipalities on the growth of children. The dataset’s irregular, register-based nature together with positively skewed, heteroscedastic weight distributions and within- and between-subject variability suggested Hierarchical Linear Models (HLMs) as the modeling method of choice. We used HLMs in a Bayesian setting, with the benefits of incorporating existing knowledge and obtaining the full posterior predictive distribution for the outcome variables. HLMs were compared with the less suitable classical linear regression model, and bivariate and univariate HLMs with or without area as a covariate were compared in terms of their posterior predictive precision and accuracy. One of the main research questions was the model’s ability to predict the BMI of the child, which we assessed with various posterior predictive checks (PPC). The most suitable model was used to estimate growth parameters of 2-6 year old males and females in Vihti, Kirkkonummi and Tuusula. With the parameter estimates, we could compare the growth of males and females, assess the differences of within-subject and between-subject variability on growth and examine the correlation between height and weight development. Based on the work, we could conclude that the bivariate HLM constructed provided the most accurate and precise predictions, especially for the BMI. The area covariates did not provide additional advantage to the models. Overall, Bayesian HLMs are a suitable tool for the register-based dataset of the work, and together with log-transformation of height and weight they can be used to model skewed and heteroscedastic longitudinal data. However, the modeling would ideally require more observations per individual than we had, and proper out-of-sample predictive evaluation would ensure that the current models are not over-fitted with regard to the data. Nevertheless, the built models can already provide insight into contemporary Finnish childhood growth and can be used to simulate and create predictions for future population BMI distributions.
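A minimal sketch of the kind of bivariate hierarchical specification described above, on the log scale used in the thesis; the covariate structure and priors here are illustrative simplifications, not the thesis's exact model:

```latex
% Child i, measurement occasion j, age t_{ij}:
\log h_{ij} = \beta^{(h)}_0 + b^{(h)}_{0i} + \big(\beta^{(h)}_1 + b^{(h)}_{1i}\big)\, t_{ij} + \varepsilon^{(h)}_{ij},
\qquad
\log w_{ij} = \beta^{(w)}_0 + b^{(w)}_{0i} + \big(\beta^{(w)}_1 + b^{(w)}_{1i}\big)\, t_{ij} + \varepsilon^{(w)}_{ij},
% with the child-level random effects jointly Gaussian across the two
% outcomes; this cross-outcome covariance is what lets height information
% sharpen weight (and hence BMI) predictions.
```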
  • Frosti, Miika (2022)
    This thesis treats Dirichlet problems posed on the hyperbolic unit ball of C^2. The aim of the work is to identify, among the solutions of the problem, those functions that are smooth, i.e. infinitely differentiable. To this end, the Dirichlet problems defined on the unit disk and the half-space of R^2 are first described, together with how their solutions are constructed. For the problems on both domains, domain-specific Green's functions are constructed, from which the Poisson kernel is derived. This kernel yields a smooth solution to the Dirichlet problem. After this, the hyperbolic unit ball of C^2 is introduced, along with the ways in which Dirichlet problems defined on it differ from the problems on the unit ball of R^2. Most significant for the topic is the difference in the properties of the Euclidean and the hyperbolic Laplace-Beltrami operators. Once the most important differences have been established, it can be proved that the function defined by the Poisson-Szegő kernel solves the Dirichlet problem. It is, however, possible to show by example that the solutions are not necessarily smooth. To single out the smooth functions among these solutions, spherical harmonics must be employed; their most important features are described both in real space and in complex space. With these functions and hypergeometric functions, a new form of the Poisson-Szegő kernel can be defined, from which the final result of the thesis can in turn be derived. That final result is that the solutions of the Dirichlet problems on the unit ball are smooth if and only if the solutions are pluriharmonic.
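For the planar starting point described above, the kernel derived from the Green's function of the unit disk is the classical Poisson kernel; the hyperbolic-ball setting of the thesis replaces it with the Poisson-Szegő kernel:

```latex
% Poisson kernel of the unit disk and the resulting solution of
% \Delta u = 0 in D, u = f on \partial D:
P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2},
\qquad
u(re^{i\theta}) = \frac{1}{2\pi} \int_0^{2\pi} P_r(\theta - t)\, f(e^{it})\, dt .
```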
  • Virri, Maria (2021)
    Bonus-malus systems are used globally to determine insurance premiums of motor liability policy-holders by observing past accident behavior. In these systems, policy-holders move between classes that represent different premiums. The number of accidents is used as an indicator of driving skills or risk. The aim of bonus-malus systems is to assign premiums that correspond to risks by increasing premiums of policy-holders that have reported accidents and awarding discounts to those who have not. Many types of bonus-malus systems are used and there is no consensus about what the optimal system looks like. Different tools can be utilized to measure the optimality, which is defined differently according to each tool. The purpose of this thesis is to examine one of these tools, elasticity. Elasticity aims to evaluate how well a given bonus-malus system achieves its goal of assigning premiums fairly according to the policy-holders’ risks by measuring the response of the premiums to changes in the number of accidents. Bonus-malus systems can be mathematically modeled using stochastic processes called Markov chains, and accident behavior can be modeled using Poisson distributions. These two concepts of probability theory and their properties are introduced and applied to bonus-malus systems in the beginning of this thesis. Two types of elasticities are then discussed. Asymptotic elasticity is defined using Markov chain properties, while transient elasticity is based on a concept called the discounted expectation of payments. It is shown how elasticity can be interpreted as a measure of optimality. We will observe that it is typically impossible to have an optimal bonus-malus system for all policy-holders when optimality is measured using elasticity. Some policy-holders will inevitably subsidize other policy-holders by paying premiums that are unfairly large. More specifically, it will be shown that, for bonus-malus systems with certain elasticity values, lower-risk policy-holders will subsidize the higher-risk ones. Lastly, a method is devised to calculate the elasticity of a given bonus-malus system using the programming language R. This method is then used to find the elasticities of five Finnish bonus-malus systems in order to evaluate and compare them.
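A sketch of the computational recipe described above, in Python rather than the thesis's R, for a made-up four-class system: build the transition matrix from Poisson claim counts, find the stationary distribution, and differentiate the stationary mean premium numerically to obtain an asymptotic elasticity. All system details here are illustrative, not one of the five Finnish systems studied:

```python
import numpy as np
from scipy.stats import poisson

premiums = np.array([1.0, 0.8, 0.6, 0.5])       # toy 4-class scale, class 0 worst

def transition_matrix(lam, n_classes=4, penalty=2):
    """Toy bonus-malus rule: one class up after a claim-free year,
    `penalty` classes down per claim (clamped to the scale)."""
    P = np.zeros((n_classes, n_classes))
    for i in range(n_classes):
        for k in range(10):                      # truncate claim counts
            p = poisson.pmf(k, lam)
            j = min(i + 1, n_classes - 1) if k == 0 else max(i - penalty * k, 0)
            P[i, j] += p
        P[i] /= P[i].sum()                       # renormalise the truncation
    return P

def mean_premium(lam):
    P = transition_matrix(lam)
    # stationary distribution: left eigenvector of P for eigenvalue 1
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi /= pi.sum()
    return pi @ premiums

lam, h = 0.1, 1e-5
# asymptotic elasticity as a log-derivative, by central differences
eta = lam * (mean_premium(lam + h) - mean_premium(lam - h)) / (2 * h * mean_premium(lam))
print(eta)
```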
  • Heikkuri, Vesa-Matti (2022)
    This thesis studies equilibrium in a continuous-time overlapping generations (OLG) model. OLG models are used in economics to study the effect of demographics and life-cycle behavior on macroeconomic variables such as the interest rate and aggregate investment. These models are typically set in discrete time, but continuous-time versions have also received attention recently for their desirable properties. Competitive equilibrium in a continuous-time OLG model can be represented as a solution to an integral equation. This equation is linear in the special case of a logarithmic utility function. This thesis provides the necessary and sufficient conditions under which the linear equation is a convolution-type integral equation and derives a distributional solution using the Fourier transform. We also show that the operator norm of the integral operator is not generally less than one. Hence, the equation cannot be solved using a Neumann series. However, in a special case the distributional solution is characterized by a geometric series on the Fourier side when the operator norm is equal to one.
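The structure referred to above can be made concrete with the standard convolution identity (a schematic form; the thesis works distributionally and specifies the exact spaces):

```latex
% A convolution-type linear integral equation and its formal Fourier solution:
\varphi(t) = f(t) + \int_{\mathbb{R}} k(t - s)\, \varphi(s)\, ds
\quad \Longrightarrow \quad
\hat{\varphi}(\xi) = \frac{\hat{f}(\xi)}{1 - \hat{k}(\xi)} ,
% while the Neumann series \sum_n K^n f converges only when the operator
% norm of K is strictly below one, which is exactly what fails here.
```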
  • Lahdensuo, Sofia (2022)
    The Finnish Customs collects and maintains the statistics of the Finnish intra-EU trade with the Intrastat system. Companies with significant intra-EU trade are obligated to give monthly Intrastat declarations, and the statistics of the Finnish intra-EU trade are compiled based on the information collected with the declarations. If a company does not give the declaration in time, an estimation method for the missing values is needed. In this thesis we propose an automatic multivariate time series forecasting process for the estimation of the missing Intrastat import and export values. The forecasting is done separately for each company with missing values. For forecasting we use two-dimensional time series models, where one component is the import or export value of the company to be forecast, and the other is the import or export value of the industrial group of the company. To complement the time series forecasting we use forecast combining. Combined forecasts, for example the averages of the obtained forecasts, have been found to perform well in terms of forecast accuracy compared to the forecasts created by individual methods. In the forecasting process we use two multivariate time series models: the Vector Autoregressive (VAR) model, and a specific VAR model called the Vector Error Correction (VEC) model. The choice of the model is based on the stationarity properties of the time series to be modelled. An alternative option to the VEC model is the so-called augmented VAR model, which is an over-fitted VAR model. We use the VEC model and the augmented VAR model together by using the average of the forecasts created with them as the forecast for the missing value. When the usual VAR model is used, only the forecast created by the single model is used. The forecasting process is designed to be as automatic and as fast as possible; therefore the estimation of a time series model for a single company is kept as simple as possible, and only statistical tests that can be applied automatically are used in the model building. We compare the forecast accuracy of the forecasts created with the automatic forecasting process to the forecast accuracy of forecasts created with two simple forecasting methods. For the time series deemed non-stationary, the naïve forecast performs well in terms of forecast accuracy compared to the time-series-model-based forecasts. On the other hand, for the time series deemed stationary, the average over the past 12 months performs well as a forecast compared to the time-series-model-based forecasts. We also consider forecast combinations created by calculating the average of the time-series-model-based forecasts and the simple forecasts. In line with the literature, the forecast combinations perform overall better in terms of forecast accuracy than the forecasts based on the individual models.
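A compact sketch of the VAR leg of the process above plus a simple forecast combination, using statsmodels on simulated data; the lag order, benchmarks, and equal combination weights are illustrative assumptions:

```python
import numpy as np
from statsmodels.tsa.api import VAR

# Columns: (company value, industrial-group value); differenced toward
# stationarity. Illustrative only -- the thesis also uses a VEC model for
# cointegrated series and an over-fitted "augmented" VAR.
rng = np.random.default_rng(1)
data = np.cumsum(rng.standard_normal((120, 2)), axis=0)
diffed = np.diff(data, axis=0)

res = VAR(diffed).fit(2)                       # fixed lag order for the sketch
var_fc = res.forecast(diffed[-res.k_ar:], steps=1)[0, 0]

naive_fc = diffed[-1, 0]                       # naive benchmark
mean12_fc = diffed[-12:, 0].mean()             # 12-month-average benchmark
combined = np.mean([var_fc, naive_fc, mean12_fc])   # simple forecast combination
print(var_fc, combined)
```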
  • Nikkanen, Leo (2022)
    Often in spatial statistics the modelled domain contains physical barriers that can have an impact on how the modelled phenomenon behaves. This barrier can be, for example, land in the case of modelling a fish population, or a road for different animal populations. A commonly used model in spatial statistics is the stationary Gaussian model, because of its modest computational requirements and the relatively easy interpretation of its results. A physical barrier does not have an effect on this type of model unless the barrier is transformed into a variable, but this can cause issues in the polygon selection. In this thesis I discuss how a non-stationary Gaussian model can be deployed in cases where the spatial domain contains physical barriers. This non-stationary model reduces spatial correlation continuously towards zero in areas that are considered a physical barrier. When the correlation is chosen to reduce smoothly to zero, the model is more likely to produce similar output with slightly different polygons. The advantage of the barrier model is that it is as fast to train as the stationary model, because both models can be trained using the finite element method (FEM). With FEM we can solve stochastic partial differential equations (SPDE). This method interprets a continuous random field as a discrete mesh, and the computational requirements increase as the number of nodes in the mesh increases. In order to create the stationary and non-stationary models, I describe the required methods, such as Bayesian statistics, stochastic processes, and covariance functions, in the second chapter. I use these methods to define a spatial random effect model; one commonly used spatial model is the Gaussian latent variable model. At the end of the second chapter, I describe how the barrier model is created and what kinds of requirements this model has. The barrier model is based on the Matérn model, a Gaussian random field which can be represented using the Matérn covariance function. The second chapter ends with a description of how to create the mesh mentioned above and how FEM is used to solve the SPDE. The performance of the stationary and non-stationary Gaussian models is first tested by training both models with simulated data. This simulated data is a random sample from a polygon of Helsinki where the coastline is interpreted as a physical barrier. The results show that the barrier model estimates the true parameters better than the stationary model. The last chapter contains a data analysis of the rat populations in Helsinki. The data contain the number of rat observations in each zip code and a set of covariates. Both models, stationary and non-stationary, are trained with and without covariates, and the best model out of these four was the stationary model with covariates.
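As a small concrete anchor for the covariance-function machinery described above, here is the Matérn covariance with smoothness 3/2, a common choice in the SPDE/FEM literature; whether the thesis uses this exact smoothness is not stated in the abstract:

```python
import numpy as np

def matern32(d, sigma=1.0, ell=1.0):
    """Matern covariance with smoothness 3/2; d is a matrix of
    pairwise distances, ell the range, sigma the marginal sd."""
    s = np.sqrt(3.0) * d / ell
    return sigma**2 * (1.0 + s) * np.exp(-s)

# In the barrier model the effective range ell is shrunk smoothly towards
# zero inside barrier polygons, so correlation cannot "leak" across land.
x = np.linspace(0, 5, 6)
D = np.abs(x[:, None] - x[None, :])
print(matern32(D).round(3))
```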
  • Sohkanen, Pekka (2021)
    The fields of insurance and financial mathematics require increasingly intricate descriptors of dependency. In the realm of financial mathematics, this demand arises from globalisation effects over the past decade, which have caused financial asset returns to exhibit increasingly intricate dependencies between each other. Of particular interest are measurements describing the probabilities of simultaneous occurrences between unusually negative stock returns. In insurance mathematics, the ability to evaluate probabilities associated with the simultaneous occurrence of unusually large claim amounts can be crucial for both the solvency and the competitiveness of an insurance company. These sorts of dependencies are referred to by the term tail dependence. In this thesis, we introduce the concept of tail dependence and the tail dependence coefficient, a tool for determining the amount of tail dependence between random variables. We also present statistical estimators for the tail dependence coefficient. Favourable properties of these estimators are investigated and a simulation study is executed in order to evaluate and compare estimator performance under a variety of distributions. Some necessary stochastics concepts are presented. Mathematical models of dependence are introduced. Elementary notions of extreme value theory and empirical processes are touched on. These motivate the presented estimators and facilitate the proofs of their favourable properties.
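A minimal sketch of the kind of nonparametric tail dependence coefficient estimator discussed above; the choice of k and the simulated data are illustrative:

```python
import numpy as np

def upper_tail_dep(x, y, k):
    """Estimate the upper tail dependence coefficient
    lambda_U = lim_{u->1} P(Y > F_Y^{-1}(u) | X > F_X^{-1}(u))
    using joint exceedances of the (n-k)th order statistics."""
    n = len(x)
    xt = x > np.sort(x)[n - k - 1]
    yt = y > np.sort(y)[n - k - 1]
    return np.sum(xt & yt) / k

rng = np.random.default_rng(2)
z = rng.standard_normal(10_000)
x = z + 0.5 * rng.standard_normal(10_000)   # correlated, but Gaussian-type
y = z + 0.5 * rng.standard_normal(10_000)   # pairs are asymptotically tail-independent
print(upper_tail_dep(x, y, k=200))
```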
  • Patieva, Fatima (2023)
    In this thesis, we study epidemic models such as SIR and superinfection to demonstrate the coexistence as well as the competitive exclusion of all but one strain. We show that the strain that can keep its position under the worst environmental conditions cannot be invaded by any other strain in some models with a constant death rate. Otherwise, the optimization principle does not necessarily work. Nevertheless, Ackleh and Allen proved that in the SIR model with a density-dependent mortality rate and total cross-immunity, the strain with the largest basic reproduction number is the winner in competitive exclusion. However, it must be taken into account that the conditions on the parameters used for the proof are sufficient but not necessary to exclude the coexistence of different pathogen strains. We show that the method can be applied to both density-dependent and frequency-dependent transmission incidence. In the latter half, we link the between-host and within-host models and expand the nested model to allow for superinfection. The introduction of the basic notions of adaptive dynamics contributes to simplifying our task of demonstrating the evolutionary branching leading to diverging dimorphism. The precise conclusions about the outcome of evolution will depend on the host demography as well as on the class of superinfection and the shape of the transmission functions.
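A small simulation sketch of the competitive-exclusion outcome described above, for an illustrative two-strain SIR with constant birth and death rate mu and total cross-immunity; the parameter values are made up, while the thesis treats the general conditions:

```python
import numpy as np
from scipy.integrate import solve_ivp

mu, gamma = 0.02, (0.10, 0.10)
beta = (0.30, 0.24)          # R0_i = beta_i / (mu + gamma_i): 2.5 vs 2.0

def rhs(t, u):
    S, I1, I2, R = u
    new1, new2 = beta[0] * S * I1, beta[1] * S * I2
    return [mu - new1 - new2 - mu * S,          # births replace deaths
            new1 - (gamma[0] + mu) * I1,
            new2 - (gamma[1] + mu) * I2,
            gamma[0] * I1 + gamma[1] * I2 - mu * R]

sol = solve_ivp(rhs, (0, 2000), [0.98, 0.01, 0.01, 0.0], rtol=1e-8)
print(sol.y[1, -1], sol.y[2, -1])   # strain 1 persists, strain 2 dies out
```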
  • Koutsompinas, Ioannis Jr (2021)
    In this thesis we study extension results related to compact bilinear operators in the setting of interpolation theory, more specifically the complex interpolation method as introduced by Calderón. We say that: 1. the bilinear operator T is compact if it maps bounded sets to sets of compact closure; 2. \bar{A} = (A_0, A_1) is a Banach couple if A_0, A_1 are Banach spaces that are continuously embedded in the same Hausdorff topological vector space. Moreover, if (Ω, \mathcal{A}, μ) is a σ-finite measure space, we say that: 3. E is a Banach function space if E is a Banach space of scalar-valued functions defined on Ω that are finite μ-a.e. and whose norm is related to the measure μ in an appropriate way; 4. the Banach function space E has absolutely continuous norm if for any function f ∈ E and any sequence (Γ_n)_{n=1}^{∞} ⊂ \mathcal{A} satisfying χ_{Γ_n} → 0 μ-a.e. we have ∥f · χ_{Γ_n}∥_E → 0. Assume that \bar{A} and \bar{B} are Banach couples, \bar{E} is a couple of Banach function spaces on Ω, θ ∈ (0, 1) and E_0 has absolutely continuous norm. If the bilinear operator T : (A_0 ∩ A_1) × (B_0 ∩ B_1) → E_0 ∩ E_1 satisfies a certain boundedness assumption and T : \tilde{A}_0 × \tilde{B}_0 → E_0 compactly, we show that T may be uniquely extended to a compact bilinear operator T : [A_0, A_1]_θ × [B_0, B_1]_θ → [E_0, E_1]_θ, where \tilde{A}_j denotes the closure of A_0 ∩ A_1 in A_j and [A_0, A_1]_θ denotes the complex interpolation space generated by \bar{A}. The proof of this result comes after we study the case where the couple of Banach function spaces is replaced by a single Banach space.
  • Malila, Saara (2024)
    The presence of 1/f type noise in a variety of natural processes and human cognition is a well-established fact, and methods of analysing it are many. Fractal analysis of time series data has long been subject to limitations due to the inaccuracy of results for small datasets and finite data. The development of artificial intelligence and machine learning algorithms over recent years has opened the door to modeling and forecasting such phenomena, even those we do not yet have a complete understanding of. In this thesis principal component analysis is used to detect 1/f noise patterns in human-played drum beats typical of a style of playing. In the future, this type of analysis could be used to construct drum machines that mimic the fluctuations in timing associated with a certain characteristic of human-played music such as genre, era, or musician. In this study the link between 1/f-noisy patterns of fluctuations in timing and the technical skill level of the musician is researched. Samples of isolated drum tracks are collected and split into two groups representing either a low or a high level of technical skill. Time series vectors are then constructed by hand to depict the actual timing of the human-played beats. Difference vectors are then created for analysis by using the least-squares method to find the corresponding "perfect" beat and subtracting it from the collected data. These resulting data illustrate the deviation of the actual playing from the beat according to a metronome. A principal component analysis algorithm is then run on the power spectra of the difference vectors to detect points of correlation within different subsets of the data, with the focus being on the two groups mentioned earlier. Finally, we attempt to fit a 1/f noise model to the principal component scores of the power spectra. The results of the study support our hypothesis, but their interpretation on this scale appears subjective. We find that the principal component of the power spectra of the more skilled musicians' samples can be approximated by the function $S=1/f^{\alpha}$ with $\alpha\in(0,2)$, which is indicative of fractal noise. Although the less skilled group's samples do not appear to contain 1/f-noisy fluctuations, its subsets do quite consistently. The opposite is true for the first-mentioned dataset. All in all, we find that a much larger dataset is required to construct a reliable model of human error in recorded music, but with the small amount of data in this study we show that we can indeed detect and isolate rhythmic characteristics defining a certain style of playing drums.
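A minimal sketch of the spectral-exponent estimation underlying the analysis above: estimate alpha in S(f) ∝ 1/f^alpha by a least-squares line in log-log coordinates of the periodogram. The PCA step of the thesis is omitted; white and Brownian noise serve as sanity checks:

```python
import numpy as np

def spectral_exponent(x, fs=1.0):
    """Least-squares estimate of alpha in S(f) ~ 1/f^alpha from the
    periodogram of a timing-deviation series x (DC bin excluded)."""
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)[1:]
    power = np.abs(np.fft.rfft(x - x.mean()))[1:] ** 2 / n
    slope, _ = np.polyfit(np.log(freqs), np.log(power), 1)
    return -slope                    # log S = -alpha log f + const

rng = np.random.default_rng(3)
white = rng.standard_normal(4096)    # expect alpha ~ 0
brown = np.cumsum(white)             # expect alpha ~ 2
print(spectral_exponent(white), spectral_exponent(brown))
```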
  • Litmanen, Jenna (2023)
    The aim of this thesis is to present the Fukushima decomposition, which can be used as a generalization of Itô's lemma. The first chapter covers the foundations of stochastic analysis. The thesis proceeds from the basics of stochastic analysis to Markov processes and related concepts. The notion of an additive functional is introduced, together with how it relates to the processes under consideration; for martingales, the basic concepts are reviewed. After this, Itô's lemma and its proof are treated. Itô's lemma is an important tool in economics, especially when working with asset prices and stock markets, as it provides the foundation for how asset prices can be defined in terms of Brownian motion. The same chapter also covers other useful tools of stochastic analysis. One such tool is the Doob-Meyer decomposition for martingales and predictable processes; this decomposition is an important tool when moving to a higher level with stochastic equations. The first chapter closes with Sobolev spaces, Dirichlet spaces and Dirichlet forms, preparing the reader for the next chapter, which treats one of the main theorems of the thesis. The second chapter deals with the energy of additive functionals and martingale additive functionals and with Radon measures. After these, the generalization of Itô's lemma is taken up. The point of the generalization is that a "weaker" version of the theorem can be used, so that the strongest conditions and assumptions need not all hold. This is important, because Itô's lemma requires twice continuous differentiability, which is far from always satisfied for stochastic processes; the benefits of Itô's lemma can thus be obtained under lighter conditions. Finally the Fukushima decomposition is treated, which is practical for processes that are semimartingales. With the Fukushima decomposition one can handle cases where the assumptions of the previously treated theorems are not satisfied. The Fukushima decomposition is constructed with the help of the theorem presented earlier.
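For contrast with the generalization discussed above, the classical Itô formula and the shape of the Fukushima decomposition can be written side by side (schematically; the thesis states the exact conditions):

```latex
% It\^o's lemma, f \in C^2, X a semimartingale:
f(X_t) = f(X_0) + \int_0^t f'(X_s)\, dX_s
       + \tfrac{1}{2} \int_0^t f''(X_s)\, d\langle X \rangle_s ,
% Fukushima decomposition, u in the Dirichlet space (no C^2 assumption):
u(X_t) - u(X_0) = M^{[u]}_t + N^{[u]}_t ,
% with M^{[u]} a martingale additive functional of finite energy and
% N^{[u]} a continuous additive functional of zero energy.
```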