Browsing by master's degree program "Matematiikan ja tilastotieteen maisteriohjelma"

  • Takanen, Emilia (2023)
    This thesis proves the Deligne–Mumford compactification theorem. The theorem states that, under certain conditions, for a sequence of hyperbolic surfaces of the same signature there exists a limit surface together with diffeomorphisms to every member of the sequence, such that the metrics pulled back by these diffeomorphisms converge on the limit surface to a limit metric. The theorem is proved by first establishing the corresponding result for simple building blocks of hyperbolic surfaces: the result is first shown for the canonical collars of a Y-piece and then used to prove the corresponding result for Y-pieces. Finally, the thesis shows that every hyperbolic surface can be assembled from Y-pieces, which immediately yields the result for all hyperbolic surfaces.
  • Särkijärvi, Joona (2023)
    Both descriptive combinatorics and distributed algorithms are interested in solving graph problems with certain local constraints. This connection is not just superficial, as Bernshteyn showed in his seminal 2020 paper. This thesis focuses on that connection by restating Bernshteyn's results, showing that a common theory of locality connects these fields. We also restate the results connecting these findings to continuous dynamics, where Bernshteyn showed that solving a colouring problem on the free part of the subshift 2^Γ is equivalent to the existence of a fast LOCAL algorithm solving the problem on finite sections of the Cayley graph of Γ. We also restate Bernshteyn's result on the continuous version of the Lovász Local Lemma (LLL). The LLL is a powerful probabilistic tool used throughout combinatorics and distributed computing; Bernshteyn proved a version of the lemma that, under certain topological constraints, produces continuous solutions.
  • Kauppala, Tuuli (2021)
    Children’s height and weight development remains a subject of interest, especially due to the increasing prevalence of overweight and obesity in children. With statistical modeling, height and weight development can be examined as separate or connected outcomes, aiding understanding of the phenomenon of growth. As a biological connection between height and weight development can be assumed, their joint modeling is expected to be beneficial; a further advantage of joint modeling is the convenience of Body Mass Index (BMI) prediction. In this thesis, we modeled longitudinal data on children’s heights and weights from the Finlapset register of the Finnish Institute for Health and Welfare (THL). The research aims were to predict the modeled quantities together with the BMI, to interpret the obtained parameters in relation to the phenomenon of growth, and to investigate the impact of municipalities on the growth of children. The dataset’s irregular, register-based nature, together with positively skewed, heteroscedastic weight distributions and within- and between-subject variability, suggested Hierarchical Linear Models (HLMs) as the modeling method of choice. We used HLMs in a Bayesian setting, with the benefits of incorporating existing knowledge and obtaining a full posterior predictive distribution for the outcome variables. The HLMs were compared with the less suitable classical linear regression model, and bivariate and univariate HLMs with or without area as a covariate were compared in terms of their posterior predictive precision and accuracy. One of the main research questions was the model’s ability to predict the BMI of a child, which we assessed with various posterior predictive checks (PPC). The most suitable model was used to estimate growth parameters of 2–6-year-old males and females in Vihti, Kirkkonummi and Tuusula. With the parameter estimates, we could compare the growth of males and females, assess the contributions of within-subject and between-subject variability to growth, and examine the correlation between height and weight development. Based on the work, we could conclude that the bivariate HLM constructed provided the most accurate and precise predictions, especially for the BMI. The area covariates did not provide additional advantage to the models. Overall, Bayesian HLMs are a suitable tool for the register-based dataset of this work, and together with log-transformation of height and weight they can be used to model skewed and heteroscedastic longitudinal data. However, the modeling would ideally require more observations per individual than we had, and proper out-of-sample predictive evaluation would ensure that the current models are not over-fitted to the data. Nevertheless, the models can already provide insight into contemporary Finnish childhood growth and can be used to simulate and predict future population BMI distributions.
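
    To make the modeling approach concrete, the following is a minimal Python sketch of a Bayesian hierarchical growth model with child-level random intercepts on the log scale, written with the PyMC library; the priors, parameter values and synthetic data are illustrative assumptions, not the thesis's actual Finlapset model.

```python
import numpy as np
import pymc as pm

# Synthetic register-style data: a few weight measurements per child.
rng = np.random.default_rng(0)
n_children, n_obs = 50, 4
child = np.repeat(np.arange(n_children), n_obs)
age = rng.uniform(2.0, 6.0, size=n_children * n_obs)           # years
true_b = rng.normal(0.0, 0.10, size=n_children)                # child effects
log_w = 2.3 + 0.12 * age + true_b[child] + rng.normal(0.0, 0.05, size=age.size)

with pm.Model() as growth_model:
    alpha = pm.Normal("alpha", 2.0, 1.0)          # population intercept
    beta = pm.Normal("beta", 0.0, 0.5)            # population age slope
    sigma_b = pm.HalfNormal("sigma_b", 0.5)       # between-child variability
    b = pm.Normal("b", 0.0, sigma_b, shape=n_children)
    sigma = pm.HalfNormal("sigma", 0.5)           # within-child variability
    mu = alpha + beta * age + b[child]
    pm.Normal("log_weight", mu, sigma, observed=log_w)
    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```

    Modeling on the log scale, as in the thesis, handles the positive skew and heteroscedasticity of the raw weights.
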
  • Frosti, Miika (2022)
    This thesis treats Dirichlet problems posed on the hyperbolic unit ball of C^2. The aim is to single out, among the solutions of the problem, the functions that are smooth, i.e. infinitely differentiable. To this end, the Dirichlet problems defined on the unit disc and the half-space of R^2 are first described, together with how to construct their solutions. For each of the two domains a domain-specific Green's function is constructed, from which the Poisson kernel is derived; this kernel yields a smooth solution to the Dirichlet problem. After this, the hyperbolic unit ball of C^2 is introduced, along with how the Dirichlet problems defined on it differ from those on the unit ball of R^2. The most significant difference for this topic lies in the properties of the Euclidean versus the hyperbolic Laplace–Beltrami operator. Once the key differences have been established, it can be proved that the function defined by means of the Poisson–Szegő kernel solves the Dirichlet problem. An example shows, however, that these solutions are not necessarily smooth. To separate the smooth functions among these solutions, spherical harmonics must be employed; their main features are described both in real and in complex space. With these functions and hypergeometric functions, a new form of the Poisson–Szegő kernel can be defined, from which the final result of the thesis is derived: the solutions of the Dirichlet problems on the unit ball are smooth if and only if they are pluriharmonic.
  • Virri, Maria (2021)
    Bonus-malus systems are used globally to determine the insurance premiums of motor liability policy-holders by observing past accident behavior. In these systems, policy-holders move between classes that represent different premiums. The number of accidents is used as an indicator of driving skill or risk. The aim of bonus-malus systems is to assign premiums that correspond to risks by increasing the premiums of policy-holders who have reported accidents and awarding discounts to those who have not. Many types of bonus-malus systems are used, and there is no consensus about what the optimal system looks like. Different tools can be utilized to measure optimality, which is defined differently by each tool. The purpose of this thesis is to examine one of these tools, elasticity. Elasticity aims to evaluate how well a given bonus-malus system achieves its goal of assigning premiums fairly according to the policy-holders’ risks, by measuring the response of the premiums to changes in the number of accidents. Bonus-malus systems can be mathematically modeled using stochastic processes called Markov chains, and accident behavior can be modeled using Poisson distributions. These two concepts of probability theory and their properties are introduced and applied to bonus-malus systems in the beginning of this thesis. Two types of elasticities are then discussed. Asymptotic elasticity is defined using Markov chain properties, while transient elasticity is based on a concept called the discounted expectation of payments. It is shown how elasticity can be interpreted as a measure of optimality. We observe that it is typically impossible to have an optimal bonus-malus system for all policy-holders when optimality is measured using elasticity. Some policy-holders will inevitably subsidize others by paying premiums that are unfairly large. More specifically, it is shown that, for bonus-malus systems with certain elasticity values, lower-risk policy-holders will subsidize the higher-risk ones. Lastly, a method is devised to calculate the elasticity of a given bonus-malus system using the programming language R. This method is then used to find the elasticities of five Finnish bonus-malus systems in order to evaluate and compare them.
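
    As a concrete illustration of asymptotic elasticity, here is a minimal Python sketch (the thesis itself uses R) for a hypothetical four-class system: it builds the Poisson-driven transition matrix, finds the stationary distribution, and differentiates the stationary mean premium with respect to the claim frequency. The transition rules and premiums are invented for illustration.

```python
import numpy as np
from scipy.stats import poisson

# Hypothetical 4-class system: class 0 is the worst (most expensive),
# a claim-free year moves the policy-holder one class up (towards class 3),
# and each claim moves the policy-holder two classes down.
premiums = np.array([100.0, 80.0, 60.0, 50.0])

def transition_matrix(lam, n=4, step_down=2):
    P = np.zeros((n, n))
    for i in range(n):
        for k in range(10):                       # claim counts 0..9
            j = min(i + 1, n - 1) if k == 0 else max(i - step_down * k, 0)
            P[i, j] += poisson.pmf(k, lam)
        P[i, 0] += 1.0 - P[i].sum()               # remaining tail mass
    return P

def mean_premium(lam):
    P = transition_matrix(lam)
    w, v = np.linalg.eig(P.T)                     # stationary distribution
    pi = np.real(v[:, np.argmax(np.real(w))])
    return (pi / pi.sum()) @ premiums

def asymptotic_elasticity(lam, h=1e-5):
    # d log(mean premium) / d log(claim frequency), by central differences
    dP = (mean_premium(lam + h) - mean_premium(lam - h)) / (2 * h)
    return lam * dP / mean_premium(lam)

print(asymptotic_elasticity(0.1))
```
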
  • Heikkuri, Vesa-Matti (2022)
    This thesis studies equilibrium in a continuous-time overlapping generations (OLG) model. OLG models are used in economics to study the effect of demographics and life-cycle behavior on macroeconomic variables such as the interest rate and aggregate investment. These models are typically set in discrete time, but continuous-time versions have recently received attention for their desirable properties. Competitive equilibrium in a continuous-time OLG model can be represented as a solution to an integral equation, which is linear in the special case of a logarithmic utility function. This thesis provides necessary and sufficient conditions under which the linear equation is a convolution-type integral equation and derives a distributional solution using the Fourier transform. We also show that the operator norm of the integral operator is not in general less than one; hence, the equation cannot be solved using a Neumann series. However, in a special case where the operator norm equals one, the distributional solution is characterized by a geometric series on the Fourier side.
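
    Schematically, the Fourier-transform step works as below; the notation and the specific form of the equation are illustrative, not the thesis's exact equilibrium equation.

```latex
% For a convolution-type linear integral equation
\[
  u(t) - \int_{-\infty}^{\infty} k(t-s)\, u(s)\, ds = f(t),
\]
% the Fourier transform turns convolution into multiplication,
\[
  \hat{u}(\xi)\,\bigl(1 - \hat{k}(\xi)\bigr) = \hat{f}(\xi)
  \quad\Longrightarrow\quad
  \hat{u}(\xi) = \frac{\hat{f}(\xi)}{1 - \hat{k}(\xi)},
\]
% interpreted distributionally where 1 - \hat{k} has zeros. A Neumann
% series \sum_n K^n f requires operator norm \|K\| < 1, which is why it
% is unavailable when \|K\| = 1.
```
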
  • Lahdensuo, Sofia (2022)
    The Finnish Customs collects and maintains the statistics on Finland's intra-EU trade with the Intrastat system. Companies with significant intra-EU trade are obligated to give monthly Intrastat declarations, and the statistics on Finnish intra-EU trade are compiled from the information collected with the declarations. If a company does not submit its declaration in time, the missing values must be estimated. In this thesis we propose an automatic multivariate time series forecasting process for estimating the missing Intrastat import and export values. The forecasting is done separately for each company with missing values. For forecasting we use two-dimensional time series models in which one component is the import or export value of the company to be forecast and the other is the import or export value of the company's industrial group. To complement the time series forecasting we use forecast combining: combined forecasts, for example averages of the obtained forecasts, have been found to compare well in forecast accuracy with forecasts created by individual methods. In the forecasting process we use two multivariate time series models: the Vector Autoregressive (VAR) model and a special form of it, the Vector Error Correction (VEC) model. The choice of model is based on the stationarity properties of the time series to be modelled. An alternative to the VEC model is the so-called augmented VAR model, which is an over-fitted VAR model. We use the VEC model and the augmented VAR model together, taking the average of the forecasts they produce as the forecast for the missing value; when the usual VAR model is used, only the forecast created by that single model is used. The forecasting process is designed to be as automatic and as fast as possible, so the estimation of a time series model for a single company is kept as simple as possible, and only statistical tests that can be applied automatically are used in the model building. We compare the forecast accuracy of the automatic forecasting process to that of two simple forecasting methods. For the time series deemed non-stationary, the naïve forecast performs well in terms of forecast accuracy compared to the time-series-model-based forecasts; for the series deemed stationary, the average over the past 12 months performs well as a forecast. We also consider forecast combinations created by averaging the model-based and the simple forecasts. In line with the literature, the forecast combinations perform better overall in terms of forecast accuracy than the forecasts based on the individual models.
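
    The core model-plus-combination step can be sketched as follows in Python with statsmodels; the synthetic company and industry-group series are stand-ins for the confidential Intrastat data, and the fixed lag order is an illustrative choice.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Synthetic stand-in for one company's monthly series and its industrial
# group's series.
rng = np.random.default_rng(1)
n = 60
group = 100 + np.cumsum(rng.normal(0, 1, n))
company = 0.5 * group + rng.normal(0, 1, n)
data = pd.DataFrame({"company": company, "group": group}).diff().dropna()

res = VAR(data).fit(2)                            # bivariate VAR(2)
var_fc = res.forecast(data.values[-res.k_ar:], steps=1)[0, 0]

bench = data["company"].iloc[-12:].mean()         # 12-month-average benchmark
combined = 0.5 * (var_fc + bench)                 # simple forecast combination
print(var_fc, bench, combined)
```
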
  • Nikkanen, Leo (2022)
    Often in spatial statistics the modelled domain contains physical barriers that can affect how the modelled phenomenon behaves. The barrier can be, for example, land when modelling a fish population, or a road for different animal populations. A common model in spatial statistics is the stationary Gaussian model, because of its modest computational requirements and the relatively easy interpretation of its results. A physical barrier has no effect on this type of model unless the barrier is turned into a variable, but that can cause issues in the polygon selection. In this thesis I discuss how a non-stationary Gaussian model can be deployed when the spatial domain contains physical barriers. This non-stationary model reduces spatial correlation continuously towards zero in areas that are considered a physical barrier. When the correlation is chosen to decrease smoothly to zero, the model is more likely to produce similar output with slightly different polygons. The advantage of the barrier model is that it is as fast to train as the stationary model, because both models can be trained using the finite element method (FEM), with which we can solve stochastic partial differential equations (SPDEs). This method interprets the continuous random field as a discrete mesh, and the computational requirements increase as the number of nodes in the mesh increases. In order to build the stationary and non-stationary models, I describe the required methods, such as Bayesian statistics, stochastic processes and covariance functions, in the second chapter. I use these methods to define a spatial random effect model; one commonly used spatial model is the Gaussian latent variable model. At the end of the second chapter, I describe how the barrier model is created and what requirements it has. The barrier model is based on the Matérn model, a Gaussian random field that can be represented using the Matérn covariance function. The second chapter ends with a description of how to create the mesh mentioned above and how FEM is used to solve the SPDE. The performance of the stationary and non-stationary Gaussian models is first tested by training both models with simulated data. The simulated data are a random sample from a polygon of Helsinki where the coastline is interpreted as a physical barrier. The results show that the barrier model estimates the true parameters better than the stationary model. The last chapter contains a data analysis of the rat populations in Helsinki. The data contain the number of rat observations in each zip code and a set of covariates. Both the stationary and the non-stationary model are trained with and without covariates, and the best of these four models was the stationary model with covariates.
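
    For reference, here is a minimal Python sketch of the stationary Matérn building block mentioned above (smoothness ν = 3/2, a common choice in the SPDE approach); the parameters are illustrative, and the thesis's barrier model additionally forces the correlation range towards zero inside barrier areas.

```python
import numpy as np

def matern32(dist, sigma=1.0, rho=2.0):
    """Matérn covariance with smoothness nu = 3/2."""
    a = np.sqrt(3.0) * dist / rho
    return sigma**2 * (1.0 + a) * np.exp(-a)

# One sample of a stationary Gaussian field on a 1-D transect; the barrier
# model instead lets the range rho shrink towards zero inside barriers,
# cutting correlation across them.
x = np.linspace(0.0, 10.0, 200)
D = np.abs(x[:, None] - x[None, :])               # pairwise distances
K = matern32(D) + 1e-8 * np.eye(x.size)           # jitter for stability
sample = np.random.default_rng(2).multivariate_normal(np.zeros(x.size), K)
```
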
  • Sohkanen, Pekka (2021)
    The fields of insurance and financial mathematics require increasingly intricate descriptors of dependency. In the realm of financial mathematics, this demand arises from globalisation effects over the past decade, which have caused financial asset returns to exhibit increasingly intricate dependencies with one another. Of particular interest are measurements describing the probabilities of simultaneous occurrences of unusually negative stock returns. In insurance mathematics, the ability to evaluate the probabilities of simultaneous occurrences of unusually large claim amounts can be crucial for both the solvency and the competitiveness of an insurance company. These sorts of dependencies are referred to as tail dependence. In this thesis, we introduce the concept of tail dependence and the tail dependence coefficient, a tool for determining the amount of tail dependence between random variables. We also present statistical estimators for the tail dependence coefficient. Favourable properties of these estimators are investigated, and a simulation study is executed in order to evaluate and compare estimator performance under a variety of distributions. Some necessary concepts from stochastics are presented, mathematical models of dependence are introduced, and elementary notions of extreme value theory and empirical processes are touched on; these motivate the presented estimators and facilitate the proofs of their favourable properties.
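
    Below is a minimal Python sketch of the standard nonparametric estimator of the upper tail dependence coefficient, on simulated data; the sample size and threshold k are illustrative choices.

```python
import numpy as np

def upper_tail_dependence(x, y, k):
    """Empirical upper tail dependence: the share of the k largest x-values
    whose paired y-value is also among the k largest y-values."""
    n = len(x)
    rx = np.argsort(np.argsort(x))                # ranks 0..n-1
    ry = np.argsort(np.argsort(y))
    return np.sum((rx >= n - k) & (ry >= n - k)) / k

# A Gaussian copula has no tail dependence, so for large samples the
# estimate should drift towards zero as the threshold moves further out.
rng = np.random.default_rng(3)
z = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=100_000)
print(upper_tail_dependence(z[:, 0], z[:, 1], k=500))
```
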
  • Patieva, Fatima (2023)
    In this thesis, we study epidemic models such as SIR and superinfection models to demonstrate both the coexistence of strains and the competitive exclusion of all but one strain. We show that, in certain models with a constant death rate, the strain that can maintain itself under the worst environmental conditions cannot be invaded by any other strain; otherwise, this optimization principle does not necessarily work. Nevertheless, Ackleh and Allen proved that in the SIR model with a density-dependent mortality rate and total cross-immunity, the strain with the largest basic reproduction number wins the competitive exclusion. However, it must be kept in mind that the conditions on the parameters used in the proof are sufficient but not necessary for excluding the coexistence of different pathogen strains. We show that the method can be applied to both density-dependent and frequency-dependent transmission incidence. In the latter half, we link the between-host and within-host models and expand the nested model to allow for superinfection. Introducing the basic notions of adaptive dynamics simplifies the task of demonstrating evolutionary branching leading to a diverging dimorphism. The precise conclusions about the outcome of evolution depend on the host demography as well as on the class of superinfection and the shape of the transmission functions.
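
    A minimal Python sketch of competitive exclusion in a two-strain SIR model with total cross-immunity and demographic turnover; the parameter values are illustrative assumptions, not taken from the thesis.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Strain basic reproduction numbers are R0_i = beta_i / (gamma_i + mu);
# the strain with the larger R0 is expected to exclude the other.
beta = np.array([0.30, 0.25])                     # transmission rates
gamma = np.array([0.10, 0.10])                    # recovery rates
mu = 0.02                                         # birth/death rate

def rhs(t, y):
    s, i1, i2 = y
    inf1, inf2 = beta[0] * s * i1, beta[1] * s * i2
    return [mu - mu * s - inf1 - inf2,
            inf1 - (gamma[0] + mu) * i1,
            inf2 - (gamma[1] + mu) * i2]

sol = solve_ivp(rhs, (0.0, 2000.0), [0.98, 0.01, 0.01], rtol=1e-8)
print(sol.y[1, -1], sol.y[2, -1])                 # strain 2 (smaller R0) dies out
```
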
  • Koutsompinas, Ioannis Jr (2021)
    In this thesis we study extension results related to compact bilinear operators in the setting of interpolation theory, more specifically the complex interpolation method as introduced by Calderón. We say that: 1. the bilinear operator T is compact if it maps bounded sets to sets of compact closure; 2. \bar{A} = (A_0, A_1) is a Banach couple if A_0, A_1 are Banach spaces that are continuously embedded in the same Hausdorff topological vector space. Moreover, if (Ω, \mathcal{A}, μ) is a σ-finite measure space, we say that: 3. E is a Banach function space if E is a Banach space of scalar-valued functions defined on Ω that are finite μ-a.e. and such that the norm of E is related to the measure μ in an appropriate way; 4. the Banach function space E has absolutely continuous norm if for any function f ∈ E and any sequence (Γ_n)_{n=1}^{+∞} ⊂ \mathcal{A} satisfying χ_{Γ_n} → 0 μ-a.e. we have ∥f · χ_{Γ_n}∥_E → 0. Assume that \bar{A} and \bar{B} are Banach couples, \bar{E} is a couple of Banach function spaces on Ω, θ ∈ (0, 1) and E_0 has absolutely continuous norm. If the bilinear operator T : (A_0 ∩ A_1) × (B_0 ∩ B_1) → E_0 ∩ E_1 satisfies a certain boundedness assumption and T : \tilde{A_0} × \tilde{B_0} → E_0 compactly, we show that T may be uniquely extended to a compact bilinear operator T : [A_0, A_1]_θ × [B_0, B_1]_θ → [E_0, E_1]_θ, where \tilde{A_j} denotes the closure of A_0 ∩ A_1 in A_j and [A_0, A_1]_θ denotes the complex interpolation space generated by \bar{A}. The proof of this result comes after we study the case where the couple of Banach function spaces is replaced by a single Banach space.
  • Litmanen, Jenna (2023)
    This thesis presents the Fukushima decomposition, which can be used as a generalization of Itô's lemma. The first chapter covers the foundations of stochastic analysis and proceeds from there to Markov processes and related concepts. The notion of an additive functional is introduced, together with how it relates to the processes under consideration; for martingales, the basic concepts are covered. After this we move on to Itô's lemma and its proof. Itô's lemma is an important tool in economics, especially when working with asset prices and stock markets, as it underlies how asset prices can be defined in terms of Brownian motion. The same chapter also treats other useful tools of stochastic analysis. One such tool is the Doob–Meyer decomposition for martingales and predictable processes, an important tool when moving to a higher level with stochastic equations. The first chapter ends with Sobolev spaces, Dirichlet spaces and Dirichlet forms, which prepare the reader for the next chapter, where one of the main theorems of the thesis is treated. The second chapter deals with the energy of additive functionals and martingale additive functionals and with Radon measures. After these, we turn to the generalization of Itô's lemma. The generalization rests on the possibility of taking a "weaker" version of the theorem, in which the strongest conditions and assumptions need not all hold. This is important, since Itô's lemma requires twice continuous differentiability, which is far from always satisfied by stochastic processes; the benefits of Itô's lemma can thus be obtained under lighter conditions. Finally, the Fukushima decomposition is treated, which is practical for processes that are semimartingales. With the Fukushima decomposition one can handle cases where the assumptions of the previously treated theorems fail; the decomposition is constructed with the help of the theorem presented earlier.
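
    In standard notation (ours, not necessarily the thesis's), the decomposition takes the following form:

```latex
% Fukushima decomposition. For u in the Dirichlet space F of a symmetric
% Markov process (X_t),
\[
  u(X_t) - u(X_0) = M_t^{[u]} + N_t^{[u]}, \qquad t \ge 0,
\]
% where M^{[u]} is a martingale additive functional of finite energy and
% N^{[u]} is a continuous additive functional of zero energy. When u is
% smooth enough for Ito's lemma to apply, N^{[u]} is the usual
% bounded-variation part of the semimartingale decomposition.
```
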
  • Williams Moreno Sánchez, Bernardo (2022)
    The focus of this work is to sample efficiently from a given target distribution using Markov chain Monte Carlo (MCMC). This work presents the No-U-Turn Sampler Lagrangian Monte Carlo with the Monge metric: an efficient MCMC sampler with an adaptive metric, fast computations, and no need to hand-tune the hyperparameters of the algorithm, since the parameters are adapted automatically by extending the No-U-Turn Sampler (NUTS) to Lagrangian Monte Carlo (LMC). This work begins with an introduction to concepts from differential geometry. The Monge metric is then constructed step by step, carefully derived from the theory of differential geometry, giving a formulation that is not restricted to LMC but is applicable to any problem where a Riemannian metric of the target function comes into play. The main idea of the metric is that it naturally encodes the geometric properties of the manifold obtained from the graph of the function when embedded in a higher-dimensional Euclidean space. Hamiltonian Monte Carlo (HMC) and LMC are MCMC samplers that work on differential-geometric manifolds, and we introduce the LMC sampler as an alternative to HMC. HMC assumes that the metric structure of the manifold, encoded in the Riemannian metric, stays constant, whereas LMC allows the metric to vary with position and is thus able to sample from regions of the target distribution that are problematic for HMC. The choice of metric affects the running time of LMC; by including the Monge metric in LMC, the algorithm becomes computationally faster. By generalizing the No-U-Turn Sampler to LMC, we build the NUTS-LMC algorithm, which is able to estimate the hyperparameters automatically. The NUTS algorithm is constructed with a distance-based stopping criterion, which can be replaced by other stopping criteria. Additionally, we run LMC-Monge and NUTS-LMC for a series of traditionally challenging target distributions, comparing the results with HMC and NUTS-HMC. The main contribution of this work is the extension of NUTS to a generalized NUTS that is applicable to LMC. We find that LMC with the Monge metric explores regions of the target distribution that HMC is unable to reach, and that generalized NUTS eliminates the need to choose the hyperparameters. NUTS-LMC makes the sampler ready to use for scientific applications, since one only needs to specify a twice-differentiable target function, making it user-friendly for someone who does not wish to know the theoretical and technical details beneath the sampler.
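
    For orientation, here is a minimal Python sketch of plain Euclidean-metric HMC, the baseline that LMC and NUTS-LMC extend; the step size, path length and target density are illustrative choices.

```python
import numpy as np

def hmc_step(q, log_prob, grad, eps=0.1, n_leapfrog=20,
             rng=np.random.default_rng()):
    """One HMC transition with an identity mass matrix."""
    p = rng.normal(size=q.shape)                  # resample momentum
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * eps * grad(q_new)              # leapfrog integration
    for _ in range(n_leapfrog - 1):
        q_new += eps * p_new
        p_new += eps * grad(q_new)
    q_new += eps * p_new
    p_new += 0.5 * eps * grad(q_new)
    # Metropolis correction on the Hamiltonian (potential + kinetic) error
    h_old = -log_prob(q) + 0.5 * p @ p
    h_new = -log_prob(q_new) + 0.5 * p_new @ p_new
    return q_new if np.log(rng.uniform()) < h_old - h_new else q

# Example: sample a standard 2-D Gaussian.
log_prob = lambda q: -0.5 * q @ q
grad = lambda q: -q
q, draws = np.zeros(2), []
for _ in range(1000):
    q = hmc_step(q, log_prob, grad)
    draws.append(q)
```
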
  • Kelomäki, Tuomas (2020)
    This thesis provides a proof and some applications of the famous result in topology called the Borsuk–Ulam theorem. The standard formulation of the Borsuk–Ulam theorem states that for every continuous map from an n-sphere to n-dimensional Euclidean space there are antipodal points that map on top of each other. Even though the claim is quite elementary, the Borsuk–Ulam theorem is surprisingly difficult to prove. There are many different kinds of proofs of the Borsuk–Ulam theorem, and nowadays the standard method of proof uses heavy algebraic topology; in this thesis a more elementary, geometric proof is presented. Some fairly fundamental geometric objects are presented at the start: the basics of affine and convex sets, simplices and simplicial complexes are introduced. After that we construct a specific simplicial complex and present a method, iterated barycentric subdivision, to make it finer. In addition to simplicial complexes, the theory we use revolves around general positioning and perturbations; both of these subjects are covered briefly. A major part of our proof of the Borsuk–Ulam theorem is to show that a certain homotopy function F from a specific (n+1)-manifold to n-dimensional Euclidean space can be approximated by another map G. Moreover, this approximation can be done so that the kernel of G is a symmetric 1-manifold. The foundation for approximating F is laid with iterated barycentric subdivision. The approximating function G is obtained by perturbing F on the vertices of the simplicial complex and extending it locally affinely; the perturbation is done so that the image of the vertices is in general position. After proving the Borsuk–Ulam theorem, we present a few applications that show quite nicely how versatile the theorem is. We prove two formulations of the Ham Sandwich theorem, deduce the Lusternik–Schnirelmann theorem from the Borsuk–Ulam theorem and with it calculate the chromatic numbers of the Kneser graphs, and finally prove the Topological Radon theorem.
  • Bazaliy, Viacheslav (2019)
    This thesis provides an analysis of the Growth Optimal Portfolio (GOP) in discrete time. The Growth Optimal Portfolio is a portfolio optimization method that aims to maximize expected long-term growth. One of the main properties of the GOP is that, as the time horizon increases, it outperforms all other trading strategies almost surely; hence, compared with the other common methods of portfolio construction, the GOP performs well in the long term but might provide riskier allocations in the short term. The first half of the thesis considers the GOP from a theoretical perspective. Connections to other concepts (the numeraire portfolio, arbitrage freedom) are examined, and derivations of the optimality properties are given. Several examples where the GOP has explicit solutions are provided, and sufficient and necessary conditions for growth optimality are derived. Yet the main focus of this thesis is on the practical aspects of GOP construction. An iterative algorithm for finding GOP weights is proposed for the case of independently log-normally distributed growth rates of the underlying assets. The algorithm is then extended to the case of a non-diagonal covariance structure and to the case where a risk-free asset is present on the market. Finally, it is shown how the GOP can be implemented as a trading strategy when the underlying assets are modelled by ARMA or VAR models. Simulations with assets from the real market are provided for the period 2014–2019. Overall, a practical step-by-step procedure for constructing GOP strategies with real market data is developed. Given the simplicity of the procedure and the appealing properties of the GOP, it can be used in practice alongside other common models, such as the Markowitz or Black–Litterman models, for constructing portfolios.
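
    A minimal Python sketch of the numerical idea: maximize the sample average of the logarithmic growth rate over the simplex of portfolio weights. The log-normal parameters are invented for illustration, and the optimizer is a generic constrained solver rather than the thesis's iterative algorithm.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic log-normal gross growth rates for three assets.
rng = np.random.default_rng(4)
n_assets, n_scenarios = 3, 20_000
mu = np.array([0.05, 0.03, 0.08])
sd = np.array([0.20, 0.10, 0.35])
growth = rng.lognormal(mu, sd, size=(n_scenarios, n_assets))

def neg_log_growth(w):
    # Negative expected log growth of a long-only, fully invested portfolio
    return -np.mean(np.log(growth @ w))

w0 = np.full(n_assets, 1.0 / n_assets)
res = minimize(
    neg_log_growth, w0,
    bounds=[(0.0, 1.0)] * n_assets,
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
)
print(res.x)                                      # estimated GOP weights
```
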
  • Hellsten, Kirsi (2023)
    Triglycerides are a type of lipid that enters the body with fatty food. High triglyceride levels are often caused by an unhealthy diet, a poor lifestyle, poorly treated diseases such as diabetes, and too little exercise. Other risk factors found in various studies are HIV, menopause, an inherited lipid metabolism disorder and South Asian ancestry. Complications of high triglycerides include pancreatitis, carotid artery disease, coronary artery disease, metabolic syndrome, peripheral artery disease, and strokes. Migration has made Singapore diverse, and it contains several subpopulations: one third of the population has genetic ancestry in China, the second largest group has genetic ancestry in Malaysia, and the third largest in India. Even though Singapore has one of the highest life expectancies in the world, unhealthy lifestyles such as a poor diet, lack of exercise and smoking are still visible in everyday life. The purpose of this thesis was to introduce GWAS analysis for quantitative traits, to apply it to real data, to see whether there are associations between some variants and triglycerides in the three main subpopulations of Singapore, and to compare the results to previous studies. The research questions answered in this thesis are: what is GWAS analysis and what is it used for; how can GWAS be applied to data containing quantitative traits; and are there associations between some SNPs and triglycerides in the three main populations of Singapore. GWAS stands for genome-wide association studies, which are designed to identify statistical associations between genetic variants and phenotypes or traits. One reason for developing GWAS was to learn to identify genetic factors that have an impact on significant phenotypes, for instance susceptibility to certain diseases. Such information can eventually be used to predict the phenotypes of individuals. GWAS have been used globally in, for example, anthropology, biomedicine, biotechnology, and forensics. The studies enhance the understanding of human evolution and natural selection and help advance many areas of biology. The study used several quality control methods, linear models, and Bayesian inference to study associations, and the results were examined with the help of various visual methods, among other things. The dataset used in this thesis is open data used by Saw, W., Tantoso, E., Begum, H. et al. in their previous study. This study showed that there are associations between six different variants and triglycerides in the three main subpopulations of Singapore. The results were compared with those of two previous studies, which differed from the results of this study, suggesting that the results are significant. In addition, the thesis reviewed the ethics of GWAS and its limitations and benefits. Most studies like this have been done in Europe, so more research is needed in different parts of the world; this research can also be continued with different methods and variables.
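
    A minimal Python sketch of the per-SNP association scan underlying a quantitative-trait GWAS, on synthetic genotypes (not the thesis's Singapore dataset): regress the trait on allele dosage one SNP at a time and record the p-value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, m = 2_000, 500
maf = rng.uniform(0.05, 0.5, size=m)              # minor allele frequencies
G = rng.binomial(2, maf, size=(n, m))             # genotype dosages 0/1/2
beta_true = np.zeros(m)
beta_true[42] = 0.3                               # one truly associated SNP
y = G @ beta_true + rng.normal(size=n)            # e.g. log-triglycerides

pvals = np.empty(m)
for j in range(m):                                # one linear model per SNP
    slope, _, _, p, _ = stats.linregress(G[:, j], y)
    pvals[j] = p
print(pvals.argmin(), pvals.min())                # should flag SNP 42
```
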
  • Huynh, Inchuen (2023)
    Hawkes processes are a special class of inhomogeneous Poisson processes used to model events exhibiting interdependencies. Initially introduced in Hawkes [1971], Hawkes processes have since found applications in various fields such as seismology, finance, and criminology. The defining feature of Hawkes processes lies in their ability to capture self-exciting behaviour, where the occurrence of an event increases the risk of experiencing subsequent events. This effect is quantified in their conditional intensity function, which takes the history of the process into account through its kernel. This thesis focuses on the modeling of event histories using Hawkes processes. We define both the univariate and multivariate forms of Hawkes processes and discuss the selection of kernels, which determine whether the process is a jump or a non-jump process. In a jump Hawkes process, the conditional intensity spikes at the occurrence of an event, and the risk of experiencing new events is highest immediately after an event; for non-jump processes, the risk increases more gradually and can be more flexible. Additionally, we explore the choice of baseline intensity and the inclusion of covariates in the conditional intensity of the process. For parameter estimation, we derive the log-likelihood functions and discuss goodness-of-fit methods. We show that by employing the time-rescaling theorem to transform event times, assessing the fit of a Hawkes process reduces to assessing the fit of a unit-rate Poisson process. Finally, we illustrate the application of Hawkes processes by exploring whether an exponential Hawkes process can be used to model occurrences of diabetes-related comorbidities, using data from the Diabetes Register of the Finnish Institute for Health and Welfare (THL). Based on our analysis, the process did not adequately describe the data; however, exploring alternative kernel functions and incorporating time-varying baseline intensities holds potential for improving the fit.
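
    A minimal Python sketch of a univariate exponential (jump) Hawkes process, simulated with Ogata's thinning algorithm; the parameter values are illustrative assumptions.

```python
import numpy as np

# Conditional intensity of the exponential Hawkes process:
#   lambda(t) = mu + alpha * sum_{t_i <= t} exp(-beta * (t - t_i))
mu, alpha, beta = 0.5, 0.8, 1.2                   # stable: alpha / beta < 1

def intensity(t, events):
    past = np.asarray(events)
    if past.size == 0:
        return mu
    return mu + alpha * np.exp(-beta * (t - past[past <= t])).sum()

def simulate(t_max, rng=np.random.default_rng(6)):
    t, events = 0.0, []
    while True:
        lam_bar = intensity(t, events)            # valid bound: the intensity
        t += rng.exponential(1.0 / lam_bar)       # only decays between events
        if t >= t_max:
            return events
        if rng.uniform() <= intensity(t, events) / lam_bar:
            events.append(t)                      # accept candidate event

events = simulate(100.0)
print(len(events))       # long-run rate is mu / (1 - alpha / beta) = 1.5
```
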
  • Schauman, Julia (2022)
    In this thesis, we explore financial risk measures in the context of heavy-tailed distributions. Heavy-tailed distributions and their different classes are defined mathematically in this thesis, but in more general terms, heavy-tailed distributions are distributions with a tail or tails heavier than those of the exponential distribution; in other words, distributions whose tails go to zero more slowly than the exponential distribution's. Heavy-tailed distributions are much more common than we tend to think and can be observed in everyday situations. Most extreme events, such as large floods and other large natural phenomena, are good examples of heavy-tailed phenomena. Nevertheless, we often expect the phenomena surrounding us to be normally distributed, probably because of the beauty and effortlessness of the central limit theorem, which explains why we find the normal distribution all around us in natural phenomena. The normal distribution is a light-tailed distribution, and essentially it assigns less probability to extreme events than a heavy-tailed distribution does. When we don't understand heavy tails, we underestimate the probability of extreme events such as large earthquakes, catastrophic financial losses or major insurance claims. Understanding heavy-tailed distributions also plays a key role when measuring financial risks. In finance, risk measurement is important for all market participants, and using correct assumptions about the distribution of the phenomenon in question ensures good results and appropriate risk management. Value-at-Risk (VaR) and the expected shortfall (ES) are two of the best-known financial risk measures and the focus of this thesis. Both deal with the distribution, and more specifically the tail, of the loss distribution: Value-at-Risk measures the risk of a loss, whereas ES describes the size of a loss exceeding the VaR. Since both risk measures focus on the tail of the distribution, mistaking a heavy-tailed phenomenon for a light-tailed one can lead to drastically wrong conclusions. The mean excess function is an important mathematical concept closely tied to VaR and ES, as the expected shortfall is mathematically a mean excess function. Examined in the context of heavy tails, the mean excess function presents very interesting features and plays a key role in identifying heavy tails. This thesis aims at answering the questions of what heavy-tailed distributions are and why they are so important, especially in the context of risk management and financial risk measures. Chapter 2 provides some key definitions for the reader. In Chapter 3, the different classes of heavy-tailed distributions are defined and described. In Chapter 4, the mean excess function and the closely related hazard rate function are presented. In Chapter 5, risk measures are discussed on a general level, Value-at-Risk and expected shortfall are presented, and the presence of heavy tails in the context of risk measures is explored. Finally, in Chapter 6, simulations on the topics of the previous chapters are shown to shed a more practical light on their presentation.
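
    A minimal Python sketch of empirical VaR, expected shortfall and the mean excess function, comparing a light-tailed and a heavy-tailed sample; the level and thresholds are illustrative choices.

```python
import numpy as np

def var_es(losses, alpha=0.99):
    """Empirical VaR (alpha-quantile of losses) and ES (mean loss beyond VaR)."""
    var = np.quantile(losses, alpha)
    return var, losses[losses > var].mean()

def mean_excess(losses, u):
    """Average amount by which losses exceed the threshold u."""
    return (losses[losses > u] - u).mean()

rng = np.random.default_rng(7)
light = rng.normal(0, 1, 1_000_000)               # light-tailed sample
heavy = rng.pareto(2.5, 1_000_000)                # heavy-tailed sample

for name, x in [("normal", light), ("pareto", heavy)]:
    v, e = var_es(x)
    print(name, round(v, 2), round(e, 2),
          round(mean_excess(x, v), 2), round(mean_excess(x, 1.5 * v), 2))
# The mean excess grows with the threshold for the Pareto sample (heavy
# tail) and shrinks for the normal sample (light tail).
```
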
  • Hanninen, Elsa (2020)
    Estimating the loss on insurance contracts is important for an insurance company's risk management. This thesis presents Hattendorff's theorem for evaluating the expected value and variance of the loss of an insurance contract, and applies its results to a life insurance contract modelled by a multi-state Markov process. By Hattendorff's theorem, the losses arising on disjoint time intervals of a contract priced according to the equivalence principle have expected value zero and are uncorrelated, so the variance of the total loss can be computed as the sum of the variances of the losses on the disjoint intervals. In the applied part of the thesis, Markov processes are simulated in a suitable multi-state model to represent realizations of life insurance contracts. It is examined whether the mean of the yearly losses produced by the simulated paths is close to zero, and whether the variance of the loss over the whole contract period is close to the sum of the variances of the yearly losses. In addition, the theoretical counterparts for the simulation setting are computed using Hattendorff's theorem and compared with the simulated values. An insurance contract roughly comprises two kinds of payments: the claim payments made by the insurer and the premiums paid by the insured. The cash flow of an insurance contract over a time interval is the value, discounted to time zero, of the difference between the claims and premiums occurring on that interval. The reserve is the expected value of the cash flow arising after the evaluation time, discounted to the evaluation time. The loss of an insurance contract on a time interval is defined as the sum of the cash flow of the interval and the change in the value of the reserve. When one defines a stochastic process that at a given time counts the costs accumulated so far plus the present value of the future reserve, the loss can be expressed as the increment of this process. This process is a square-integrable martingale, so the results of Hattendorff's theorem follow from the properties of the increments of square-integrable martingales. Hattendorff's results were found as early as the 1860s, but exploiting martingale theory is a modern approach to the problem. By expressing the costs of a contract modelled by a multi-state Markov process as a Lebesgue–Stieltjes integral, computable forms for the variance of the loss are obtained. For contracts modelled by Markov processes, a special case of Hattendorff's result can be derived in which losses are allocated not only to different years but also to different states. In the applied section it is seen that the expected values of the losses arising in individual contract years are close to zero, and the sum of the sample variances approaches the sample variance of the loss over the whole contract period, in agreement with the claims of Hattendorff's theorem. The simulated sample variances do not fully match their theoretical counterparts.
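
    In schematic form (standard notation, ours), Hattendorff's theorem for the periodic losses L_k of a contract priced by the equivalence principle reads:

```latex
% Let L_k be the loss arising in period k and L = \sum_k L_k the total loss.
\[
  \mathbb{E}[L_k] = 0, \qquad
  \operatorname{Cov}(L_j, L_k) = 0 \;\; (j \neq k), \qquad
  \operatorname{Var}(L) = \sum_k \operatorname{Var}(L_k).
\]
```
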
  • Holopainen, Jonathan (2021)
    Traditionally, a safety margin is added to the pricing factors of life insurance policies. The discount rate is set below the market rate, and a safety margin is added to the mortality: in term life (death benefit) cover the pricing mortality is higher, and in annuity insurance (pension insurance) lower, than the observed mortality. Because life insurance policies are often long-lived, prudence plays a very important role for the profitability of the product and for the solvency of the life insurance company. In many cases the law also requires life insurance companies to price their products prudently, so that even in a bad situation the companies can still secure the benefits of the policy-holders. Life insurance companies have also developed more complex products that may involve several risk factors whose long-run development can be hard to predict. Prudent pricing factors mean that, on average, profit accrues to the insurance company over time. This thesis studies the stochastic properties of the profit or loss accruing to the insurance company. We leave outside the scope of this work the company's investment returns, its operating expenses, and the ways in which insurers distribute surplus to the insured as bonuses. The thesis follows Henrik Ramlau-Hansen's article 'The emergence of profit in life insurance', focusing on the expected value of the total profit, the expected profit attributable to a given state, and the expected profit accrued within a given period; the results are also unpacked so that they are easier to understand. The profit of a life insurance company is defined mathematically using Markov processes. Using the definition, the expected value and standard deviation of the cumulative profit over a given interval are computed. The result is that the expected profit is the sum, over the states of the Markov process, of the differences between the first-order prospective reserve and the second-order retrospective reserve, weighted by the probabilities of being in the state in question at the end of the interval. Finally, the expected profit of a 10-year single-premium term life insurance policy is computed using the results of the thesis. The same policy was also simulated 10,000,000 times, coming very close to the result given by the formula.
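
    Schematically, the expected-profit result described above can be written as follows; the notation is ours, and sign conventions vary with how the retrospective reserve is defined (see Ramlau-Hansen's article for the precise statement):

```latex
% With V_j^{(1)}(t) the first-order prospective reserve, W_j^{(2)}(t) the
% second-order retrospective reserve, and p_{0j}(0,t) the probability of
% being in state j at time t, the expected profit accrued on [0,t] is
\[
  \mathbb{E}\bigl[\text{profit on } [0,t]\bigr]
  = \sum_{j} p_{0j}(0,t)\,\bigl( V_j^{(1)}(t) - W_j^{(2)}(t) \bigr).
\]
```
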