
Browsing by Title


  • Kujala, Tuomas (2012)
    The thesis introduces Markov processes and their application to insurance pricing. The application part studies in particular a three-state disability model and the net single premium of disability insurance. The thesis begins by defining the Markov process and the associated transition intensities. With these we can examine the properties of Markov processes in more detail and present and define properties important for the applications, such as the Kolmogorov differential equations and path probabilities. The aim of the theory part is to construct, from a discrete-time Markov chain and jump times, a stochastic process that is finally shown to be a continuous-time Markov process. Markov processes have many possible uses, but this thesis concentrates on the insurance field and applies them to insurance pricing. One must, however, always be very critical when applying Markov processes, since for a Markov process only the current state is relevant when predicting the future state. 'Forgetting' the history inevitably affects the results obtained; the thesis discusses how realistic the resulting predictions are and how to make use of them. The application part studies the pricing of disability pension insurance via the net single premium. A three-state Markov model is defined, with states 'able to work', 'disabled' and 'dead'. The insured is assumed to be entitled to a continuously paid benefit of one monetary unit while disabled. Using the Kolmogorov differential equations, the net single premium of the insurance is derived in this model in two ways, with constant and with non-constant transition intensities. The size of the net single premium is examined for contracts of different lengths. Finally, the net single premium is simulated in both the constant and the non-constant intensity case.
The ratio of the difference between the simulated and the exact computed values to the exact values is examined. The thesis also briefly presents the theory needed for the simulation and the simulation algorithm used to simulate the Markov processes. The simulation was implemented in Matlab. Familiarity with the basics of probability theory and life insurance mathematics helps in reading the thesis; on the other hand, reading has been made easier with some notation and illustrative figures.
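The Monte Carlo pricing of the three-state disability model described above can be sketched in Python for the constant-intensity case. All intensities and the force of interest below are illustrative assumptions, not values from the thesis, and the thesis itself uses Matlab:

```python
import math
import random

# Hypothetical constant transition intensities (per year).
# States: 0 = able to work, 1 = disabled, 2 = dead.
RATES = {
    0: {1: 0.02, 2: 0.01},   # disablement, death of an active life
    1: {0: 0.05, 2: 0.03},   # recovery, death of a disabled life
}
DELTA = 0.04  # force of interest

def net_single_premium(T, n_paths=2000, seed=1):
    """Monte Carlo estimate of the net single premium of a benefit paid
    continuously at rate 1 while the insured is disabled, over T years."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        t, state, value = 0.0, 0, 0.0
        while t < T and state != 2:
            out = RATES[state]
            rate = sum(out.values())
            end = min(t + rng.expovariate(rate), T)
            if state == 1:  # discounted disability annuity over [t, end)
                value += (math.exp(-DELTA * t) - math.exp(-DELTA * end)) / DELTA
            t = end
            if t < T:       # pick the next state by relative intensity
                u = rng.random() * rate
                for s, r in out.items():
                    u -= r
                    if u <= 0:
                        state = s
                        break
        total += value
    return total / n_paths
```

The sojourn times are exponential with the total outgoing intensity, which is exactly the jump-chain construction of a continuous-time Markov process described in the theory part.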
  • Lehtonen, Anniina (2014)
    The Helsinki region is one of the fastest-growing urban regions in Europe. Its growth and development are highly topical and important for the metropolitan area, for Finland as a whole, and on an international scale. The development of the Helsinki region has been supported through metropolitan policy, which aims to strengthen international competitiveness and balanced development. The comprehensive development of the metropolitan area is seen as significant for the national economy of the whole country. Espoo lies at the core of the metropolitan area, so its growth and development play an essential role in the metropolitan whole. The urban structure of the Helsinki region can be considered strongly dispersed and sparsely populated. A significant theme for this thesis is therefore the frequently stated need to consolidate the urban structure and to draw on residents' local knowledge as part of achieving these goals. The new Land Use and Building Act, which came into force in 2000, seeks to support interactive planning in which 'stakeholders' must have the right and the opportunity to participate in planning. In particular, the possibilities of using geographic information systems as part of participatory and interactive planning and decision-making have attracted growing interest in both the planning and research fields. Methods that make use of spatial data have made it possible to link, for example, residents' experiential knowledge to location data, making it easier to manage and analyse. This also enables residents to participate better in planning and in evaluating their living environment. The study is motivated by the Espoo City Planning Department's wish to develop communication between residents and planners and to find new, workable channels of influence.
The Espoo City Planning Department participates in the Action Programme on eServices and eDemocracy (the SADe programme) coordinated by the Ministry of Finance, as part of which a map-based online survey on the Espoo living environment was commissioned using the Harava service produced by Dimenteq Oy. Based on the survey, the thesis presents residents' opinions on successful and unsuccessful residential areas, on areas they would like to move to, and on possible locations for new housing construction in Espoo. The results are also compared with earlier research on housing preferences, in particular with a survey conducted in 2000, the last time Espoo residents' housing and housing aspirations were studied. The aim is to find out what a good living environment looks like today from the perspective of Espoo residents. The map-based markings are used to find areas where opinions cluster and to describe the markings through the complementary response options linked to them. The data were analysed with the MapInfo GIS software and the Excel spreadsheet program. A surprising feature of the data was the uniformity of both experiences and wishes among the respondents; no clear resident profiles could be identified. The results of the thesis nevertheless clearly highlight the same themes as earlier research: the importance to residents of closeness to nature and of good services and transport connections. On the other hand, instead of detached-house living, the results emphasise an appreciation of urban building and the significance of the image and appearance of residential areas.
  • Pihko, Jekaterina (2017)
    Hyvinkäänkylä pumping station extracts groundwater from a local esker aquifer and supplies drinking water within the Hyvinkää municipality. There have been problems with the aquifer's water quality when surface water from the Vantaa River has mixed with groundwater during the flooding season, and as a result the pumping at the station must be periodically stopped. For more effective groundwater acquisition, management and protection, it is critical to gain a better understanding of the structure of the aquifer. In addition, more knowledge of groundwater-surface water interaction is needed, along with information on the possible routes of groundwater flow, particularly those which most affect the groundwater supply reaching the pumping station. The purpose of this study is to gather the previous research on the area and to create a 3D hydrogeological structural model using the Leapfrog Geo program. The model visualizes the hydrogeological structures and serves as input for a groundwater flow model. The data in this study can be categorized into three groups: the structure of the bedrock and sediments, the groundwater level and the groundwater discharge. The structure of the bedrock and sediments was surveyed by geophysical methods, using previous data as well as additional data gathered by field measurements. Data on groundwater level measurements were obtained from the Finnish Environment Institute and the Hyvinkää water utility, supplemented by field measurements. To estimate the amount of groundwater discharge, flow measurements were made in the Vantaa River and the results compared with previous research. The geological and geophysical data were compiled and georeferenced first in ArcGIS and then transferred into Leapfrog, which was used to build the 3D hydrogeological structural model.
On the basis of the geological units of the drill data, five hydrogeological units were formed: coarse glaciofluvial material, fine glaciofluvial material, fine-grained material, till and other. The hydraulic conductivity of the drill core sediment samples was calculated and then used to estimate the hydraulic conductivity within and between the different sediment layers. Since one purpose of the 3D structural model was to serve as a base for the flow model, it was simplified into a two-layer model. The study area was divided into smaller subareas, which were visualized with cross sections sliced from the 3D model. The measured groundwater levels were interpolated to demonstrate the groundwater flow direction in the study area. The groundwater monitoring levels were examined over a five-year period and compared with weather and groundwater pumping data. The flow measurements from the Vantaa River were compared with previous research to estimate the amount of groundwater discharge into the river. The contribution of this study is five new significant observations about the structure of the aquifer: 1) the north and south parts of the aquifer are connected to each other; 2) the three-dimensional shape of the esker differs from its geomorphological shape; 3) most of the groundwater flowing to the pumping station comes from the south side of the river; 4) the amount of groundwater flowing to the pumping station is very high compared to the surface area of the aquifer; 5) thick glaciofluvial layers underneath the Hirvisuo bog allow groundwater flow from the First Salpausselkä to the esker.
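The abstract does not state which formula was used to calculate hydraulic conductivity from the drill core samples, so the classic Hazen grain-size approximation is shown below purely as an illustration of the kind of estimate involved:

```python
# The Hazen approximation is an assumption here; the thesis abstract
# does not specify the empirical formula used for the drill cores.
def hazen_conductivity(d10_mm, c=0.01):
    """Hydraulic conductivity K in m/s from the 10th-percentile grain
    diameter d10 in mm: K ~ c * d10**2, with c ~ 0.01 m/s per mm^2
    for clean sands (the formula's nominal validity range)."""
    return c * d10_mm ** 2

coarse = hazen_conductivity(0.6)   # e.g. coarse glaciofluvial sand
fine = hazen_conductivity(0.06)    # e.g. silty fine material
```

The quadratic dependence on grain size is why the coarse and fine glaciofluvial units behave so differently as flow-model layers.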
  • Lintunen, Jenni (2020)
    Most happiness and well-being surveys show that Finland is one of the happiest countries in the world, yet many Finns cannot relate to these results. Neither the mental health statistics nor the suicide rates in Finland speak for the well-being of the Finnish people. The reason is that well-being is traditionally measured by objective indicators that cover only economic and social factors. These measures do not capture subjective well-being, even though it is an essential part of demographic development and population health. The study investigates how well-being measures correspond with subjective well-being in Finland in 2002-2016. It also examines how well the measures capture the subjective well-being of different socio-economic groups. The study is a quantitative comparison between descriptive statistics of objective well-being measures and life satisfaction. The examined well-being measures are the HDI, HPI, SSI, Gini coefficient, ISEW, GPI and GDP. The life satisfaction data come from the European Social Survey. The results show that the examined well-being measures do not correspond with the level of subjective well-being in Finland. The HDI and SSI are higher than life satisfaction, while the HPI and Gini coefficient are lower. Only the Gini coefficient corresponds with the level of life satisfaction reported by the unemployed. Nevertheless, the results show some correspondence between the fluctuation of some of the well-being measures and subjective well-being. The Gini coefficient and SSI are as stable as average life satisfaction in Finland and the life satisfaction reported by students and pensioners. The HDI shows a slight growth similar to the life satisfaction reported by labourers. The HPI and GPI decline more than the life satisfaction series. The GPI fluctuates similarly to the life satisfaction reported by the unemployed.
The ISEW and GDP show considerably more growth and fluctuation than life satisfaction. Subjective and objective well-being are often seen as two separate dimensions, but overall well-being is formed from both. Well-being should therefore be measured with both subjective and objective measures.
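The two kinds of comparison made above, levels and fluctuation, can be sketched on synthetic yearly series; the series below merely stand in for an index and ESS life satisfaction, and are not the thesis data:

```python
import numpy as np

years = np.arange(2002, 2017)
rng = np.random.default_rng(4)

# Hypothetical yearly series standing in for an objective index
# and average life satisfaction (both on their native scales).
index = np.linspace(0.88, 0.93, years.size) + rng.normal(0, 0.002, years.size)
satisfaction = np.linspace(7.9, 8.1, years.size) + rng.normal(0, 0.05, years.size)

# Level comparison needs a common scale: min-max rescale both series.
def rescale(a):
    return (a - a.min()) / (a.max() - a.min())

level_gap = np.mean(rescale(index) - rescale(satisfaction))

# Fluctuation comparison: correlate the year-over-year changes.
r = np.corrcoef(np.diff(index), np.diff(satisfaction))[0, 1]
```

Separating level from fluctuation matters because, as the results show, a measure can track the movement of life satisfaction while sitting at a quite different level.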
  • Rasi, Eeva (2014)
    A personal health record brings together a citizen's health-related information and tools. Personal health monitoring focuses on tracking variables related to an individual's health and well-being. It is a rapidly growing application area that covers a wide range of wellness, health and communication devices and the applications built on top of them. These applications and devices can collect versatile data that can be used in monitoring personal health and well-being. Health care standards are largely based on the needs of patient information systems and are therefore not directly suitable for personal health records and for representing wellness data. For measurement device data, the Continua Health Alliance standards are widely used and have been found to provide good technical means for transferring device data into a personal health record. Two different approaches can be identified for integrating wellness data into a personal health record. XML-based solutions are flexible and easy to modify. The ontology-based approach builds on Semantic Web technologies; an ontology-based solution can achieve semantic interoperability between information systems, so that systems understand the meaning of the data they send and receive. Next-generation standards and rapidly evolving measurement technology open up new possibilities for integrating wellness data into personal health records. Advances in measurement technology are bringing citizens technology that was previously available only for research and professional use.
  • Hauta-aho, Eetu (2017)
    Themes of change, complexity and globalization have established themselves firmly in modern discourse. Together with other discourses they produce a picture of the state of the world and of society. Changes in discourse do not occur randomly; they are part of deliberate governing. Governmentality in modern societies consists of governmental rationalities, technologies and subjectification. The thesis focuses on the role of national curricula in the production of subjects. School is a crucial part of governing the population, and examining how a curriculum depicts an ideal citizen reveals current trends in society. Alongside the ideal subject of the curriculum, the thesis examines the factors that shape what the curriculum comes to be like. The thesis studies the Finnish elementary school curriculum of 2016, which sets the basis for elementary school teaching in Finland. By means of content analysis, the thesis defines the most influential themes in the curriculum and, based on these themes, forms a picture of the ideal subject found in it. Drawing on the literature, the thesis also studies the themes operating in the background of the curriculum. The ideal subject found in the curriculum is an economic subject who manages itself according to the economic rationality that neoliberal rationality defines. The demand for a certain ideal subject forms discursively through problematization and through determining solutions to problems: the current situation and its problems are defined, and on the basis of that definition it is justified what should be done. Through problematization the structure of society is modified, and the model of the ideal individual changes in this context. The economic individual establishes itself as part of an ensemble in which economic globalization defines the direction of societal development. The subject has to internalize a global orientation, enhance its human capital and manage itself with an entrepreneurial spirit.
In a world defined by globalization, the ideal subject is part of topological power. The purposes of individuals, nations and the global economy become one and the same when the purpose of nations is defined as competing with each other. The ideal citizen is defined by the nation's economic ambitions.
  • Karusto, Nina (2017)
    The effects of increasing sea surface temperatures (SSTs) and surface roughness on the meridional moisture flux (MMF) and precipitation of extratropical cyclones are studied with idealized baroclinic wave simulations. The main objective is to quantify the MMF, precipitation and several other properties of extratropical cyclones as the SST and roughness length are increased. The simulations are run under idealized conditions with the Weather Research and Forecasting (WRF) model over an all-water surface. The sensitivity studies are conducted by changing the SST homogeneously throughout the domain and by changing the surface roughness length, which depends on the wind speed. Increasing SSTs caused the MMF and precipitation to increase; the latent heat flux (LHF), for example, also increased and the minimum surface pressure decreased. Increasing the roughness length, on the other hand, caused the MMF and precipitation to decrease; the LHF also decreased and the minimum surface pressure increased. SSTs and roughness had the largest effect on the LHF and convective precipitation and the smallest on the minimum surface pressure.
  • Niinivaara, Olli (University of Helsinki, 2002)
    The thesis presents the description of an idea's context as a means of making the transfer of ideas more effective. Context information is described as metadata attached to documents and managed in metadata repositories independent of the documents themselves. The goal is a description of an idea's context that is expressive enough, yet whose creation does not impose an unreasonable workload on the users of the system. The transfer of knowledge is seen as a process, on the basis of which an idea's context is divided into a production context, a publication context and a usage context. Based on this division, the formation and content of the metadata are treated in detail, down to the level of individual metadata record attributes. Among the uses of context information, the thesis examines the visualization of context using information visualization techniques, the measurement of an idea's value by developing bibliometric methods, and the automatic selection of ideas by means of information filtering methods and digital assistants.
  • Salmirinne, Simo (2020)
    Time series are essential in various domains and applications. Especially in retail business, forecasting demand is a crucial task in order to make the appropriate business decisions. In this thesis we focus on a problem that can be characterized as a sub-problem in the field of demand forecasting: we attempt to form clusters of products that reflect the products' annual seasonality patterns. We believe that these clusters would aid us in building more accurate forecast models. The seasonality patterns are identified from weekly sales time series, which in many cases are very sparse and noisy. In order to successfully separate the seasonality patterns from all the other factors contributing to a product's sales, we build a pipeline to preprocess the data accordingly. This pipeline consists of first aggregating the sales of individual products over several stores to strengthen the sales signal, followed by solving a regularized weighted least squares objective to smooth the aggregates. Finally, the seasonality patterns are extracted using the STL decomposition procedure. These seasonality patterns are then used as input for the k-means algorithm and several hierarchical agglomerative clustering algorithms. We evaluate the clusters using two distinct approaches. In the first approach we manually label a subset of the data. These labeled subsets are then compared against the clusters provided by the clustering algorithms. In the second approach we form a simple forecast model that fits the clusters' seasonality patterns back to the observed sales time series of individual products. In this approach we also build a secondary validation forecast model with the same objective, but instead of using the clusters provided by the algorithms, we use predetermined product categories as the clusters.
These product categories should naturally provide a valid baseline for groups of products with similar seasonality as they reflect the structure of how similar products are organized within close proximity in physical stores. Our results indicate that we were able to find clear seasonal structure in the clusters. Especially the k-means algorithm and hierarchical agglomerative clustering algorithms with complete linkage and Ward’s method were able to form reasonable clusters, whereas hierarchical agglomerative clustering algorithm with single linkage was proven to be unsuitable given our data.
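The extract-profile-then-cluster pipeline can be sketched on synthetic weekly series; the moving-average detrend below is a crude stand-in for the regularized smoothing and STL steps, and the two product groups are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
weeks = np.arange(104)  # two years of weekly sales

def seasonal_profile(series, period=52):
    """Average the detrended series over full cycles to get one
    seasonal profile (a crude stand-in for the STL decomposition
    used in the thesis)."""
    trend = np.convolve(series, np.ones(53) / 53, mode="same")
    detrended = series - trend
    return detrended.reshape(-1, period).mean(axis=0)

# Two hypothetical product groups: a summer-peaking and a
# winter-peaking annual pattern, plus noise.
summer = [np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 0.3, 104) for _ in range(5)]
winter = [-np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 0.3, 104) for _ in range(5)]
profiles = np.array([seasonal_profile(s) for s in summer + winter])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(profiles)
```

Clustering the extracted profiles rather than the raw series is what lets sparse, noisy products with the same seasonality end up together.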
  • Pöntinen, Mikko (2018)
    One of the main factors currently limiting geophysical and geological studies of asteroids is the lack of visual and near-infrared (Vis-NIR) spectra. European Space Agency’s upcoming Euclid mission will observe up to 150,000 asteroids and gather a large amount of spectral data of them in the Vis-NIR wavelength range. Asteroids will appear as faint streaks in the images. In order to exploit the spectra, the asteroids have to first be found in the massive amounts of data to be obtained by Euclid. In this work we tested two methods for detecting asteroid streaks in simulated Euclid images. The first method is StreakDet, a software originally developed to detect streaks caused by space debris. We optimized the parameters of StreakDet, and developed a comprehensive analysis software that can visualize and give statistics of the StreakDet results. StreakDet was tested by feeding 4096×4136 pixel images to the software, which then returned the coordinates of the asteroids found. The second method is machine learning. We programmed a deep neural network, which was then trained to distinguish between asteroid images and non-asteroid images. Smaller images were used for this binary classification task, but we also developed a sliding window method for analyzing larger images with the neural network. After optimizing the program parameters, StreakDet was able to detect approximately 60% of asteroids with apparent magnitude V < 22.5. StreakDet worked better for long streaks, up to 125 pixels (corresponding to an asteroid with a sky motion of 80 "/h) while streaks shorter than 15 pixels (10 "/h) were typically not found. The neural network was able to classify the brightest (20 < V < 21) streaks with up to 98% accuracy when using very small images. When analyzing larger images, the sliding window algorithm produced heat maps as output, from which the asteroids could easily be spotted. 
The machine learning algorithm utilized was fairly simple, so even better results may be obtained with more advanced algorithms.
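The sliding-window heat-map idea can be sketched as follows; the brightness-based scoring function stands in for the trained neural network, and the window size, stride and streak are illustrative assumptions:

```python
import numpy as np

def sliding_window_heatmap(image, win=32, stride=16, score=None):
    """Slide a window across the image and record a detection score per
    position, yielding a coarse heat map of likely streak locations."""
    if score is None:
        # Stand-in for the trained classifier: mean patch brightness.
        score = lambda patch: patch.mean()
    h = (image.shape[0] - win) // stride + 1
    w = (image.shape[1] - win) // stride + 1
    heat = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = image[i * stride:i * stride + win, j * stride:j * stride + win]
            heat[i, j] = score(patch)
    return heat

# Synthetic frame: Gaussian background noise plus one faint streak.
img = np.random.default_rng(0).normal(0.0, 1.0, (256, 256))
for k in range(100):
    img[80 + k // 2, 60 + k] += 5.0
heat = sliding_window_heatmap(img)
```

With a real classifier as `score`, the maxima of the heat map are exactly the spots from which asteroids "could easily be spotted", as described above.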
  • Toukola, Peppi (2021)
    In this thesis the suitability of Nuclear Magnetic Resonance (NMR) spectroscopy for the identification of rubbers in museum collections is discussed through a literature review and experimental work in which samples from the rubber collection of Tampere Museums were analysed with different NMR techniques. The literature part of this thesis focuses on recent (2011-2020) scientific publications on analytical instrumental techniques used in the identification of cultural heritage plastics. Vibrational spectroscopy methods utilizing hand-held or portable devices have been the most prominent methods used in the characterization of historical plastic materials. Bench-top devices and analytical techniques requiring sampling were used to acquire more detailed analysis results. However, NMR spectroscopy was not used as the main analysis technique in the reviewed publications. In the experimental part, altogether 21 rubber object samples and 8 reference samples were identified using 1D and 2D NMR techniques in solution state. Three samples were additionally analysed with solid-state High Resolution Magic Angle Spinning (HRMAS) NMR spectroscopy. The chemical structures of the samples were confirmed with these methods. To further explore fast and more automated identification of the rubber samples, a statistical classification model utilizing the acquired solution-state 1H NMR data was developed. Three rubber types were chosen for the analysis. The model was created using analysis data from the museum object samples and validated using the reference sample data. An identification rate of 100% was achieved.
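The classification step can be illustrated with a nearest-centroid sketch on synthetic spectra. The peak positions, rubber-type labels and the classifier itself are illustrative assumptions; the abstract does not specify the model used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 1H NMR spectra: three rubber types, each with a
# characteristic peak position on a 200-point chemical-shift axis.
axis = np.arange(200)
peaks = {"NR": 60, "SBR": 110, "IIR": 160}

def spectrum(peak, n=1):
    """n noisy synthetic spectra with a Gaussian peak at `peak`."""
    return np.exp(-((axis - peak) / 6.0) ** 2) + rng.normal(0, 0.05, (n, 200))

# Per-type centroid spectra, averaged over 10 training spectra each.
train = {name: spectrum(p, 10).mean(axis=0) for name, p in peaks.items()}

def classify(s):
    """Assign a spectrum to the nearest centroid (Euclidean distance)."""
    return min(train, key=lambda k: np.linalg.norm(s - train[k]))
```

Validating such a model on independent reference samples, as done in the thesis, guards against the centroids merely memorizing the object samples.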
  • Varvarà, Giulia (2022)
    Species factories are defined as times and places in the fossil record where and when an exceptionally large number of new species occurs. While several tailored solutions for the mammalian record have been proposed, how to identify species factories computationally in a standardized way is still an open question. To quantify what is exceptional, we first need to quantify what is regular. One of the main challenges in this identification process is to account for sampling unevenness, which depends on several methodological decisions, including the scale of the analysis (the aggregation radius). In this thesis we used Capture-Mark-Recapture (CMR) methods with spatial aggregation guided by network modelling to estimate the sampling probabilities for the species in the NOW database of mammalian fossil occurrences. Since the mammalian record is sparse and most localities include only a few species, we coupled CMR with tailored spatial aggregation approaches to estimate the sampling probabilities. We then used these sampling probabilities to quantify background speciation rates and assess which rates are abnormal. We represented the aggregated fossil data as a bipartite network and used community detection to evaluate how the choice of an aggregation radius impacts the modular structure. After aggregating the data according to the radius chosen using network analysis, we estimated the sampling probabilities using CMR. These probabilities allow adjustment for sampling unevenness, so that differences in findings can be compared across locations and are not due to differences in sampling. We identified as species factories the locations whose origination rate, after adjustment, lies in the highest 5% per time unit.
Once the species factories had been identified, we looked for paleoecological patterns in these places that may be lacking elsewhere. We found that species factories have fewer findings and fewer different species among the findings, but a higher ratio of different species to total findings than the rest of the locations. This would indicate that, even if species factories might accommodate fewer species, they present a higher diversity. To make sure these results were not due only to chance, we performed the same analysis on 100 randomized experiments obtained using a modified version of the Curveball algorithm and compared the values obtained from the original dataset with those obtained from the randomized ones. This comparison showed that species factories tend to have more extreme values than the ones obtained through randomization, which would indicate that species factories present specific paleoecological patterns that are not present in other locations.
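The randomization step can be sketched with the plain Curveball algorithm of Strona et al.; the thesis uses a modified version, so this is only the baseline procedure, operating on a presence/absence matrix stored as per-row sets of occupied columns:

```python
import random

def curveball(matrix_rows, iterations=500, seed=0):
    """Degree-preserving randomization of a presence/absence matrix.
    Each iteration picks two rows and randomly trades the elements
    they do not share, preserving all row and column sums."""
    rng = random.Random(seed)
    rows = [set(r) for r in matrix_rows]
    for _ in range(iterations):
        a, b = rng.sample(range(len(rows)), 2)
        shared = rows[a] & rows[b]
        tradable = list((rows[a] | rows[b]) - shared)
        rng.shuffle(tradable)
        k = len(rows[a]) - len(shared)  # row a keeps its original size
        rows[a] = shared | set(tradable[:k])
        rows[b] = shared | set(tradable[k:])
    return rows

# Toy locality-by-species occurrence matrix.
before = [{0, 1, 2}, {1, 3}, {0, 2, 3}, {2}]
after = curveball(before)
```

Because both row and column sums are preserved, any difference between the observed statistic and the randomized ensemble reflects structure beyond sampling intensity.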
  • Kičiatovas, Dovydas (2021)
    Cancer cells accumulate somatic mutations in their DNA throughout their lifetime. Advances in cancer prevention and treatment methods call for a deeper understanding of carcinogenesis at the level of the genetic sequence. Mutational signatures present a novel and promising way to capture somatic mutation patterns and define their causes, making it possible to summarize the mutational landscape of cancer as a combination of distinct mutagenic processes acting with different levels of strength. While the majority of previous studies assume an additive relationship between the mutational processes, this Master's thesis provides tentative evidence that contemporary methods with additivity constraints, e.g. non-negative matrix factorization (NMF), are not sufficient to comprehensively explain the observed mutations in cancer genomes, and that the observed deviations are not random. To quantify these residues, two metrics are defined, an additive and a multiplicative residue, and hierarchical clustering algorithms are used to identify cancer subsets with similar residual profiles. It is shown that in certain cancer sample subsets there is a systematic mutational burden overestimation that can only be corrected by a multiplicatively acting process, as well as non-random underestimation requiring additional mutational signatures. An extension to the additive mutational signature model is therefore proposed: a probabilistic model that incorporates a selectively active modulatory mutational process able to act in a multiplicative manner together with the known mutational signatures, reducing systematic variability.
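One plausible reading of the residue computation can be sketched with scikit-learn's NMF on synthetic data. The signature catalogue, exposures and the exact residue definitions below are illustrative assumptions, not the thesis's:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Hypothetical catalogue: 4 signatures over 96 mutation channels,
# mixed additively into 50 simulated genomes.
signatures = rng.dirichlet(np.ones(96), size=4)   # rows sum to 1
exposures = rng.gamma(2.0, 50.0, size=(50, 4))    # per-genome activity
counts = exposures @ signatures

# Additive NMF reconstruction of the observed counts.
model = NMF(n_components=4, init="nndsvda", max_iter=1000, random_state=0)
W = model.fit_transform(counts)
reconstruction = W @ model.components_

# Residues: what the additive model fails to explain, measured
# additively and as a per-entry ratio.
additive = counts - reconstruction
multiplicative = counts / np.clip(reconstruction, 1e-12, None)
```

On purely additive synthetic data both residues are near zero and one respectively; the thesis's point is that on real cancer genomes they are systematically not.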
  • Rautiainen, Mikko (2016)
    The genomes of all animals, plants and fungi are organized into chromosomes, which contain a sequence of the four nucleotides A, T, C and G. Chromosomes are further arranged into homologous groups, where two or more chromosomes are almost exact copies of each other. Species whose homologous groups contain pairs of chromosomes, such as humans, are called diploid. Species with more than two chromosomes in a homologous group are called polyploid. DNA sequencing technologies do not read an entire chromosome from end to end. Instead, the results of DNA sequencing are small sequences called reads or fragments. Due to the difficulty of assembling the full genome from reads, a reference genome is not always available for a species. For this reason, reference-free algorithms which do not use a reference genome are useful for poorly understood genomes. A common variation between the chromosomes in a homologous group is the single nucleotide polymorphism (SNP), where the sequences differ by exactly one nucleotide at a location. Genomes are sometimes represented as a consensus sequence and a list of SNPs, without information about which variants of a SNP belong to which chromosome. This discards useful information about the genome. Identification of variant compositions aims to correct this: a variant composition is an assignment of the variants in a SNP to the chromosomes. Identification of variant compositions is closely related to haplotype assembly, which aims to solve the sequences of an organism's chromosomes, and to variant detection, which aims to solve the sequences of a population of bacterial strains and their frequencies in the population. This thesis extends an existing exact algorithm for haplotype assembly of diploid species (Patterson et al., 2014) to the reference-free, polyploid case. Since haplotype assembly is NP-hard, the algorithm's time complexity is exponential in the maximum coverage of the input.
Coverage means the number of reads which cover a position in the genome. Lowering the coverage of the input is therefore necessary. Since the algorithm does not use a reference genome, the reads must be ordered in some other way. Ordering reads is an NP-hard problem, and the technique of matrix banding (Junttila, PhD thesis, 2011) is used to approximately order the reads to lower the coverage. Some heuristics are also presented for merging reads. Experiments with simulated data show that the algorithm's accuracy is promising. The source code of the implementation and scripts for running the experiments are available online at https://github.com/maickrau/haplotyper.
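Coverage, and why it must be capped before running an algorithm exponential in it, can be illustrated with a toy interval sketch. The greedy cap below is a simplification for illustration only; the thesis instead orders reads via matrix banding and merging heuristics:

```python
def downsample_to_coverage(reads, max_cov):
    """Keep a read only if it does not push the coverage of any
    position above max_cov. Reads are half-open [start, end)
    intervals over genome positions."""
    from collections import Counter
    cov = Counter()
    kept = []
    for start, end in sorted(reads):
        if all(cov[p] < max_cov for p in range(start, end)):
            for p in range(start, end):
                cov[p] += 1
            kept.append((start, end))
    return kept

# Four reads, three of them piled on the same positions.
kept = downsample_to_coverage([(0, 5), (0, 5), (0, 5), (2, 7)], max_cov=2)
```

Since the exact dynamic program's running time grows exponentially with the maximum coverage, even a small cap like this makes otherwise infeasible inputs tractable, at the cost of discarding some reads.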
  • Leppiniemi, Samuel Albert (2023)
    High-grade serous carcinoma (HGSC) is a highly lethal cancer type characterised by high genomic instability and frequent copy number alterations. This study examines the relationships between germline genetic variants and gene expression levels in tumours to obtain a better understanding of gene regulation in HGSC. This would improve knowledge of the cancer's mechanisms and help to find, for example, potential new treatment targets and biomarkers. The aim is to find significantly associated variant-gene pairs in HGSC. Expression quantitative trait loci (eQTL) analysis is a well-suited method for exploring these associations. It is also suitable for analysing variants located in non-coding genomic regions, which previous genome-wide association studies have indicated to contain many disease-linked germline variants. Current eQTL analysis methods are, however, not applicable to association testing between genes and variants in the context of HGSC because of the special genomic features of the cancer. Therefore, a new eQTL analysis approach, SegmentQTL, was developed for this study to accommodate the copy-number-driven nature of the disease. Careful input processing is of particular importance in eQTL analysis, as it has a notable effect on the number of significantly associated variant-gene pairs; it also helps to maintain adequate statistical power, which affects the reliability of the findings. In all, this study uses eQTL analysis to uncover variant-gene associations, helping to improve knowledge of gene regulation mechanisms in HGSC in order to find new treatments. To apply the analysis to the HGSC data, a novel eQTL analysis method was developed, and appropriate input processing was carried out prior to running the analysis to ensure reliable results.
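The core operation of any eQTL scan, a per-pair linear association test, can be sketched on synthetic data as follows. SegmentQTL's copy-number handling is not reproduced here; the sample size, effect size and genotype coding are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 200  # hypothetical number of tumour samples

# Genotypes coded as 0/1/2 copies of the alternative allele; the
# first variant truly regulates the gene, the other four are null.
genotypes = rng.integers(0, 3, size=(n, 5))
expression = 0.8 * genotypes[:, 0] + rng.normal(0, 1, n)

def eqtl_scan(expr, geno):
    """Simple eQTL scan: linear regression of expression on each
    variant's genotype, returning (slope, p-value) per variant."""
    results = []
    for g in geno.T:
        fit = stats.linregress(g, expr)
        results.append((fit.slope, fit.pvalue))
    return results

results = eqtl_scan(expression, genotypes)
```

In a genome-wide setting the same test is repeated over millions of pairs, which is why the input processing and multiple-testing power considerations mentioned above dominate the reliability of the findings.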
  • Korhonen, Teo Ilmari (2022)
    Flares are short, high-energy magnetic events on stars, including the Sun. Observations of young stars and red dwarfs regularly show the occurrence of flare events multiple orders of magnitude more energetic than even the fiercest solar storms ever recorded. As our technology remains vulnerable to disruptions due to space weather, the study of flares and other stellar magnetic activity is crucial. Until recently, the detection of extrasolar flares has required much manual work and observation resources. This work presents a mostly automatic pipeline to detect and estimate the energies of extrasolar flare events from optical light curves. To model and remove the star's background radiation in spite of complex periodicity, short windows of nonlinear support vector regression are used to form a multi-model consensus. Outliers above the background are flagged as likely flare events, and a template model is fitted to the flux residual to estimate the energy. This approach is tested on light curves collected from the stars AB Doradus and EK Draconis by the Transiting Exoplanet Survey Satellite, and dozens of flare events are found. The results are consistent with recent literature, and the method is generalizable for further observations with different telescopes and different stars. Challenges remain regarding edge cases, uncertainties, and reliance on user input.
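The detection principle described above, fitting a smooth background to the light curve and flagging residual outliers, can be sketched as follows. This is a toy light curve with an injected flare and untuned hyperparameters, not the thesis's actual multi-model pipeline:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 500)

# Quiescent stellar variability (periodic) plus photometric noise.
flux = np.sin(2 * np.pi * t / 3.0) + rng.normal(0, 0.05, t.size)
# Injected flare: sharp rise and fast decay over a few samples.
flux[200:205] += np.array([2.0, 1.4, 0.9, 0.5, 0.2])

# Fit the slowly varying background with nonlinear support vector regression;
# the epsilon-insensitive loss makes the fit fairly robust to the flare points.
model = SVR(kernel="rbf", C=10.0, gamma=1.0, epsilon=0.05)
background = model.fit(t[:, None], flux).predict(t[:, None])

# Flag points well above the background as candidate flare events.
residual = flux - background
threshold = 3 * np.std(residual)
flare_idx = np.where(residual > threshold)[0]
```

A template flare model (fast rise, exponential decay) would then be fitted to the flagged residual segment to estimate the event's energy.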
  • Franssila, Fanni (2023)
    Magnetic reconnection is a phenomenon occurring in plasma and related magnetic fields when magnetic field lines break and rejoin, leading to the release of energy. Magnetic reconnections take place, for example, in the Earth’s magnetosphere, where they can affect the space weather and even damage systems and technology on and around the Earth. Another site of interest is in fusion reactors, where the energy released from reconnection events can cause instability in the fusion process. So far, 2D magnetic reconnection has been widely studied and is relatively well-understood, whereas the 3D case remains more challenging to characterize. However, in real-world situations, reconnection occurs in three dimensions, which makes it essential to be able to detect and analyse 3D magnetic reconnection, as well. In this thesis, we examine what potential signs of 3D magnetic reconnection can be identified from the topological elements of a magnetic vector field. To compute the topological elements, we use the Visualization Toolkit (VTK) Python package. The topology characterizes the behaviour of the vector field, and it may reveal potential reconnection sites, where the topological elements can change as a result of magnetic field lines reconnecting. The magnetic field data used in this thesis is from a simulation of the nightside magnetosphere produced using Vlasiator. The contributions of this thesis include analysis of the topological features of 3D magnetic reconnection and topological representations of nightside reconnection conditions to use in potential future machine learning approaches. In addition, a modified version of the VTK function for computing the critical points of the topology is created with the purpose of gearing it more towards magnetic vector fields instead of vector fields in general.
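A minimal illustration of locating a critical point (a magnetic null, where all three field components vanish) in a vector field, here with plain NumPy on an analytic divergence-free field rather than the VTK topology filters used in the thesis:

```python
import numpy as np

# Toy divergence-free field B = (x, y, -2z): one magnetic null at the origin.
axis = np.linspace(-1, 1, 21)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
Bx, By, Bz = X, Y, -2 * Z

# Crude null detection: the grid point where |B| is minimal (zero at the null).
mag = np.sqrt(Bx**2 + By**2 + Bz**2)
null = np.unravel_index(np.argmin(mag), mag.shape)
```

In a full topological analysis, the null would then be classified by the eigenvalues of the Jacobian of B at that point (here diag(1, 1, -2), a radial null with one negative eigenvalue), which is the kind of information the critical-point computation in VTK provides.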
  • Santra, Sougata (2014)
    A wireless ad-hoc network does not rely on pre-existing infrastructure, unlike wired and managed wireless networks, and is hence decentralized in nature. This decentralization makes it suitable for services that are themselves mobile, and a service can be designed more scalably on a wireless ad-hoc network where a centralized design cannot be relied on. This makes it important to measure the performance of a wireless ad-hoc network to determine its theoretical and practical limits. The performance of a wireless network depends on many factors, some of them blindingly obvious and others relatively obscure, which makes measuring such limits challenging. In this study we review previous work on determining the performance of wireless ad-hoc networks. Another purpose of this study is to calculate the theoretical limit of the performance of such networks. Finally, we carry out experiments in real-life conditions and use the results to quantify the difference between the theoretical and practical limits.
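A classical theoretical limit of this kind is the Gupta-Kumar per-node throughput bound for random ad-hoc networks. The abstract does not state which bound the thesis actually derives, so the following is only an illustrative sketch of how such a scaling limit is evaluated:

```python
import math

def gupta_kumar_throughput(n, W=11e6):
    """Per-node throughput upper bound Theta(W / sqrt(n * log n)) for a
    random ad-hoc network of n nodes sharing a channel of W bit/s
    (Gupta & Kumar, 2000)."""
    return W / math.sqrt(n * math.log(n))
```

The bound captures the key qualitative fact that per-node capacity shrinks as the network grows, because each packet consumes channel resources at every relay hop.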
  • Setälä, Jakke (2020)
    The aim of this work is to study the properties of vowels produced by human singing, using equipment available in schools. The purpose is to detect the differences between vowels by interpreting their spectra, and to enable this knowledge to be incorporated into school physics teaching. For the work, sound samples of five different vowels are collected from volunteer subjects of both sexes. Some of them are singing teachers with extensive experience and expertise in voice production; some are singing students; and the rest are laypersons with little or no experience of singing and so-called proper voice technique. The results showed that the differences between sung vowels are most easily seen in sound produced with the M1 and M2 (twang) mechanisms. Whispered vowels were also well distinguished from one another. It was found that the work can be carried out in schools regardless of their equipment, thanks to the possibilities offered by mobile devices.
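The spectral comparison described above rests on the discrete Fourier transform of the recorded audio. A minimal sketch with a synthetic vowel-like tone (illustrative frequencies, not measured data):

```python
import numpy as np

fs = 8000                      # sample rate in Hz
t = np.arange(0, 1.0, 1 / fs)  # one second of audio

# Toy "vowel": fundamental at 220 Hz plus two weaker formant-like partials.
signal = (np.sin(2 * np.pi * 220 * t)
          + 0.6 * np.sin(2 * np.pi * 660 * t)
          + 0.3 * np.sin(2 * np.pi * 2200 * t))

# Magnitude spectrum; the peaks at the partials are what distinguish vowels.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, 1 / fs)
peak = freqs[np.argmax(spectrum)]
```

Real vowels differ mainly in the relative strengths of such partials (the formants), which is exactly what a school-level spectrum app on a mobile device displays.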
  • Mäkelä, Susanna (2023)
    Today, issues related to nature and the environment are increasingly topical and therefore also play a greater role in land use planning and urban planning. In this thesis, I study the different concepts of nature behind urban planning, i.e., the ways in which nature and its relationship to humans are defined. These many different conceptions of nature, such as dualism, materialism, the idealistic conception of nature, external and frightening nature, economic and resource-focused thinking, ecomodernism, the biodiversity perspective, posthumanism, continuum thinking and ecosystem service thinking, also have a direct and indirect impact on urban planning and on what kind of nature and green areas are planned. In this thesis, I examine four partial general plans (in Finnish, osayleiskaava) for city centre areas of Southeast Finland: the partial general plans for the centre of Kotka, the centre of Karhula, the city centre of Kouvola and the centre of Lappeenranta, all of which are fairly recent plans. The main objective of my thesis is to find different conceptions of nature in the materials of these general plans, such as plan descriptions and various impact assessments. The method used in this analysis is qualitative content analysis. With the help of theory-based qualitative content analysis, I identify various conceptions of nature related to previous theory and research from the above-mentioned planning documents. In the analysis and results section of my thesis, I attach the results of the content analysis to the theoretical framework concerning conceptions of nature. Among the different conceptions of nature found, especially the perspective of cultural ecosystem services, technology-oriented ecomodernism and biodiversity perspectives emerged from many text documents. The economic view of nature as a resource was also emphasised in many texts.
In addition, dualistic conceptions that emphasize the dichotomy of man and nature and materialistic conceptions focusing on material reality were reflected as a broader way of thinking. On the other hand, posthumanist concepts related to equality between humans and nature are not as strongly visible in the results.