This work examines ocean surface conditions at high northern latitudes after the Last Glacial Maximum using marine fossil diatom assemblages. Long-term paleoclimatic and paleoceanographic records are obtained from northern Svalbard and central-eastern Baffin Bay using quantitative and qualitative diatom analyses together with sediment grain-size distribution analysis. An additional focus of this work is the ecology of common northern North Atlantic diatom species and their relationship to environmental variables (aSSTs and sea ice), in order to identify the best indicator species for these variables and to improve their reliability as paleoceanographic indicators.

The Baffin Bay study site was investigated for the deglacial period (10–14 kyr BP). The results suggest a warmer ocean surface in central-eastern Baffin Bay during the cold Younger Dryas period (11.7–12.9 kyr BP), indicating that the ocean was out of phase with atmospheric conditions over Greenland. The warmer conditions were caused by enhanced inflow of Atlantic-sourced waters and increased solar insolation in the Northern Hemisphere, which amplified seasonality over Baffin Bay and played a significant role in the retreat of the West Greenland ice margin. The paleoceanographic record from northern Svalbard covers the late Holocene (the last ca. 4,200 years), and the results show a clear climate shift at 2.5 kyr BP, when the study location changed from stable, glacier-proximal conditions to fluctuating, glacier-distal conditions, emphasizing the sensitivity of the Arctic environment to climate oscillations. Understanding the relationship of diatom species to environmental variables is essential, and this work identifies robust indicators for cold, temperate and warm waters and for sea ice. The results also show that not all sea ice-associated species have a statistically significant relationship to sea ice: although such species are often found in sea ice and in the marginal ice zone, their ecology appears to be more complex.

The paleoceanographic and paleoclimatic records in this work give new insights into past climate variability and refine some of our current understanding of past climate conditions on a local scale. This work also improves the applicability of key northern North Atlantic diatom taxa as paleo-indicators, questioning previous knowledge of the ecology of some species and highlighting some important taxonomic issues.

In Shelah's stability theory, a classifiable theory is a theory with an invariant that determines its models up to isomorphism; a theory with no invariant of this kind is non-classifiable. This tells us that a theory with such an invariant is less complex than a theory without one. Shelah's stability theory tells us that every countable complete first-order classifiable theory is less complex than every countable complete first-order non-classifiable theory. The subject of study in this thesis is the question:

Are all classifiable theories less complex than all non-classifiable theories in the Borel reducibility hierarchy?

There are two settings in which this question can be studied: the generalized Baire space and the generalized Cantor space. It is known that for every theory T, the isomorphism relation of T in the generalized Cantor space and the isomorphism relation of T in the generalized Baire space have the same complexity. This gives us the freedom to choose in which space to work.
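For readers unfamiliar with the hierarchy, the comparison of complexities above refers to the standard notion of Borel reducibility from the generalized descriptive set theory literature (stated here for the generalized Baire space $\kappa^\kappa$; the generalized Cantor space $2^\kappa$ is analogous):

```latex
% E and F are equivalence relations on the generalized Baire space
% \kappa^\kappa. E is Borel reducible to F, written E \le_B F, if
% there is a Borel function f witnessing the reduction:
\[
E \le_B F \iff \exists\, f\colon \kappa^\kappa \to \kappa^\kappa
\ \text{Borel such that}\ \forall \eta, \xi \in \kappa^\kappa\;
\bigl(\eta \mathrel{E} \xi \leftrightarrow f(\eta) \mathrel{F} f(\xi)\bigr).
\]
```

"E is less complex than F" then means $E \le_B F$ (and, for strict comparison, $F \not\le_B E$).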

This question was studied in previous works by Friedman, Hyttinen, and Kulikov, among others. Some of the results in those works pointed out that equivalence modulo the non-stationary ideal might be one of the keys to understanding the reducibility of the isomorphism relations.

The work of Friedman, Hyttinen, and Kulikov leads to two approaches for the main question:

Is it provable in ZFC that in the generalized Cantor space, the isomorphism relation of T is Borel reducible to the equivalence modulo the non-stationary ideal, for T a classifiable theory? Is it provable in ZFC that in the generalized Cantor space, the equivalence modulo the non-stationary ideal is Borel reducible to the isomorphism relation of T, for T a non-classifiable theory?

Is it provable in ZFC that in the generalized Baire space, the isomorphism relation of T is Borel reducible to the equivalence modulo the non-stationary ideal, for T a classifiable theory? Is it provable in ZFC that in the generalized Baire space, the equivalence modulo the non-stationary ideal is Borel reducible to the isomorphism relation of T, for T a non-classifiable theory?

The work of Friedman, Hyttinen, and Kulikov gives a partial answer to this question. At the same time, it points to a question that might be the key to understanding the connection between classification theory and the Borel reducibility hierarchy:

Does equivalence modulo the non-stationary ideal have the same complexity in the generalized Cantor space as in the generalized Baire space? (It is known that the isomorphism relations have the same complexity in the generalized Cantor space as in the generalized Baire space.)

These are the questions studied in this thesis.

A common analysis workflow for RNA-sequencing (RNA-seq) data consists of mapping the sequencing reads to a reference genome, followed by transcript assembly and quantification based on these alignments. The advent of second-generation sequencing revolutionized the field by reducing sequencing costs 50,000-fold. Now another revolution is imminent, with third-generation sequencing platforms producing an order of magnitude longer reads. However, higher error rates, higher cost, and lower throughput compared to second-generation sequencing bring their own challenges. To compensate for the low throughput and high cost, hybrid approaches using both short second-generation and long third-generation reads have gathered recent interest.

The first part of this thesis focuses on the analysis of short-read RNA-seq data. As short-read mapping is already a well-researched field, we focus on giving a literature review of the topic. For transcript assembly we propose an approach, novel at the time of publication, that uses minimum-cost flows to cover a graph created from the read alignments with a set of paths of minimum total cost under some cost model. Various network-flow-based solutions were proposed in parallel to, as well as after, ours.

The second part, where the main contributions of this thesis lie, focuses on the analysis of long-read RNA-seq data. The driving idea of our research has been the Minimum Path Cover with Subpath Constraints (MPC-SC) model, in which transcript assembly is modeled as a minimum path cover problem with the additional requirement that each chain of exons (subpath constraint) created from the long reads must be completely contained in a solution path. In addition to implementing this concept, we experimentally studied different approaches for finding the exon chains in practice. The evaluated approaches included aligning the long reads to a graph created from short-read alignments instead of to the reference genome, which led to our final contribution: extending a co-linear chaining algorithm from between two sequences to between a sequence and a directed acyclic graph.
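The minimum path cover component above can be illustrated with the classical textbook reduction to bipartite matching (a minimal sketch only: subpath constraints, cost models, and the thesis's actual algorithms are omitted, and the splice-graph example below is hypothetical):

```python
# Minimum vertex-disjoint path cover of a DAG via König's reduction:
# split every vertex v into a left copy and a right copy, add a
# bipartite edge (u_left, v_right) for each DAG edge u -> v; then
# minimum path cover size = n - (maximum bipartite matching).

def min_path_cover(n, edges):
    """n vertices labeled 0..n-1, edges = list of (u, v) pairs of a DAG."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    match_to = [-1] * n  # match_to[v] = left vertex matched to right copy of v

    def try_augment(u, seen):
        # Kuhn's augmenting-path search from left vertex u.
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                if match_to[v] == -1 or try_augment(match_to[v], seen):
                    match_to[v] = u
                    return True
        return False

    matching = sum(try_augment(u, [False] * n) for u in range(n))
    return n - matching

# Toy "splice graph": exons 0 -> {1, 2} -> 3; two paths cover all exons.
print(min_path_cover(4, [(0, 1), (1, 3), (0, 2), (2, 3)]))  # -> 2
```

MPC-SC additionally forces each exon chain observed in a long read to lie within a single solution path, which this plain reduction does not capture.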

Methane is an important greenhouse gas, strongly influenced by anthropogenic activities, whose atmospheric concentration has more than doubled since pre-industrial times. Although its source and sink processes have been studied extensively, the mechanisms behind the increase in atmospheric methane concentrations in the 21st century are still not fully understood. In this thesis, the contributions of anthropogenic and natural sources to the increase in atmospheric methane concentrations are studied by estimating global and regional methane fluxes from anthropogenic and biospheric sources for the 21st century using an ensemble Kalman filter (EnKF) based data assimilation system (CarbonTracker Europe-CH4; CTE-CH4). The model was evaluated using assimilated in situ atmospheric concentration observations and various non-assimilated observations, and the model's sensitivity to several setups and inputs was examined to assess the consistency of its estimates.

The key findings of this thesis include: 1) a large enough ensemble size, an appropriate prior error covariance, and good observation coverage are important for obtaining consistent and reliable estimates; 2) CTE-CH4 was able to identify the locations and sources of the emissions that possibly contribute significantly to the increase in atmospheric concentrations after 2007 (tropical and extra-tropical anthropogenic emissions); 3) Europe was found to have an insignificant or negative influence on the increase in atmospheric CH4 concentrations in the 21st century; 4) CTE-CH4 was able to produce flux estimates that are generally consistent with various observations; but 5) the estimated fluxes are still sensitive to the number of parameters, the atmospheric transport, and the spatial distribution of the prior fluxes.
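The role of ensemble size and prior error covariance in finding 1) can be seen in the generic stochastic EnKF analysis step, sketched below (an illustrative toy only, with hypothetical dimensions; CTE-CH4 itself is a far more elaborate system):

```python
import numpy as np

# Stochastic (perturbed-observation) EnKF analysis step:
#   X_a = X_f + K (y + eps - H X_f),  K = P H^T (H P H^T + R)^{-1}
# where P is the sample covariance of the forecast ensemble X_f.
# A small ensemble makes P noisy, which degrades the gain K -- the
# mechanism behind the sensitivity to ensemble size noted above.

def enkf_update(X, H, y, R, rng):
    """X: (n_state, n_ens) forecast ensemble; returns analysis ensemble."""
    n_obs, n_ens = H.shape[0], X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)          # ensemble anomalies
    P = A @ A.T / (n_ens - 1)                      # sample covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    eps = rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens).T
    return X + K @ (y[:, None] + eps - H @ X)

# One scalar flux-scaling factor observed directly: the analysis mean
# moves toward the observation and the ensemble spread contracts.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(1, 200))            # prior ensemble
Xa = enkf_update(X, np.array([[1.0]]), np.array([2.0]), np.array([[0.1]]), rng)
```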

With a series of radar images, a human (subjectively) or a computer (objectively) can process this information to estimate where the rain will move and be located within the next few minutes or even hours, i.e. produce a short forecast, also called a "nowcast". This applies to some extent also to other observations, such as satellite data (cloud propagation). For most quantities (such as temperature and wind), however, it is significantly harder to make such a nowcast, since these are influenced by many other factors and do not develop linearly. Therefore, forecast models that solve physical and dynamical equations are used to estimate the future weather for the coming hours and days.

A prerequisite for generating a high-quality forecast is to capture the initial weather conditions as well as possible. This is done using observations, which are introduced into the forecast model through different techniques, with the model creating its own analysis as the initial step. Problems remain, since forecast models are often affected by physical disagreements when the dynamic conditions are not in balance. This results in a spin-up effect, during which the meteorological quantities are not yet in balance with each other and the resulting weather conditions are not always reliable during the first hours. Hence, a lot of research is devoted to reducing this spin-up effect and to the use of nowcast models, in order to deliver the best model results for the first few hours of the forecast period.

In this dissertation, the research work has been to improve the meteorological analysis, algorithms, and functionality of the Local Analysis and Prediction System (LAPS) model. Different kinds of observations were used and their interdependencies studied, in order to combine and merge information from various instruments. The primary focus has been on improving the estimation of precipitation accumulation and of meteorological quantities that affect wind energy. The LAPS developments have been used for several end-users and nowcasting applications, and experimentally as initial conditions for forecast modelling. The studies have concentrated on Finland and nearby sea areas, with the datasets available for this domain.

By combining surface-station measurements, radar, and lightning information, one can improve precipitation-amount estimates. The use of lightning data further improves the estimates and provides additional data outside radar coverage, which can be very useful, for example, over sea areas. In addition, the improved LAPS analyses (cloud-related quantities) and a newly developed model (LOWICE), which calculates wintertime electricity production (accounting, e.g., for icing, which reduces efficiency), have shown good results.
