Browsing by Title

  • Pöyhönen, Julia Rosanna Hellin (2013)
    Charcot-Marie-Tooth (CMT) neuropathy is a phenotypically and genetically very heterogeneous disease. It can be inherited as an autosomal recessive, autosomal dominant or X-linked trait. CMT is characterized by distal muscle weakness, atrophy and deformity of the feet, as well as clumsiness of gait. The age of onset varies, and the symptoms can differ even among members of a single family. So far more than 40 genes have been identified for CMT, and the list is estimated to grow by 30-50 genes. Whole exome sequencing (WES) is a next-generation sequencing technique that targets the protein-coding regions of the genome. Through WES analysis it is possible to search for disease-causing mutations with any inheritance pattern. Patients suffering from CMT are good candidates for WES analysis because of the genetic heterogeneity of their disease. WES can be used for diagnosing Mendelian disorders with atypical symptoms, as well as diseases such as CMT that are difficult to confirm using clinical criteria alone and that require costly evaluation. In this Master's thesis, new disease-causing mutations for early-onset neuropathies are identified by WES. The aims of the study are to use WES for the molecular diagnosis of four patients suffering from early-onset axonal neuropathies, to perform functional analysis of possible causative variants, and to improve and develop the process of analyzing variants from WES data, especially the analysis steps for insertion and deletion variants. The search for causative variants among insertions and deletions has often been left out of WES analysis because no systematic analysis technique has been available. As a result of the WES data analysis, a new candidate disease gene, tripartite motif containing 2 (TRIM2), was identified.
A missense mutation c.761T>A (p.E254V) and a deletion c.1779delA (p.K594Rfs7X) were found in patient 2, who suffers from severe CMT type 2. The carrier frequency was analysed to see whether the variants are present in the general population or not. The functional analysis of TRIM2 was started by preparing constructs carrying the missense mutation and the deletion and by setting up conditions for western blotting.
  • Toukola, Peppi (2021)
    In this thesis the suitability of Nuclear Magnetic Resonance (NMR) spectroscopy for the identification of rubbers in museum collections is discussed through a literature review and experimental work in which samples from the rubber collection of Tampere Museums were analysed with different NMR techniques. The literature part of this thesis focuses on recent (2011-2020) scientific publications on analytical instrumental techniques used in the identification of cultural heritage plastics. Vibrational spectroscopy methods utilizing hand-held or portable devices have been the most prominent methods used in the characterization of historical plastic materials. Bench-top devices and analytical techniques requiring sampling were used to acquire more detailed analysis results. However, NMR spectroscopy was not used as the main analysis technique in the reviewed publications. In the experimental part altogether 21 rubber object samples and 8 reference samples were identified using 1D and 2D NMR techniques in solution state. Three samples were additionally analysed with solid-state High Resolution Magic Angle Spinning (HRMAS) NMR spectroscopy. The chemical structures of the samples were confirmed with these methods. To further explore fast and more automated identification of the rubber samples, a statistical classification model utilizing the acquired solution-state 1H NMR data was developed. Three rubber types were chosen for the analysis. The model was created using analysis data from the museum object samples and validated using the reference sample data. An identification rate of 100% was achieved.
  • Varvarà, Giulia (2022)
    Species factories are defined as times and places in the fossil record where an exceptionally large number of new species occurs. While several tailored solutions for the mammalian record have been proposed, how to identify species factories computationally in a standardized way is still an open question. To quantify what is exceptional, we first need to quantify what is regular. One of the main challenges in this identification process is to account for sampling unevenness, which depends on several methodological decisions, including the scale of the analysis (aggregation radius). In this thesis we used Capture-Mark-Recapture (CMR) methods with spatial aggregation guided by network modelling to estimate the sampling probabilities for the species in the NOW database of mammalian fossil occurrences. Since the mammalian record is sparse and most localities include only a few species, we coupled CMR with tailored spatial aggregation approaches to estimate the sampling probabilities. We then used these sampling probabilities to quantify background speciation rates and assess what rates are abnormal. We represented aggregated fossil data as a bipartite network and used community detection to evaluate how the choice of an aggregation radius impacts the modular structure. After aggregating the data according to the radius chosen using network analysis, we estimated sampling probabilities using CMR. These probabilities allow adjustment for sampling unevenness, so that differences in findings can be compared across locations without being attributable to differences in sampling. We identified as species factories the locations whose origination rate per time unit, after adjustment, was in the highest 5%.
Once the species factories had been identified, we looked for paleoecological patterns in these places that may be lacking elsewhere, finding that species factories present a lower number of findings and of different species among findings, but a higher ratio of different species to total findings than the rest of the locations. This would indicate that, even if species factories might accommodate fewer species, they present a higher diversity. To make sure these results were not due to chance alone, we performed the same analysis on 100 randomized experiments obtained using a modified version of the Curveball Algorithm and compared the values obtained from the original dataset with those obtained from the randomized ones. This comparison showed that species factories tend to have more extreme values than those obtained through randomization, which would indicate that species factories present specific paleoecological patterns that are not present in other locations.
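The final selection step described in this abstract (adjust for sampling, then keep the top 5%) can be sketched in a few lines. This is only an illustrative sketch and not the thesis's CMR pipeline: dividing an observed origination count by an estimated sampling probability is an assumed, simplified adjustment, and the function names, location ids and numbers are all hypothetical.

```python
def adjusted_origination(observed, sampling_prob):
    """Scale observed origination counts by the estimated sampling probability.

    Assumption: a simple inverse-probability correction stands in for the
    CMR-based adjustment described in the abstract.
    """
    return {loc: observed[loc] / sampling_prob[loc] for loc in observed}


def flag_species_factories(adjusted, top_fraction=0.05):
    """Return the locations whose adjusted rate falls in the top fraction."""
    n_top = max(1, round(len(adjusted) * top_fraction))
    ranked = sorted(adjusted, key=adjusted.get, reverse=True)
    return set(ranked[:n_top])


# Hypothetical data: 20 locations, one of them poorly sampled but productive.
observed = {f"loc{i}": 2 for i in range(20)}
sampling_prob = {f"loc{i}": 0.5 for i in range(20)}
observed["loc7"] = 3
sampling_prob["loc7"] = 0.1

adjusted = adjusted_origination(observed, sampling_prob)
factories = flag_species_factories(adjusted)
print(factories)
```

With 20 locations and a 5% cutoff, exactly one location is flagged; the poorly sampled location ranks first only after the adjustment, which is the point of correcting for sampling unevenness before ranking.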
  • Jonkka, Susanna (2016)
    Ovarian cancer is known as "the silent killer" because it is generally diagnosed at a late stage, and is therefore responsible for more deaths than any other gynecological malignancy. Although the genetic background of high-grade serous carcinoma (HGSC) is highly heterogeneous, almost all HGSCs harbor TP53 mutations, and mutations in BRCA1 and BRCA2 are also frequent. Less is known about the chromosomal rearrangements that function as drivers of HGSC. The aim of this thesis project was to identify and validate novel and recurrent chromosomal rearrangements that may have a functional relevance in the tumorigenesis of high-grade serous carcinoma. We searched for recurrent rearrangements detected by a computational algorithm (BreakDancer) in 44 HGSC whole-genome sequences that were obtained from The Cancer Genome Atlas database. We identified five samples to harbor a novel region that was affected by recurrent deletions of similar size. This region was located upstream of the gene TUBB4A on chromosome 19. We used PCR to screen for rearrangements within this region in 11 Finnish patient tumor tissues. None of these samples displayed rearrangements within this region. Further studies with larger sample sizes are required to validate whether this region indeed is recurrently affected by chromosomal abnormalities. Identifying chromosomal rearrangements of functional relevance will pave the way towards the use of personalized medicine.
  • Zhang, Teng (2015)
    Inflorescence architecture refers to the spatio-temporal arrangement of flowers on the reproductive branches. Flowering plants have evolved great diversity in such branching systems. Among these, the showy capitulum-type inflorescence of the large family Compositae (Asteraceae) is regarded as a prerequisite for its worldwide spread. Unlike the simple raceme and cyme, the capitulum compresses hundreds of individual florets on its receptacle, but overall resembles a single, solitary flower. The ontogeny of the capitulum also resembles that of a single flower with regard to meristem determinacy, floral sequence and histological configuration. Recent molecular studies have revealed that a plant-specific transcription factor, LEAFY (LFY), is required for both floral initiation and floral patterning, the two essential steps in forming an inflorescence. The thesis uses Gerbera hybrida as a model to elucidate the functions of the LFY ortholog during inflorescence and flower development in a capitulum background. In addition to the conserved functions in regulating floral meristem identity and floral patterning, three specific functions were revealed by transgenic Gerbera with down-regulated expression of GhLFY. Firstly, GhLFY is involved in regulating the floral initiation of marginal ray florets. Down-regulation of GhLFY caused the marginal ray florets to revert to a branching pattern like that seen in the capitulum of Calyceraceae, the close relatives of Asteraceae. Secondly, the determinacy of the inflorescence meristem (IM) is disrupted when GhLFY loses its function, suggesting that GhLFY may function at both the flower and inflorescence levels. Thirdly, different flower types show specific responses to GhLFY down-regulation in floral patterning, indicating that there may be a genetic gradient among different flower types.
At the protein level, LFY functions are specified by the formation of versatile protein complexes with its transcriptional co-regulators. In Gerbera, GhLFY proteins tend to form homodimers, and they were also able to interact with a conserved transcriptional co-regulator, the UNUSUAL FLORAL ORGANS (UFO) ortholog GhUFO. Using forward yeast two-hybrid (Y2H) library screening, six additional proteins were identified that interact with GhLFY, including several novel potential co-regulators of LFY that have not yet been identified in other species. Additionally, a bimolecular fluorescence complementation (BiFC) assay was optimized to verify the GhLFY self-interaction in planta.
  • Ristolainen, Heikki; Kilpivaara, Outi; Kamper, Peter; Taskinen, Minna; Saarinen, Silva; Leppä, Sirpa; d'Amore, Francesco; Aaltonen, Lauri A. (2015)
    In our study we examined a family of Middle Eastern origin in which three of the five children were diagnosed with classical Hodgkin lymphoma (cHL) at a young age. Hereditary predisposition to cHL is poorly understood, and only one gene alteration possibly predisposing to the disease has been reported previously. To identify such alterations, we exome sequenced DNA isolated from blood samples of the three affected children and selected the variants shared by all three. From these shared variants we filtered out the genetic variants present in our own control sets and in several public databases, and assessed the deleteriousness of the remaining variants with two computational prioritization algorithms. In this way we ranked the remaining 35 shared variants by computationally predicted deleteriousness. The most significant of the shared variants was a homozygous 57-base deletion in the ACAN gene, c.2836_2892del, which has not previously been associated with the cHL phenotype.
  • Siskovs, Klims (2021)
    STK11/LKB1 is a tumor suppressor gene that is mutated in 18% of lung adenocarcinomas. The tumor suppressor liver kinase B1 (LKB1) is known to activate adenosine monophosphate-activated protein kinase (AMPK) and 12 AMPK-related kinases (ARKs) by phosphorylating a conserved threonine residue in their T-loop region. A number of studies have focused on the influence of LKB1-AMPK signaling on cancer cell proliferation. However, there has been no systematic study identifying the LKB1 kinase substrates that are critical for suppressing lung cancer cell growth. In this project, constitutively active mutants of AMPKα1, AMPKα2, MARK1, MARK2, MARK3, MARK4, NUAK1, NUAK2, SIK1, SIK2 and SIK3 were sequentially overexpressed in the LKB1-deficient lung adenocarcinoma cell line A549. The overexpression status was confirmed at both the genetic and protein levels by qPCR and Western blotting, respectively. In vitro growth assays demonstrated a growth rate reduced by up to 33% in A549 cells overexpressing AMPKα1, AMPKα2 and NUAK1. Furthermore, siRNA knockdown of the selected substrates in LKB1-overexpressing A549 cells significantly rescued the cell growth defect. These findings suggest that the AMPKα1, AMPKα2 and NUAK1 kinases are critical for the LKB1-mediated cell growth defect in lung adenocarcinoma.
  • Dovydas, Kičiatovas (2021)
    Cancer cells accumulate somatic mutations in their DNA throughout their lifetime. Advances in cancer prevention and treatment methods call for a deeper understanding of carcinogenesis at the level of the genetic sequence. Mutational signatures present a novel and promising way to capture somatic mutation patterns and define their causes, making it possible to summarize the mutational landscape of cancer as a combination of distinct mutagenic processes acting with different strengths. While the majority of previous studies assume an additive relationship between the mutational processes, this Master’s thesis provides tentative evidence that contemporary methods with additivity constraints, e.g. non-negative matrix factorization (NMF), are not sufficient to comprehensively explain the observed mutations in cancer genomes, and that the observed deviations are not random. To quantify these residues, two metrics are defined – additive and multiplicative residues – and hierarchical clustering algorithms are used to identify cancer subsets with similar residual profiles. It is shown that in certain cancer sample subsets there is a systematic overestimation of the mutational burden that can only be resolved by a multiplicatively acting process, as well as a non-random underestimation that requires additional mutational signatures. Here an extension to the additive mutational signature model is proposed – a probabilistic model that incorporates a selectively active modulatory mutational process that is able to act in a multiplicative manner together with the known mutational signatures, reducing systematic variability.
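To make the additive model and the two residual types mentioned above concrete, here is a minimal sketch. The abstract does not give the exact definitions, so the formulas are assumptions: the additive residue is taken as observed minus expected, and the multiplicative residue as observed divided by expected, per mutation channel. The signatures, exposures and counts are made-up toy values.

```python
def reconstruct(signatures, exposures):
    """Expected counts under the additive model: sum_k exposure_k * signature_k.

    signatures: list of K signatures, each a list of per-channel probabilities.
    exposures:  list of K non-negative weights for one sample.
    """
    n_channels = len(signatures[0])
    return [
        sum(e * sig[c] for sig, e in zip(signatures, exposures))
        for c in range(n_channels)
    ]


def residuals(observed, expected):
    """Assumed definitions: additive = obs - exp, multiplicative = obs / exp."""
    additive = [o - x for o, x in zip(observed, expected)]
    multiplicative = [
        o / x if x > 0 else float("nan") for o, x in zip(observed, expected)
    ]
    return additive, multiplicative


# Toy example with two signatures over three mutation channels.
signatures = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
exposures = [10, 20]
observed = [5, 30, 20]

expected = reconstruct(signatures, exposures)
additive, multiplicative = residuals(observed, expected)
print(expected, additive, multiplicative)
```

In this toy sample the last two channels are observed at twice the expected count. No change to the first signature's exposure can account for that without distorting the first channel, whereas a single multiplicative factor acting on the second signature would, which is the kind of pattern the abstract attributes to a multiplicatively acting process.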
  • Domènech Moreno, Eva (2017)
    In this Master’s project, I studied NUAK2, a mammalian serine-threonine kinase that is implicated in human disease but whose molecular functions and interacting proteins are as yet poorly characterized. The goal was to identify new interacting proteins in order to increase understanding of its molecular functions and potentially link them to human physiology and disease. Recent work from the host lab shows that NUAK2 loss in cultured primary cells mimics loss of the tumor suppressor LKB1, which also acts upstream of NUAK2, together suggesting that NUAK2 could be involved in tumor suppression. Currently, only two proteins interacting with NUAK2 have been identified: NUAK2 is targeted to actin stress fibers by the myosin phosphatase Rho-interacting protein (MRIP), and it is involved in regulating cell contractility by indirectly affecting the phosphorylation cycle of the myosin light chain through inactivation of the myosin phosphatase target subunit 1 (MYPT1). In this project, I used a novel protein-protein interaction screening method based on proximity-dependent biotin labeling to identify new proteins interacting with NUAK2 in human embryonic kidney cells (HEK 293). The method is based on fusing a promiscuous E. coli biotin ligase, BirA*(R118G), to the protein under investigation. The BirA*(R118G) ligase biotinylates all proteins in close proximity to the fusion protein, creating a history of protein-protein associations over time. Afterwards, the biotinylated proteins can be isolated by affinity purification and identified by mass spectrometry. The screen identified the previously known interaction partners of NUAK2, indicating that it was technically successful. In addition, I identified a total of 108 novel potential protein interaction partners for NUAK2. One of the top hits was Cytospin-A, a cross-linking protein between microtubules and the actin cytoskeleton, supporting a role for NUAK2 as a regulator of the cytoskeleton.
Supporting the validity of our finding, Cytospin-A depletion in mammalian cells causes defective actin-cytoskeleton reorganization, a phenotype very similar to that seen with NUAK2 depletion. In future studies, I will continue to investigate the specific roles of NUAK2 and Cytospin-A, aiming for detailed information on the function of NUAK2 in the regulation of microtubules and the actin cytoskeleton. Validation of some of the other identified interactions is expected to provide novel insights into the biology of NUAK2 and its role in LKB1 tumor suppressor functions.
  • Huusari, Anna (2018)
    Plants control the exchange of gases through the stomatal pores. Stomata are formed by guard cells, and the closure of stomata is regulated via a complex signaling network in response to various biotic and abiotic stimuli, such as pathogens, elevated levels of CO2 and darkness. The leucine-rich repeat receptor-like kinase (LRR-RLK) GUARD CELL HYDROGEN PEROXIDE-RESISTANT1 (GHR1) is part of the network regulating stomatal closure. GHR1 is an inactive pseudokinase that can activate SLOW ANION CHANNEL-ASSOCIATED1 (SLAC1), an anion channel that is crucial for stomatal closure, via interacting proteins. The exact role of GHR1 is still partly unknown; however, it has been suggested that GHR1 could function as a scaffold or as an allosteric regulator of additional components required for stomatal closure. The aim of this study was to identify novel interactors of GHR1. First, stable plant lines expressing the fusion proteins GHR1-YFP and GHR1W799*-YFP, and plain YFP as a negative control, were generated, and the fusion protein expression levels and subcellular localization were studied in these lines. Next, the plant lines were used to purify GHR1-interacting proteins by co-immunoprecipitation, and the proteins were identified by mass spectrometry. Unlikely GHR1 interactor candidates were then filtered out of the mass spectrometry data. The subcellular localization and protein expression of the interacting proteins were studied using online databases. The literature on the GHR1-interacting proteins was reviewed in order to find possible connections with GHR1 and stomatal closure. In this study 38 GHR1 interactors were identified. The literature search revealed that many of the identified interactors have a known role in stomatal movements. These included proteins such as PLASMA MEMBRANE INTRINSIC PROTEIN2-1 (PIP2-1) and BETA CARBONIC ANHYDRASE 4 (BCA4), which are known to have a role in stomatal closure.
Future work includes confirming the interactions with independent methods and studying the molecular mechanisms related to stomatal movements. The GHR1 interactome identified here for the first time reveals novel parts of the network regulating stomatal movements and thus increases our understanding of molecular mechanisms behind stomatal functions.
  • Doagu, Fatma (2013)
    Intellectual disability (ID) is a clinically diverse and genetically heterogeneous disorder characterized by central nervous system defects of varying severity, resulting in substantial impairment of intellectual and adaptive functioning as expressed in conceptual (IQ<70), social and practical adaptive skills and diagnosed before 18 years of age. The condition is referred to as non-syndromic when ID is the only clinical feature and syndromic when ID is accompanied by specific other features, as in Down syndrome. Intellectual disability is one of the largest unsolved problems of health care, with a prevalence of 2-3% in the population. There is a 30-40% excess of male over female patients in ID, which points to an over-representation of X-chromosomal defects causing ID. In this study, exome sequencing of the X chromosome was applied in order to identify genes and their mutations in two Finnish families with intellectual disability of unknown cause. The mutations were identified using an Agilent SureSelect array that covers almost 93% of the coding region of the chromosome. Exome sequencing yielded 11 variants in total. Segregation of these variants was studied using PCR, the ExoSAP-IT purification protocol and the BigDye® Terminator v3.1 Cycle Sequencing Kit. Eventually, two novel mutations were identified, one in each family. Both mutations reside in genes that have previously been shown to cause X-linked intellectual disability. Both mutations were absent in over 120 control DNA samples. In one family with three affected males, a novel splice mutation was identified in discs large homolog 3 (DLG3), which encodes synapse-associated protein 102 (SAP102). The mutation is located at the splice site in intron 1 (500+1 G>C), and its effect on protein function needs to be analyzed at the RNA level using cDNA sequencing. The clinical phenotype of the three affected brothers is mild to moderate intellectual disability.
In the other family, with three severely affected male patients, a novel mutation was identified in exon 12 of glutamate receptor, ionotropic, AMPA 3 (GRIA3), resulting in the amino acid glycine (GGG) changing to arginine (CGG) at codon 630 (G630R). GRIA3 belongs to the AMPA receptors, which are implicated in the regulation of several biological processes. Our findings illustrate the power of exome sequencing in the diagnosis of rare, genetically heterogeneous disorders like intellectual disability. The results obtained will help in assessing the prognosis of the disease, in estimating the risk of the disorder in other family members, and in facilitating the development of future therapies for these devastating disorders. The results also further confirm the role of DLG3 and GRIA3 in human cognitive development.
  • Adunola, Paul Motunrayo (2021)
    Lipoxygenase enzymes, which contribute significantly to storage protein in legume seeds, have been reported to cause the emission of volatile compounds associated with the generation of off-flavours. This is an important factor limiting the acceptance of faba bean (Vicia faba) in foods. This study aimed to use bioinformatic tools to identify seed-borne lipoxygenase (LOX) genes and to design a biological tool using molecular techniques to find sequence changes in faba bean lines. LOX gene mining by Exonerate sequence comparison on the whole genome sequence of faba bean was used to identify six LOX genes containing Polycystin-1, Lipoxygenase, Alpha-Toxin (PLAT) and/or LH2 LOX domains. Their sequence properties, evolutionary relationships, important conserved LOX motifs and subcellular location were analysed. The proteins encoded by the identified LOX genes contained 272–853 amino acids (aa). The molecular weight ranged from 23.67 kDa in Gene 6 to 96.45 kDa in Gene 1. All the proteins had isoelectric points in the acidic range except Genes 6 and 7, which were alkaline. Only one gene had both conserved LOX domains, with an aa sequence length similar to that found in soybean and pea LOX genes and isoelectric properties similar to soybean LOX3. Phylogenetic analysis indicated that the genes clustered into 9S LOX and 13S LOX types alongside other seed LOX genes from some legumes. Five motifs were found, and sequence analysis showed that three genes (Genes 1, 2 and 3) contained the 38-aa residue motif that includes five histidine residues [His-(X)4-His-(X)4-His-(X)17-His-(X)8-His]. The subcellular localization of the lipoxygenase proteins was predicted to be primarily the cytoplasm and chloroplast. Primers covering ~1.2 kb were designed based on the conserved region of the nucleotide sequences of Genes 1, 2 and 3. Gel electrophoresis showed PCR amplification of the seed LOX gene at the expected region for twelve faba bean lines.
Phylogenetic analysis showed evolutionary divergence among faba bean lines for sequenced and amplified region of their respective seed LOX alleles.
  • Nyhamar, Ellisiv (2022)
    Staphylococcus aureus infections are prominent worldwide, and with the rapid increase in antimicrobial-resistant variants such as methicillin-resistant S. aureus (MRSA), the need for new treatment alternatives is urgent (Monaco et al., 2017). Lytic bacteriophages are continually evolving new methods for the destruction of bacterial cells while avoiding their defence mechanisms. Screening hypothetical proteins of unknown function (HPUFs) from bacteriophages for toxic activity against bacteria may provide new and potentially life-saving approaches to combat bacterial infections (Liu et al., 2004, Singh et al., 2019). The Stab21 phage of Staphylococcus is a recently described lytic phage with over 85% of its open reading frames annotated as HPUFs (Oduor et al., 2019). The successful identification of potentially toxic gene products could facilitate the discovery of novel bacterial targets for the development of new antimicrobials. It could also provide treatment options for infections caused by multi-drug resistant S. aureus, for which no effective drugs are currently available. To reduce unnecessary screening of phage particle-associated yet poorly annotated proteins, the total proteins of the phage particle were previously identified by LC-MS. Similar studies have previously been performed with Yersinia phage fR1-RT and Klebsiella phage fHe-Kpn01, where a handful of toxic proteins were discovered (Mohanraj et al., 2019, Spruit et al., 2020). To accelerate the screening process, a next-generation sequencing (NGS) high-throughput screening method was further developed by Kasurinen et al. (2021). In this study, 96 true HPUFs were selected and screened for their bactericidal activity in E. coli using the NGS-based approach. Fourteen potentially bacteriotoxic Stab21 gene products were identified through toxicity screening in E. coli.
Of these, three had a particularly low ratio of isolated plasmid after transformation while having a significant number of reads over each joint sequence, indicating their potentially high toxicity. The three most promising candidates were the gene products of g008, g081c and g175 of the Stab21 bacteriophage.
  • Andsten, Rose-Marie (2020)
    Bacteria are a great source of natural products with complex chemical structures and diverse biological activities. Many have therapeutic properties, and half of the drugs in clinical use today are derived directly or indirectly from natural products. The pharmaceutical industry stopped investing in drug development from natural resources because of perceived limitations in chemical space, the problem of rediscovering known compounds, and difficulties in obtaining sufficient quantities of natural products for clinical trials. There is now renewed interest in natural products as drug leads, driven by technological advances in genome sequencing and analytical chemistry. Cyanobacteria produce a variety of natural products with therapeutic potential. Muscoride A is an unusual peptide alkaloid with reported antimicrobial activity, produced by a terrestrial freshwater cyanobacterium. The aim of this study was to characterize the biosynthetic origin and biological activity of muscoride A. I identified the 12.7 kb muscoride (mus) biosynthetic gene cluster from a draft genome of Nostoc sp. PCC 7906 using bioinformatics analysis. The mus biosynthetic gene cluster encoded enzymes for the heterocyclization, oxidation and prenylation of a precursor protein. Comparative genomics identified a mus biosynthetic gene cluster encoding a novel muscoride in the unpublished draft genome of Nostoc sp. UHCC 0398. This novel muscoride, muscoride B, was detected from Nostoc sp. UHCC 0398 based on this analysis. Muscoride B was purified using solid phase extraction and high-performance liquid chromatography, and the chemical structure was verified by combining nuclear magnetic resonance and mass spectrometry data. Furthermore, the function and evolutionary history of the muscoride prenyltransferases were studied. A significant finding was that the biosynthetic pathway encodes two regiospecific prenyltransferases, catalyzing the C- and N-terminal prenylation of muscoride.
An antimicrobial activity screening showed that muscoride B had antimicrobial activity against Bacillus cereus. Here I report the discovery of the muscoride biosynthetic pathway and the discovery of a novel antimicrobial peptide from cyanobacteria through genome mining. The results show that the variant is a novel muscoride, a linear bis-prenylated polyoxazole pentapeptide with antimicrobial activity.
  • Rautiainen, Mikko (2016)
    The genomes of all animals, plants and fungi are organized into chromosomes, which contain a sequence of the four nucleotides A, T, C and G. Chromosomes are further arranged into homologous groups, where two or more chromosomes are almost exact copies of each other. Species whose homologous groups contain pairs of chromosomes, such as humans, are called diploid. Species with more than two chromosomes in a homologous group are called polyploid. DNA sequencing technologies do not read an entire chromosome from end to end. Instead, the results of DNA sequencing are small sequences called reads or fragments. Due to the difficulty of assembling the full genome from reads, a reference genome is not always available for a species. For this reason, reference-free algorithms, which do not use a reference genome, are useful for poorly understood genomes. A common variation between the chromosomes in a homologous group is the single nucleotide polymorphism (SNP), where the sequences differ by exactly one nucleotide at a location. Genomes are sometimes represented as a consensus sequence and a list of SNPs, without information about which variants of a SNP belong to which chromosome. This discards useful information about the genome. Identification of variant compositions aims to correct this. A variant composition is an assignment of the variants in a SNP to the chromosomes. Identification of variant compositions is closely related to haplotype assembly, which aims to solve the sequences of an organism's chromosomes, and to variant detection, which aims to solve the sequences of a population of bacterial strains and their frequencies in the population. This thesis extends an existing exact algorithm for haplotype assembly of diploid species (Patterson et al., 2014) to the reference-free, polyploid case. Since haplotype assembly is NP-hard, the algorithm's time complexity is exponential in the maximum coverage of the input.
Coverage means the number of reads which cover a position in the genome. Lowering the coverage of the input is therefore necessary. Since the algorithm does not use a reference genome, the reads must be ordered in some other way. Ordering reads is an NP-hard problem, and the technique of matrix banding (Junttila, PhD thesis, 2011) is used to approximately order the reads to lower coverage. Some heuristics are also presented for merging reads. Experiments with simulated data show that the algorithm's accuracy is promising. The source code of the implementation and scripts for running the experiments are available online at https://github.com/maickrau/haplotyper.
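Because the running time is exponential in the maximum coverage, it helps to see how that quantity is computed. The sketch below is not taken from the thesis or its repository: it assumes reads are given as half-open `(start, end)` position intervals, and the function names are hypothetical.

```python
def coverage_at(reads, position):
    """Number of reads covering a position; reads are half-open (start, end)."""
    return sum(1 for start, end in reads if start <= position < end)


def max_coverage(reads):
    """Maximum coverage over all positions, via a sweep over interval endpoints."""
    events = []
    for start, end in reads:
        events.append((start, 1))   # a read begins covering at `start`
        events.append((end, -1))    # and stops covering at `end`
    # Sorting tuples puts -1 before +1 at equal positions, so a read ending
    # at x is closed before a read starting at x is opened (half-open ends).
    events.sort()
    best = current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


# Hypothetical reads over a short region.
reads = [(0, 5), (2, 8), (4, 6), (10, 12)]
print(coverage_at(reads, 4), max_coverage(reads))
```

Here position 4 is covered by three reads, so the maximum coverage is 3. Removing or merging one of those three reads would lower it to 2, which is why the abstract treats coverage reduction as a necessary step.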
  • Roininen, Aino Elina Sylvia (2018)
    Raspberry is prone to virus infections, but the diversity and occurrence of different raspberry viruses in Finland are still largely unknown. The purpose of this thesis work was to reveal which viruses are troubling raspberry varieties that are part of Finnish raspberry genetic resources and have been maintained in vivo in the field. The study also compares siRNA diagnostics to the traditional PCR method in the detection of raspberry viruses. PCR, cloning, and traditional Sanger sequencing are used to get more detailed information on the virus strains and to scratch the surface of the phylogenetic diversity of the raspberry viruses that were detected in this study. siRNA detection was accurate and effective with Black raspberry necrosis virus (BRNV), Raspberry bushy dwarf virus (RBDV) and Rubus yellow net virus (RYNV), but the VirusDetect program could not find Raspberry vein chlorosis virus (RVCV), which was positive in Velvet analysis and molecular diagnostics. Sequencing and phylogenetic analysis revealed an RYNV strain that was similar to a Canadian isolate (KF241951.1). Many new Badnavirus-like sequences were detected, but possible integration of Badnaviruses into the raspberry genome was not excluded in this study. BRNV isolates were closely related to previously detected Finnish BRNV isolates. RVCV isolates grouped into three clades, where RVCV from sample 21 was the most similar to a formerly sequenced Scottish isolate (FN812699.2). Multiple viral infections were detected in one sample amongst these raspberry varieties, which may indicate different kinds of virus-virus interactions. The most important finding of this study was that RYNV and RVCV are present in Finland. Secondly, all the detected raspberry viruses are genetically diverse and multiple infections of detected virus species are common.
  • Keränen, Fanny (2021)
    This study aimed to identify conservation landscapes with potential to be mutually beneficial for people and African savanna elephants (Loxodonta africana) in South Africa through spatial conservation planning analyses that integrate ecological and socioeconomic data. The research questions were: (i) what are the most ecologically suitable areas for the reintroduction of elephants, and (ii) which of these areas provide the best opportunities for also sustaining socioeconomic development of local people. The first question was answered with an ecological model that predicts habitat suitability for elephants, developed by a combination of literature review, expert opinion, and GIS-based methods. The second question was answered by combining the ecological model with socioeconomic criteria in Zonation spatial conservation planning software. The results show that the central part of South Africa holds the most potential for elephant conservation, as it has the largest uniform area of high-quality habitat and also meets the socioeconomic criteria. The priority areas for the conservation of elephants were classified into top priority classes of 1%, 2%, 5%, 10% and 20%. The identified areas hold an unrealized opportunity in the wildlife and ecotourism sectors, and the reintroduction of elephants to those areas could provide the foundation for long-term economic activity of local communities, e.g. in the form of elephant-based ecotourism, while contributing to the conservation of elephants. Conserving just the top 5% priority areas would grow the South African protected area estate by approximately three million hectares and increase the current elephant range by approximately 75%. Ideally, the results of this study could be used to inform the on-going decision-making process on where to allocate resources for elephant conservation in South Africa.
  • Tripathi, Shivanshi (2020)
    Multiple Myeloma (MM) is the second most common hematologic malignancy. Despite the advancements in treatment approaches in the last decade, the prevalence of refractory disease leading to relapsed cases has been a major challenge. The wide-ranging genetic heterogeneity demonstrated by myeloma patients is a credible explanation for the diverse treatment responses observed in patients sharing the same treatment regimens. Accordingly, the study aims to identify predictive gene expression biomarkers that forecast response to the BCL2 inhibitor venetoclax and treatment outcome with the proteasome inhibitor bortezomib. In this study, samples from MM patients were classified as sensitive or resistant (1) based on ex vivo response to venetoclax treatment (Resistant n=21; Sensitive n=21), and (2) based on their bortezomib treatment outcome in clinical profiles (Resistant n=12; Sensitive n=15). Associations between gene expression and drug responses were studied using statistical and bioinformatic tools. As a result, we identified that significant (p-value <0.05) overexpression of 36 genes and downregulation of 38 genes appeared to confer resistance to venetoclax in MM patients. Additionally, the functional association of these genes with pathways was determined using a pathway enrichment tool. Furthermore, the study provided evidence that the cytogenetic alterations t(11;14) and t(4;14) are significantly (p-value <0.05) associated with differing venetoclax response in MM patients. These findings demonstrated that gene expression biomarkers and chromosomal translocations play a significant role in regulating venetoclax drug response in MM, which can be further utilized to personalize treatments for patients. The knowledge obtained from this work applies best in personalized medicine, whereby fitting treatments to an individual patient’s genomic landscape will enhance patient outcomes.
  • Leppiniemi, Samuel Albert (2023)
    High-grade serous carcinoma (HGSC) is a highly lethal cancer type characterised by high genomic instability and frequent copy number alterations. This study examines the relationships between genetic variants in tumour germline and gene expression levels to obtain a better understanding of gene regulation in HGSC. This would then improve knowledge of the cancer mechanisms in order to find, for example, potential new treatment targets and biomarkers. The aim is to find significantly associated variant-gene pairs in HGSC. Expression quantitative trait loci (eQTL) analysis is a well-suited method to explore these associations. eQTL analysis is also suitable for analysing variants located in non-coding genomic regions, which previous genome-wide association studies have indicated to contain many disease-linked germline variants. The current eQTL analysis methods are, however, not applicable to association testing between genes and variants in the context of HGSC because of the special genomic features of the cancer. Therefore, a new eQTL analysis approach, SegmentQTL, was developed for this study to accommodate the copy-number-driven nature of the disease. Careful input processing is of particular importance in eQTL analysis, as it has a notable effect on the number of significantly associated variant-gene pairs. It is also relevant to maintain adequate statistical power, which affects the reliability of the findings. In all, this study uses eQTL analysis to uncover variant-gene associations. This helps to improve knowledge of gene regulation mechanisms in HGSC in order to find new treatments. To apply the analysis to the HGSC data, a novel eQTL analysis method was developed. Additionally, appropriate input processing is important prior to running the analysis to ensure reliable results.
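    As a minimal sketch of the general idea behind testing one variant-gene pair in an eQTL analysis, the following Python code fits a least-squares line of expression against genotype dosage (0, 1 or 2 copies of the alternative allele). This is a generic textbook-style illustration, not SegmentQTL, whose details are not described in the abstract; the function name `simple_eqtl_fit` and the data layout are hypothetical, and no copy-number correction or significance test is included.

```python
import math
from typing import List, Tuple


def simple_eqtl_fit(dosages: List[int],
                    expression: List[float]) -> Tuple[float, float, float]:
    """Fit expression = intercept + slope * dosage for one variant-gene pair.

    Returns (slope, intercept, pearson_r). Samples are assumed to be in
    the same order in both lists.
    """
    n = len(dosages)
    if n != len(expression) or n < 2:
        raise ValueError("need two equally long lists with at least 2 samples")

    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, expression))

    if sxx == 0:
        raise ValueError("variant shows no variation across samples")

    slope = sxy / sxx                      # expression change per allele copy
    intercept = mean_y - slope * mean_x
    # Correlation is undefined for constant expression; report 0.0 then.
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r
```

    In a real analysis the slope would be accompanied by a significance test and multiple-testing correction across all tested pairs, which is where sample size and input filtering affect statistical power.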
  • Myllynen, Mikko (2013)
    Epithelial tissue is characterized by close cell-cell and cell-extracellular matrix (ECM) contacts as well as by apico-basal polarization. Integrity of these two features is important for the functionality of epithelium. Additionally, proteins regulating polarity and cell junctions have been linked to cell cycle and apoptosis control. Consequently, defects in many of the polarity proteins have been linked to oncogenic events, and loss of polarity is a hallmark of advanced cancers, but whether it is causal to tumorigenesis is not yet known. However, a large body of knowledge on apico-basal polarity regulation and its connection to homeostasis control is derived from studies in Drosophila. This is mainly due to the fact that efficient high-throughput organotypic three-dimensional (3D) culture methods enabling apico-basal polarization have not been available until the last decade. Large screens for epithelial polarity regulators have not been carried out in mammalian cells. Moreover, as cancer is the leading cause of death in developed countries and most cancers originate from epithelial tissues, knowledge of polarity regulation can be medically relevant. The oncogene MYC is overexpressed or amplified in a variety of human cancers. The tumorigenic function of MYC is mainly due to its ability to drive the cell cycle. We have previously shown that intact epithelial architecture is protective against the cell cycle deregulating activities of MYC in the 3D MCF10A mammary epithelial cell model and in vivo. This resistance can be overcome by inactivating LKB1, which is the human homologue of the polarity protein PAR4, implying a tumour suppressive role for epithelial architecture in mammalian cells. To identify regulators of epithelial architecture in mammalian cells, we have established a lentiviral shRNA library (human epithelial architecture library, hEAL) encompassing 219 constructs targeting 77 genes associated with polarity regulation in Drosophila. 
We have previously screened the shRNA constructs for downregulation and quantified their effects on acinar morphology in the MCF10A 3D model. In this Master's Thesis I have validated the downregulation and phenotypes observed for a subset of the shRNAs during primary screening of the constructs of the library. Additionally, the possible co-operation between downregulation of the polarity regulators and conditional activation of MYC was determined. Most dramatically, downregulation of the Wnt pathway gene DVL3 was shown to cause formation of enlarged multiacinar structures, which have increased proliferation. Additionally, downregulation of another Wnt pathway gene, GSK3β, resulted in acini with increased size and filled lumens. Thus these results propose a role for these genes in epithelial architecture regulation and tumour suppression in the model used, even though apico-basal polarization of the acini was intact and no synergy with MYC was observed. Interestingly, no role in epithelial architecture regulation was found for the Hippo pathway related genes FAT4 and MOBKL1A. Importantly, this study was able to validate the primary screen, showing the relevance of the pipeline. Lastly, the study characterized the synthetic lethality phenotype found in the primary screen, caused by downregulation of the GTPase RHOA and chronic MYC activation. The shRHOA acini exhibited perturbed α6-integrin localization. When combined with MYC activation, the percentage of apoptotic acini was significantly increased. Importantly, the results suggest the observed synthetic lethality to be specific for the 3D context and to be associated with MEK/ERK and ROCK pathways. Taken together, in this study I have validated the role of novel epithelial architecture regulators and candidate tumour suppressors in MCF10A cells, which may have medical relevance by helping to characterize tumorigenic processes. 
Furthermore, I characterized a novel 3D-specific RHOA-MYC synthetic lethal interaction, which may prove to have therapeutic significance in MYC-driven cancers in the future.