Browsing by Title

  • Pöntinen, Mikko (2018)
    One of the main factors currently limiting geophysical and geological studies of asteroids is the lack of visual and near-infrared (Vis-NIR) spectra. European Space Agency’s upcoming Euclid mission will observe up to 150,000 asteroids and gather a large amount of spectral data of them in the Vis-NIR wavelength range. Asteroids will appear as faint streaks in the images. In order to exploit the spectra, the asteroids have to first be found in the massive amounts of data to be obtained by Euclid. In this work we tested two methods for detecting asteroid streaks in simulated Euclid images. The first method is StreakDet, a software originally developed to detect streaks caused by space debris. We optimized the parameters of StreakDet, and developed a comprehensive analysis software that can visualize and give statistics of the StreakDet results. StreakDet was tested by feeding 4096×4136 pixel images to the software, which then returned the coordinates of the asteroids found. The second method is machine learning. We programmed a deep neural network, which was then trained to distinguish between asteroid images and non-asteroid images. Smaller images were used for this binary classification task, but we also developed a sliding window method for analyzing larger images with the neural network. After optimizing the program parameters, StreakDet was able to detect approximately 60% of asteroids with apparent magnitude V < 22.5. StreakDet worked better for long streaks, up to 125 pixels (corresponding to an asteroid with a sky motion of 80 "/h) while streaks shorter than 15 pixels (10 "/h) were typically not found. The neural network was able to classify the brightest (20 < V < 21) streaks with up to 98% accuracy when using very small images. When analyzing larger images, the sliding window algorithm produced heat maps as output, from which the asteroids could easily be spotted. 
The machine learning algorithm utilized was fairly simple, so even better results may be obtained with more advanced algorithms.
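The sliding window idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the window size, the stride and `toy_classifier` (a stand-in for the trained neural network) are assumptions chosen only for illustration.

```python
import numpy as np

def sliding_window_heatmap(image, classify, window=20, stride=10):
    """Score every window of a 2-D image and return the scores as a heat map."""
    height, width = image.shape
    rows = (height - window) // stride + 1
    cols = (width - window) // stride + 1
    heatmap = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            top, left = i * stride, j * stride
            patch = image[top:top + window, left:left + window]
            heatmap[i, j] = classify(patch)
    return heatmap

def toy_classifier(patch):
    # Hypothetical stand-in for the trained network: brightest pixel as the score.
    return float(patch.max())

# Synthetic 100x100 image with one diagonal streak of 20 bright pixels.
image = np.zeros((100, 100))
for k in range(40, 60):
    image[k, k] = 1.0

heatmap = sliding_window_heatmap(image, toy_classifier)
```

Windows that overlap the streak receive a high score, so the streak appears as a bright region in `heatmap`, while empty sky stays at zero.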
  • Mohanraj, Ushanandini (2016)
    The rapid emergence of antibiotic resistance among many pathogenic bacteria has created a profound need to discover new alternatives to antibiotics. Bacteriophages are viruses which infect bacteria and are able to produce special proteins involved in bacterial lysis. However, for many bacteriophage-encoded gene products, the function is not known, i.e., hypothetical proteins of unknown function (HPUFs). Screening these proteins likely identifies a rich source of leads that will help in the development of novel antibacterial compounds. The current study presents two phage genomics-based screening approaches to identify phage HPUFs with antibacterial activity. Both screening assays are based on inhibition of bacterial growth when a toxic gene is expression cloned into a plasmid vector. The first approach was a luxAB/luxCDE -based luminescence screening assay. The luxCDE genes encoding the luciferase substrate producing enzymes were integrated into an Escherichia coli strain genome as a transcriptional fusion. Also, a vector carrying the luxAB genes, encoding the luciferase enzyme, and a cloning site for the phage HPUF genes, was constructed. Ligation of a toxic gene into the vector would result in few or rare transformants after electroporation while ligation of a non-toxic gene would result in large number of transformants, and the difference in number of transformants will be reflected in the amount of bioluminescence after electroporation. The proof of concept of the approach was verified using the control genes g150 (a structural, thus a non-toxic gene of phage R1-RT) and regB (a known toxic gene of phage T4). The results demonstrated a significant difference in Relative Luminescence Units (RLU) between the g150 and regB electroporation mixtures. The second screening approach was an optimized plating assay producing a significant difference in the number of transformants after ligation of the toxic and non-toxic genes into a cloning vector. 
This assay was tested and optimized with several known control toxic and non-toxic genes. Using the plating assay approach, in the current study, ninety-four R1-RT HPUFs were screened and ten of them showed toxicity in E. coli. In future, the identified toxic HPUFs of R1-RT could be purified and characterized to identify their bacterial targets. Further, both of these screening assays can be used to screen among HPUFs of other phages, and this should allow the discovery of a wide variety of putative inhibitors for the control of current and emerging bacterial pathogens.
  • Varis, Vera (2020)
    Protein kinases are signaling molecules that regulate vital cellular and biological processes by phosphorylating cellular proteins. Kinases are linked to a variety of diseases such as cancer, immune deficiencies and degenerative diseases. This thesis work aimed to identify direct substrates for protein kinases in the CMGC family, which consists of the cyclin-dependent kinases (CDK), mitogen-activated protein kinases (MAPK), glycogen synthase kinase-3 (GSK3) and CDC-like kinases (CLK). CMGC kinases have been identified as cancer hubs in interactome studies, but large-scale identification of direct substrates has been difficult due to the lack of efficient methods. Here, we present a heavy-labeled 18O-ATP-based kinase assay combined with LC-MS/MS analysis for direct substrate identification. In the assay, HEK and HeLa cell lysates are treated with the pan-kinase inhibitor FSBA, which irreversibly blocks endogenous kinases. After the removal of FSBA, cell lysates are incubated with the kinase of interest and a heavy-labeled ATP, which contains the 18O isotope at the γ-phosphate position. The resulting phosphopeptides are enriched with Ti4+-IMAC before the LC-MS/MS analysis, which distinguishes the desired phosphorylation events based on a mass shift caused by the heavy 18O. With this pipeline of methods, we managed to identify and quantify direct substrates for 26 members of the CMGC kinase family. A total of 1345 substrates and 3841 interacting kinase-substrate pairs were identified in cytosolic cell lysates, of which 165 were annotated in the PhosphoSitePlus® database. To identify substrates for kinases with nuclear localization, ten kinases were tested with nuclear HEK cell lysate. We identified 194 kinase-substrate pairs, 141 of which were unique to the nuclear fraction and 27 annotated in the PhosphoSitePlus® database. Finally, kinases with exceptionally high numbers of novel substrates were subjected to gene ontology analysis.
We were able to link the gene ontology classifications of novel substrates to the biological processes regulated by the kinase of interest. These results indicate that the heavy-labeled 18O-ATP-based kinase assay linked to LC-MS/MS is a useful tool for large-scale direct kinase substrate identification.
  • Pöyhönen, Rosanna (2013)
    Charcot-Marie-Tooth (CMT) neuropathy is one of the most common forms of inherited peripheral neuropathies with the prevalence of one in 2500 individuals. CMT is phenotypically and genetically a very heterogeneous disease. It can be inherited as an autosomal recessive, dominant or X-linked trait. CMT is characterized by distal muscle weakness, atrophy and deformity of the feet as well as clumsiness of gait. The onset of CMT varies and also the symptoms of the disease can vary even among the members of a single family. So far more than 40 genes have been identified for CMT and the list is estimated to grow by 30-50 genes. Whole exome sequencing (WES) is a next generation sequencing technique, which targets the protein coding area of the genome. Through WES analysis it is possible to search for disease causing mutations with all kinds of inheritance patterns. Patients suffering from CMT are good candidates for WES analysis because of the genetic heterogeneity of their disease. WES can be used for diagnosing Mendelian disorders with atypical symptoms as well as diseases, which are difficult to confirm using clinical criteria alone and which require costly evaluation, e.g. CMT. In this master study new disease causing mutations for early-onset neuropathies are identified by whole exome sequencing. The aims of this study include using WES for the molecular diagnosis of four patients suffering from early-onset axonal neuropathies, the functional analysis of possible causative variants and improving and developing the process of analyzing variants from whole exome sequencing data, especially the analyzing steps of insertion and deletion variants. Finding causative variants among the insertion and deletion variants has previously been often left out from the WES analysis because of the lack of systematic analysis technique. As a result of the WES data analysis a new candidate disease gene, tripartite motif containing 2 (TRIM2) was identified. 
A missense mutation c.761T>A (p.E254V) and a deletion c.1779delA (p.K594Rfs7X) were found in patient 2, who suffers from severe CMT type 2. The carrier frequency was analysed to see whether the variants are present in the general population or not. The functional analysis of TRIM2 was started by preparing constructs carrying the missense mutation and the deletion and by setting up conditions for western blotting.
  • Pöyhönen, Julia Rosanna Hellin (2013)
    Charcot-Marie-Tooth (CMT) neuropathy is phenotypically and genetically a very heterogeneous disease. It can be inherited as an autosomal recessive, dominant or X-linked trait. CMT is characterized by distal muscle weakness, atrophy and deformity of the feet as well as clumsiness of gait. The onset of CMT varies and also the symptoms of the disease can vary even among the members of a single family. So far more than 40 genes have been identified for CMT and the list is estimated to grow by 30-50 genes. Whole exome sequencing (WES) is a new next generation sequencing technique, which targets the protein-coding area of the genome. Through WES analysis it is possible to search for disease causing mutations with all kinds of inheritance patterns. Patients suffering from CMT are good candidates for WES analysis because of the genetic heterogeneity of their disease. WES can be used for diagnosing Mendelian disorders with atypical symptoms as well as diseases, which are difficult to confirm using clinical criteria alone and which require costly evaluation, e.g. CMT. In this master study new disease causing mutations for early-onset neuropathies are identified by whole exome sequencing (WES). The aims of this study include using WES for the molecular diagnosis of four patients suffering from early-onset axonal neuropathies, the functional analysis of possible causative variants and improving and developing the process of analyzing variants from whole exome sequencing data, especially the analyzing steps of insertion and deletion variants. Finding causative variants among the insertion and deletion variants has previously been often left out from the WES analysis because of the lack of systematic analysis technique. As a result of the WES data analysis a new candidate disease gene, tripartite motif containing 2 (TRIM2) was identified. 
A missense mutation c.761T>A (p.E254V) and a deletion c.1779delA (p.K594Rfs7X) were found in patient 2, who suffers from severe CMT type 2. The carrier frequency was analysed to see whether the variants are present in the general population or not. The functional analysis of TRIM2 was started by preparing constructs carrying the missense mutation and the deletion and by setting up conditions for western blotting.
  • Toukola, Peppi (2021)
    In this thesis the suitability of Nuclear Magnetic Resonance (NMR) spectroscopy in the identification of rubbers in museum collections is discussed through a literature review and experimental work where samples from the rubber collection of Tampere Museums were analysed with different NMR techniques. The literature part of this thesis focuses on recent (2011-2020) scientific publications on analytical instrumental techniques used in the identification of cultural heritage plastics. Vibrational spectroscopy methods utilizing hand-held or portable devices have been the most prominent methods used in characterization of historical plastics materials. Bench-top devices and analytical techniques requiring sampling were used to acquire more detailed analysis results. However, NMR spectroscopy was not used as the main analysis technique in the reviewed publications. In the experimental part altogether 21 rubber object samples and 8 reference samples were identified using 1D and 2D NMR techniques in solution state. Three samples were additionally analysed with solid-state High Resolution Magic Angle Spinning (HRMAS) NMR spectroscopy. The chemical structures of the samples were confirmed with these methods. To further explore fast and more automated identification of the rubber samples a statistical classification model utilizing acquired solution-state 1H NMR data was developed. Three rubber types were chosen for the analysis. The model was created using analysis data from the museum object samples and validated using the reference sample data. Identification rate of 100 % was achieved.
  • Varvarà, Giulia (2022)
    Species factories are defined as times and places in the fossil record where and when an exceptionally large number of new species occurs. While several tailored solutions for the mammalian record have been proposed, how to identify species factories computationally in a standardized way is still an open question. To quantify what is exceptional, we first need to quantify what is regular. One of the main challenges in this identification process is to account for sampling unevenness, which depends on several methodological decisions, including the scale of the analysis (aggregation radius). In this thesis we used Capture-Mark-Recapture methods (CMR) with spatial aggregation guided by network modelling, to estimate the sampling probabilities for the species in the NOW database of mammalian fossil occurrences. Since the mammalian record is sparse and most localities include only a few species, we coupled CMR with tailored spatial aggregation approaches to estimate the sampling probabilities. We then used these sampling probabilities to quantify background speciation rates and assess what rates are abnormal. We represented aggregated fossil data as a bipartite network and used community detection to evaluate how the choice of an aggregation radius impacts the modular structure. After aggregating the data according to the radius chosen using network analysis, we estimated sampling probabilities using CMR. These probabilities allow the adjustment for sampling unevenness so that the difference in findings can be compared across locations and cannot be due to differences in sampling. We identified as species factories the locations with origination rate in the highest 5% after adjustment per time unit.
Once the species factories had been identified, we looked for paleoecological patterns in these places that may be lacking elsewhere, finding that species factories present a lower number of findings and of different species among findings, but a higher ratio between the amount of different species and of total findings than the rest of the locations. This would indicate that, even if species factories might accommodate fewer species, they present a higher diversity. To make sure these results were not only due to chance, we performed the same analysis on 100 randomized experiments obtained using a modified version of the Curveball Algorithm and compared the values obtained from the original dataset and the ones obtained from the randomized ones. This comparison showed us that species factories tend to have more extreme values than the ones obtained through randomization, which would indicate that species factories present specific paleoecological patterns that are not present in other locations.
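For readers unfamiliar with the randomization step, the following is a minimal sketch of the standard Curveball algorithm, not the modified version used in the thesis, whose details the abstract does not give. The occurrence data, the species names and the number of trades are hypothetical. Each locality is a set of species, and each "trade" reshuffles the species that only one of two randomly chosen localities holds, which keeps both the number of species per locality and the number of localities per species unchanged.

```python
import random
from collections import Counter

def curveball(rows, n_trades, seed=None):
    """Randomize a presence-absence matrix given as a list of sets (one per row)."""
    rng = random.Random(seed)
    rows = [set(r) for r in rows]
    for _ in range(n_trades):
        a, b = rng.sample(range(len(rows)), 2)
        shared = rows[a] & rows[b]
        only_a = rows[a] - shared
        only_b = rows[b] - shared
        # Pool the elements unique to either row and deal them back out,
        # keeping the original number of unique elements in each row.
        pool = sorted(only_a | only_b)  # sorted so a fixed seed is reproducible
        rng.shuffle(pool)
        rows[a] = shared | set(pool[:len(only_a)])
        rows[b] = shared | set(pool[len(only_a):])
    return rows

def species_counts(rows):
    """Count in how many rows each species occurs (the column totals)."""
    return Counter(species for row in rows for species in row)

# Hypothetical occurrence data: three localities, five species.
localities = [{"sp1", "sp2", "sp3"}, {"sp2", "sp4"}, {"sp1", "sp4", "sp5"}]
randomized = curveball(localities, n_trades=100, seed=0)
```

Repeating this with different seeds yields a set of null datasets against which a statistic from the original data can be compared.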
  • Jonkka, Susanna (2016)
    Ovarian cancer is known as "the silent killer" because it is generally diagnosed at a late stage, and is therefore responsible for more deaths than any other gynecological malignancy. Although the genetic background of high-grade serous carcinoma (HGSC) is highly heterogeneous, almost all HGSCs harbor TP53 mutations, and mutations in BRCA1 and BRCA2 are also frequent. Less is known about the chromosomal rearrangements that function as drivers of HGSC. The aim of this thesis project was to identify and validate novel and recurrent chromosomal rearrangements that may have a functional relevance in the tumorigenesis of high-grade serous carcinoma. We searched for recurrent rearrangements detected by a computational algorithm (BreakDancer) in 44 HGSC whole-genome sequences that were obtained from The Cancer Genome Atlas database. We identified five samples to harbor a novel region that was affected by recurrent deletions of similar size. This region was located upstream of the gene TUBB4A on chromosome 19. We used PCR to screen for rearrangements within this region in 11 Finnish patient tumor tissues. None of these samples displayed rearrangements within this region. Further studies with larger sample sizes are required to validate whether this region indeed is recurrently affected by chromosomal abnormalities. Identifying chromosomal rearrangements of functional relevance will pave the way towards the use of personalized medicine.
  • Zhang, Teng (2015)
    The architecture of an inflorescence refers to the spatio-temporal arrangement of flowers on the reproductive branches. Flowering plants have evolved great diversity in such branching systems. Among these, the showy capitulum-type inflorescence of the large family Compositae (Asteraceae) is regarded as a prerequisite for its wide spread around the world. Unlike the simple raceme and cyme, the capitulum compresses hundreds of individual florets on its receptacle, but overall resembles a single, solitary flower. The ontogeny of the capitulum also bears resemblance to that of a single flower with regard to meristem determinacy, floral sequence and histological configurations. Recent molecular studies have revealed that a plant-specific transcription factor, LEAFY (LFY), is required for both floral initiation and floral patterning, the two essential steps in forming an inflorescence. The thesis uses Gerbera hybrida as a model to elucidate the functions of the LFY ortholog during the development of the inflorescence and flower in a capitulum background. In addition to the conserved functions in regulating floral meristem identity and floral patterning, three specific functions were revealed by transgenic Gerbera with down-regulated expression of GhLFY. Firstly, GhLFY is involved in regulating the floral initiation of marginal ray florets. Down-regulation of GhLFY caused the marginal ray florets to revert to a branching pattern like that seen on the capitulum of Calyceraceae, the close relatives of Asteraceae. Secondly, the determinacy of the inflorescence meristem (IM) is disrupted when GhLFY loses its function, suggesting that GhLFY may function at both the flower and inflorescence levels. Thirdly, different flower types show specific responses to GhLFY down-regulation in floral patterning, indicating that there exists a potential genetic gradient among different flower types.
At the protein level, LFY functions are specified by the formation of versatile protein complexes with its transcriptional co-regulators. In Gerbera, GhLFY proteins tend to form homodimers and were also capable of interacting with a conserved transcriptional co-regulator, the UNUSUAL FLORAL ORGANS (UFO) ortholog GhUFO. Taking advantage of forward Y2H library screening, 6 additional proteins were identified that interact with GhLFY, including several novel potential co-regulators of LFY that have not yet been identified in other species. Additionally, a bimolecular fluorescence complementation (BiFC) assay was optimized to verify the GhLFY self-interaction in planta.
  • Ristolainen, Heikki; Kilpivaara, Outi; Kamper, Peter; Taskinen, Minna; Saarinen, Silva; Leppä, Sirpa; d'Amore, Francesco; Aaltonen, Lauri A. (2015)
    In our study we examined a family originating from the Middle East in which three of the five children were diagnosed with classical Hodgkin lymphoma (cHL) at a young age. Hereditary predisposition to cHL is poorly understood, and only one gene alteration possibly predisposing to the disease has been reported previously. To identify the gene alterations, we exome-sequenced DNA isolated from blood samples of the three affected children and selected the variants shared by all three. From the variants shared by the children we filtered out genetic variants present in our own control sets and in several public databases, and assessed the deleteriousness of the remaining variants with two computational prioritization algorithms. This allowed us to rank the remaining 35 shared variants by computational deleteriousness. The most significant of the shared variants was a homozygous 57-base deletion, c.2836_2892del, in the ACAN gene, which has not previously been associated with the cHL phenotype.
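The filtering logic in this abstract (keep the variants shared by all affected siblings, drop those seen in controls or public databases, then rank the rest) can be sketched with plain set operations. This is only an illustration: the variant identifiers and scores below are hypothetical placeholders, and the actual prioritization algorithms used in the study are not named in the abstract.

```python
def shared_rare_variants(samples, known_variants):
    """Return variants present in every sample but absent from known_variants."""
    shared = set.intersection(*samples)
    return shared - known_variants

def rank_by_score(variants, scores):
    """Order variants from highest to lowest (hypothetical) deleteriousness score."""
    return sorted(variants, key=lambda v: scores[v], reverse=True)

# Hypothetical variant calls for three affected siblings.
sibling_1 = {"var_A", "var_B", "var_C", "var_D"}
sibling_2 = {"var_A", "var_B", "var_D", "var_E"}
sibling_3 = {"var_A", "var_B", "var_D", "var_F"}

# Hypothetical variants already seen in controls or public databases.
known = {"var_B"}

# Hypothetical scores from a prioritization tool (higher means more deleterious).
scores = {"var_A": 0.35, "var_D": 0.92}

candidates = shared_rare_variants([sibling_1, sibling_2, sibling_3], known)
ranked = rank_by_score(candidates, scores)
```

With real data the inputs would come from variant call files and the scores from the chosen prioritization tools, but the shape of the computation stays the same.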
  • Siskovs, Klims (2021)
    STK11/LKB1 is a tumor suppressor gene that is mutated in 18% of lung adenocarcinomas. The tumor suppressor liver kinase B1 (LKB1) is known to activate adenosine monophosphate-activated protein kinase (AMPK) and 12 AMPK-related kinases (ARKs) by phosphorylating a conserved threonine residue in their T-loop region. A number of studies have focused on investigating the influence of LKB1-AMPK signaling on cancer cell proliferation. However, there is no systematic study identifying the LKB1 kinase substrates that are critical for suppressing lung cancer cell growth. In this project, constitutively active mutants of AMPKα1, AMPKα2, MARK1, MARK2, MARK3, MARK4, NUAK1, NUAK2, SIK1, SIK2 and SIK3 were sequentially overexpressed in the LKB1-deficient lung adenocarcinoma cell line A549. Overexpression was confirmed at both the gene expression and protein levels by qPCR and Western blotting, respectively. In vitro growth assays demonstrated up to a 33% reduction in the growth rate of A549 cells overexpressing AMPKα1, AMPKα2 and NUAK1. Furthermore, siRNA knockdown of the selected substrates in LKB1-overexpressing A549 cells significantly rescued the cell growth defect. These findings suggest that the AMPKα1, AMPKα2 and NUAK1 kinases are critical for the LKB1-mediated cell growth defect in lung adenocarcinoma.
  • Dovydas, Kičiatovas (2021)
    Cancer cells accumulate somatic mutations in their DNA throughout their lifetime. The advances in cancer prevention and treatment methods call for a deeper understanding of carcinogenesis on the genetic sequence level. Mutational signatures present a novel and promising way to capture somatic mutation patterns and define their causes, allowing to summarize the mutational landscape of cancer as a combination of distinct mutagenic processes acting with different levels of strength. While the majority of previous studies assume an additive relationship between the mutational processes, this Master’s thesis provides tentative evidence that contemporary methods with additivity constraints, e.g. non-negative matrix factorization (NMF), are not sufficient to comprehensively explain the observed mutations in cancer genomes and the observed deviations are not random. To quantify these residues, two metrics are defined – additive and multiplicative residues – and hierarchical clustering algorithms are used to identify cancer subsets with similar residual profiles. It is shown that in certain cancer sample subsets there is a systematic mutational burden overestimation that can only be solved by a multiplicatively acting process, as well as non-random underestimation, requiring additional mutational signatures. Here an extension to the additive mutational signature model is proposed – a probabilistic model that incorporates a selectively active modulatory mutational process that is able to act in a multiplicative manner together with the known mutational signatures, reducing systematic variability.
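To make the idea of residues after an additive fit concrete, here is a minimal NumPy sketch. It fits a small non-negative matrix factorization with the standard Lee-Seung multiplicative updates and then compares observed and reconstructed counts. The abstract does not give the thesis's definitions of the additive and multiplicative residues, so the difference and ratio used below are assumptions, and the toy matrix is random data rather than real mutation counts.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factorize V ~ W @ H with Lee-Seung multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def residuals(V, W, H, eps=1e-9):
    """Assumed definitions: additive = observed - fitted, multiplicative = observed / fitted."""
    fitted = W @ H
    additive = V - fitted
    multiplicative = (V + eps) / (fitted + eps)
    return additive, multiplicative

# Toy data: 6 mutation categories x 5 samples generated from 2 hidden signatures.
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 5))

W, H = nmf(V, rank=2)  # W: signatures, H: per-sample exposures
additive, multiplicative = residuals(V, W, H)
```

Under these assumed definitions, a sample whose multiplicative residues sit systematically below 1 would be one where the additive model overestimates the mutational burden.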
  • Domènech Moreno, Eva (2017)
    In this Master’s project, I studied the mammalian serine-threonine kinase NUAK2, which is implicated in human disease but whose molecular functions and interacting proteins are as yet poorly characterized. The goal was to identify new interacting proteins to increase understanding of its molecular functions and potentially link them to human physiology and disease. Recent work from the host lab shows that NUAK2 loss in cultured primary cells mimics loss of the tumor suppressor LKB1, which also acts upstream of NUAK2, together suggesting that NUAK2 could be involved in tumor suppression. Currently, only two proteins interacting with NUAK2 have been identified: NUAK2 is targeted to actin stress fibers by the myosin phosphatase Rho-interacting protein (MRIP), and it is involved in regulating cell contractility by indirectly affecting the phosphorylation cycle of the myosin light chain through inactivation of the myosin phosphatase target subunit 1 (MYPT1). In this project, I used a novel protein-protein interaction screening method based on proximity-dependent biotin labeling to identify new proteins interacting with NUAK2 in human embryonic kidney cells (HEK 293). The method relies on fusing a promiscuous E. coli biotin ligase, BirA*(R118G), to the investigated protein. The BirA*(R118G) ligase biotinylates all proteins in close proximity to the fusion protein, creating a history of protein-protein associations over time. Afterwards, the biotinylated proteins can be isolated by affinity purification and identified by mass spectrometry. The screen identified the previously known interaction partners of NUAK2, indicating that it was technically successful. In addition, I identified a total of 108 novel potential protein interaction partners for NUAK2. One of the top hits was Cytospin-A, a cross-linking protein between microtubules and the actin cytoskeleton, supporting a role for NUAK2 as a regulator of the cytoskeleton.
Supporting the validity of this finding, Cytospin-A depletion in mammalian cells causes defective actin cytoskeleton reorganization, a phenotype very similar to that seen with NUAK2 depletion. In future studies, I will continue to investigate the specific roles of NUAK2 and Cytospin-A, aiming for detailed information on the function of NUAK2 in the regulation of microtubules and the actin cytoskeleton. Validation of some of the other identified interactions is expected to provide novel insights into the biology of NUAK2 and its role in LKB1 tumor suppressor functions.
  • Huusari, Anna (2018)
    Plants control the exchange of gases through the stomatal pores. Stomata are formed by guard cells, and the closure of stomata is regulated via a complex signaling network in response to various biotic and abiotic stimuli, such as pathogens, elevated levels of CO2 and darkness. The leucine-rich repeat receptor-like kinase (LRR-RLK) GUARD CELL HYDROGEN PEROXIDE-RESISTANT1 (GHR1) is part of the network regulating stomatal closure. GHR1 is an inactive pseudokinase that can activate SLOW ANION CHANNEL-ASSOCIATED1 (SLAC1), an anion channel that is crucial for stomatal closure, via interacting proteins. The exact role of GHR1 is still partly unknown; however, it has been suggested that GHR1 could function as a scaffold or as an allosteric regulator of additional components required for stomatal closure. The aim of this study was to identify novel interactors of GHR1. First, stable plant lines expressing the fusion proteins GHR1-YFP and GHR1W799*-YFP, and plain YFP as a negative control, were generated, and the fusion protein expression levels and subcellular localization were studied in these lines. Next, the plant lines were used to purify GHR1-interacting proteins by co-immunoprecipitation, and the proteins were identified by mass spectrometry. Unlikely GHR1 interactor candidates were then filtered out of the mass spectrometry data. The subcellular localization and protein expression of the interacting proteins were studied using online databases. The literature on the GHR1-interacting proteins was reviewed in order to find possible connections with GHR1 and stomatal closure. In this study, 38 GHR1 interactors were identified. The literature search revealed that many of the identified interactors had a known role in stomatal movements. These included proteins such as PLASMA MEMBRANE INTRINSIC PROTEIN2-1 (PIP2-1) and BETA CARBONIC ANHYDRASE 4 (BCA4), which are known to have a role in stomatal closure.
Future work includes confirming the interactions with independent methods and studying the molecular mechanisms related to stomatal movements. The GHR1 interactome identified here for the first time reveals novel parts of the network regulating stomatal movements and thus increases our understanding of molecular mechanisms behind stomatal functions.
  • Doagu, Fatma (2013)
    Intellectual disability (ID) is a clinically diverse and genetically heterogeneous disorder characterized by central nervous system defects of varying severity resulting in substantial impairment of intellectual and adaptive functioning as expressed in conceptual (IQ<70), social and practical adaptive skills diagnosed before 18 years of age. The condition is referred to as non-syndromic when ID is the only clinical feature and syndromic when ID is accompanied by specific other features, for example, Down syndrome. Intellectual disability is one of the largest unsolved problems of health care with a prevalence of 2-3% in the population. There is a 30-40% excess of male versus female patients in ID which refers to over-representation of X chromosomal defects causing ID. In this study, exome sequencing of the X chromosome was applied in order to identify genes and their mutations in two Finnish families with intellectual disability of unknown cause. The mutations were identified using Agilent Sure select array that covers almost 93% of the coding region of the chromosome. Exome sequencing resulted in 11 variations in total. Segregation of these variants was studied using PCR, ExoSAP-IT purification protocol and BigDye® Terminator v3.1 Cycle Sequencing Kit. Eventually, two novel mutations were identified: one for each family. Both mutations reside in genes that have previously been shown to cause X-linked intellectual disability. Both of the mutations were absent in over 120 control DNA samples. In one family with three affected males, a novel splice mutation was identified in discs large homolog 3 (DLG3), which encodes synapse-associated protein 102 (SAP102). The mutation is located at the splice site in intron 1 (500+1 G>C) and its effect on protein function needs to be analyzed at the RNA-level using cDNA-sequencing. The clinical phenotype of the three affected brothers is mild to moderate intellectual disability. 
In the other family with three severely affected male patients, a novel mutation was identified in exon 12 of glutamate receptor, ionotropic, AMPA 3 (GRIA3), resulting in a change of the amino acid glycine (GGG) to arginine (CGG) at codon 630 (G630R). GRIA3 belongs to the AMPA receptors, which are implicated in the regulation of several biological processes. Our findings illustrate the power of exome sequencing in the diagnosis of rare, genetically heterogeneous disorders like intellectual disability. The results obtained will help in assessing the prognosis of the disease, in estimating the risk of the disorder to other family members, and in facilitating the development of future therapies for these devastating disorders. The results also further confirm the role of DLG3 and GRIA3 in human cognitive development.
  • Adunola, Paul Motunrayo (2021)
    Lipoxygenase enzymes, which contribute significantly to storage protein in legume seeds, have been reported to cause the emission of volatile compounds associated with the generation of off-flavours. This is an important factor limiting the acceptance of faba bean (Vicia faba) in foods. This study aimed at using bioinformatic tools to identify seed-borne lipoxygenase (LOX) genes and to design a biological tool using molecular techniques to find changes in sequence in faba bean lines. LOX gene mining by Exonerate sequence comparison on the whole genome sequence of faba bean was used to identify six LOX genes containing Polycystin-1, Lipoxygenase, Alpha-Toxin (PLAT) and/or LH2 LOX domains. Their sequence properties, evolutionary relationships, important conserved LOX motifs and subcellular location were analysed. The LOX gene proteins identified contained 272–853 amino acids (aa). The molecular weight ranged from 23.67 kDa in Gene 6 to 96.45 kDa in Gene 1. All the proteins had isoelectric points in the acidic range except Genes 6 and 7, which were alkaline. Only one gene had both LOX conserved domains, with an aa sequence length similar to that found in soybean and pea LOX genes and isoelectric properties similar to those of soybean LOX3. Phylogenetic analysis indicated that the genes were clustered into 9S LOX and 13S LOX types alongside other seed LOX genes in some legumes. Five motifs were found, and sequence analysis showed that three genes (Genes 1, 2 and 3) contained the 38-aa residue motif that includes five histidine residues [His-(X)4-His-(X)4-His-(X)17-His-(X)8-His]. The subcellular localization of the lipoxygenase proteins was predicted to be primarily the cytoplasm and chloroplast. Primers covering ~1.2 kb were designed based on the conserved region of the nucleotide sequences of Genes 1, 2 and 3. Gel electrophoresis showed the PCR amplification of the seed LOX gene at the expected region for twelve faba bean lines. 
Phylogenetic analysis showed evolutionary divergence among faba bean lines for the sequenced and amplified region of their respective seed LOX alleles.
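The 38-aa histidine motif quoted above, His-(X)4-His-(X)4-His-(X)17-His-(X)8-His, can be written as a regular expression over one-letter amino acid codes. The following is a minimal sketch, not the method used in the thesis; the function name and the test sequence are hypothetical and serve only to show the pattern.

```python
import re

# Illustrative sketch only: the function name and example sequence are
# hypothetical. The pattern encodes His-(X)4-His-(X)4-His-(X)17-His-(X)8-His,
# i.e. 5 histidines plus 4 + 4 + 17 + 8 = 33 arbitrary residues (38 aa total).
# The lookahead lets overlapping occurrences be reported as well.
HIS_MOTIF = re.compile(r"(?=(H[A-Z]{4}H[A-Z]{4}H[A-Z]{17}H[A-Z]{8}H))")


def find_his_motif(protein_seq):
    """Return (0-based start, matched 38-aa string) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in HIS_MOTIF.finditer(protein_seq.upper())]


# Synthetic sequence built to contain exactly one motif, starting at index 2.
example = "MK" + "H" + "A" * 4 + "H" + "A" * 4 + "H" + "A" * 17 + "H" + "A" * 8 + "H" + "LL"
hits = find_his_motif(example)
```

A real analysis would run such a pattern over the predicted protein sequences; the sketch only checks the spacing of the five histidines and says nothing about the other conserved motifs.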
  • Nyhamar, Ellisiv (2022)
    S. aureus infections are prominent worldwide, and with the rapid increase in antimicrobial-resistant variants such as methicillin-resistant S. aureus (MRSA), the need for new treatment alternatives is urgent (Monaco et al., 2017). Lytic bacteriophages are continually evolving new methods for the destruction of bacterial cells while avoiding their defence mechanisms. Screening hypothetical proteins of unknown function (HPUFs) from bacteriophages for toxic activity against bacteria may provide new and potentially life-saving approaches to combat bacterial infections (Liu et al., 2004, Singh et al., 2019). The Stab21 phage of Staphylococcus is a recently described lytic phage with over 85 % of its open reading frames annotated as HPUFs (Oduor et al., 2019). The successful identification of potentially toxic gene products could facilitate the discovery of novel bacterial targets for the development of new antimicrobials. It could also provide treatment options for infections caused by multi-drug resistant S. aureus where no effective drugs are currently available. To reduce unnecessary screening of phage particle associated yet poorly annotated proteins, the total proteins of the phage particle were previously identified by LC-MS. Similar studies have previously been performed with Yersinia phage fR1-RT and Klebsiella phage fHe-Kpn01, where a handful of toxic proteins were discovered (Mohanraj et al., 2019, Spruit et al., 2020). To accelerate the screening process, a next-generation sequencing (NGS) high-throughput screening method was further developed by Kasurinen et al. (2021). In this study, 96 true HPUFs were selected and screened for their bactericidal activity in E. coli using the NGS-based approach. Fourteen potentially bacteriotoxic Stab21 gene products were identified through toxicity screening in E. coli. 
Of these, three had a particularly low ratio of isolated plasmid after transformation while having a significant number of reads over each joint sequence, indicating their potentially high toxicity. The three most promising candidates were the gene products of g008, g081c and g175 of the Stab21 bacteriophage.
  • Andsten, Rose-Marie (2020)
    Bacteria are a great source of natural products with complex chemical structures and diverse biological activities. Many have therapeutic properties, and half of the drugs in clinical use today are derived directly or indirectly from natural products. The pharmaceutical industry stopped investing in drug development from natural resources due to perceived limitations in chemical space, problems with the rediscovery of known compounds, and difficulties in obtaining sufficient quantities of natural products for clinical trials. There is now renewed interest in natural products as drug leads, driven by technological advances in genome sequencing and analytical chemistry. Cyanobacteria produce a variety of natural products with therapeutic potential. Muscoride A is an unusual peptide alkaloid produced by a terrestrial freshwater cyanobacterium with reported antimicrobial activity. The aim of this study was to characterize the biosynthetic origin and biological activity of muscoride A. I identified the 12.7 kb muscoride (mus) biosynthetic gene cluster from a draft genome of Nostoc sp. PCC 7906 using bioinformatics analysis. The mus biosynthetic gene cluster encoded enzymes for the heterocyclization, oxidation and prenylation of a precursor protein. Comparative genomics identified a mus biosynthetic gene cluster in the unpublished draft genome of Nostoc sp. UHCC 0398 encoding a novel muscoride. This novel muscoride, muscoride B, was detected from Nostoc sp. UHCC 0398 based on this analysis. Muscoride B was purified using solid phase extraction and high-performance liquid chromatography, and the chemical structure was verified by combining nuclear magnetic resonance and mass spectrometry data. Furthermore, the function and evolutionary history of the muscoride prenyltransferases were studied. A significant finding was that the biosynthetic pathway encodes two regiospecific prenyltransferases, catalyzing the C- and N-terminal prenylation of muscoride. 
An antimicrobial activity screening showed that muscoride B had antimicrobial activity against Bacillus cereus. Here I report the discovery of the muscoride biosynthetic pathway and the discovery of a novel antimicrobial peptide from cyanobacteria through genome mining. The results show that the variant is a novel muscoride, a linear bis-prenylated polyoxazole pentapeptide with antimicrobial activity.
  • Rautiainen, Mikko (2016)
    The genomes of all animals, plants and fungi are organized into chromosomes, which contain a sequence of the four nucleotides A, T, C and G. Chromosomes are further arranged into homologous groups, where two or more chromosomes are almost exact copies of each other. Species whose homologous groups contain pairs of chromosomes, such as humans, are called diploid. Species with more than two chromosomes in a homologous group are called polyploid. DNA sequencing technologies do not read an entire chromosome from end to end. Instead, the results of DNA sequencing are small sequences called reads or fragments. Due to the difficulty of assembling the full genome from reads, a reference genome is not always available for a species. For this reason, reference-free algorithms which do not use a reference genome are useful for poorly understood genomes. A common variation between the chromosomes in a homologous group is the single nucleotide polymorphism (SNP), where the sequences differ by exactly one nucleotide at a location. Genomes are sometimes represented as a consensus sequence and a list of SNPs, without information about which variants of a SNP belong in which chromosome. This discards useful information about the genome. Identification of variant compositions aims to correct this. A variant composition is an assignment of the variants in a SNP to the chromosomes. Identification of variant compositions is closely related to haplotype assembly, which aims to solve the sequences of an organism's chromosomes, and variant detection, which aims to solve the sequences of a population of bacterial strains and their frequencies in the population. This thesis extends an existing exact algorithm for haplotype assembly of diploid species (Patterson et al., 2014) to the reference-free, polyploid case. Since haplotype assembly is NP-hard, the algorithm's time complexity is exponential in the maximum coverage of the input. 
Coverage means the number of reads which cover a position in the genome. Lowering the coverage of the input is therefore necessary. Since the algorithm does not use a reference genome, the reads must be ordered in some other way. Ordering reads is an NP-hard problem, and the technique of matrix banding (Junttila, PhD thesis, 2011) is used to approximately order the reads to lower coverage. Some heuristics are also presented for merging reads. Experiments with simulated data show that the algorithm's accuracy is promising. The source code of the implementation and scripts for running the experiments are available online at https://github.com/maickrau/haplotyper.
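The notion of maximum coverage used above can be made concrete with a small sketch. It assumes, purely for illustration, that each read is given as a half-open interval (start, end) of positions it covers; the function name and the example reads are hypothetical and are not taken from the linked implementation.

```python
# Illustrative sketch only: not taken from the haplotyper source code.
# Assumption: each read covers the half-open position interval [start, end).


def max_coverage(reads):
    """Return the largest number of reads covering any single position."""
    events = []
    for start, end in reads:
        events.append((start, 1))   # a read begins covering positions here
        events.append((end, -1))    # the read stops covering positions here
    # Tuples sort by position first; at equal positions -1 sorts before +1,
    # so a read ending at p is not counted together with one starting at p.
    events.sort()
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


example_reads = [(0, 5), (3, 8), (4, 6), (8, 10)]
peak = max_coverage(example_reads)
```

In this example the first three reads all cover position 4, so the maximum coverage is 3. Because the running time of the exact algorithm grows exponentially with this value, ordering and merging reads to keep it low has a direct effect on feasibility.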
  • Roininen, Aino Elina Sylvia (2018)
    Raspberry is prone to virus infections, but the diversity and occurrence of different raspberry viruses in Finland are still largely unknown. The purpose of this thesis work was to reveal which viruses infect the raspberry varieties that are part of the Finnish raspberry genetic resources and have been maintained in vivo in the field. The study also examines and compares siRNA diagnostics to the traditional PCR method in the detection of raspberry viruses. PCR, cloning, and traditional Sanger sequencing are used to obtain more detailed information on the virus strains and to scratch the surface of the phylogenetic diversity of the raspberry viruses that were detected in this study. siRNA detection was accurate and effective with Black raspberry necrosis virus (BRNV), Raspberry bushy dwarf virus (RBDV) and Rubus yellow net virus (RYNV), but the VirusDetect program could not find Raspberry vein chlorosis virus (RVCV), which was positive in Velvet analysis and molecular diagnostics. Sequencing and phylogenetic analysis revealed an RYNV strain that was similar to a Canadian isolate (KF241951.1). Many new Badnavirus-like sequences were detected, but possible integration of Badnaviruses into the raspberry genome was not excluded in this study. BRNV isolates were closely related to previously detected Finnish BRNV isolates. RVCV isolates grouped into three clades, where RVCV from sample 21 was the most similar to a formerly sequenced Scottish isolate (FN812699.2). Multiple viral infections were detected in one sample amongst these raspberry varieties, which may indicate different kinds of virus-virus interactions. The most important finding of this study was that RYNV and RVCV are present in Finland. Secondly, all the detected raspberry viruses are genetically diverse, and multiple infections of the detected virus species are common.