Browsing by study line "Genetics and Genomics" (Genetiikka ja genomiikka)

  • Laiho, Elina (2021)
    The European rabbit (Oryctolagus cuniculus) is a small mammal native to the Iberian Peninsula, but introduced by humans to all continents except Antarctica. The rabbit has been a remarkably successful invasive species due to its generalist nature and fast reproduction. Its spread has mostly been destructive to local nature, and humans have used fatal rabbit diseases such as rabbit haemorrhagic disease (RHD) to control harmful populations. The rabbit population in Helsinki is one of the northernmost annually surviving rabbit populations in the world. It is believed to have originated from escaped pet rabbits in the late 1980s, and in the early 2000s, the rabbits spread rapidly around the Helsinki area. RHD spread unintentionally to Finland in 2016, and the disease caused a significant reduction in the Helsinki rabbit population. Rabbit population genetics has previously been studied in several countries, but never before in Finland. The aim of the thesis was to examine the genetic diversity and population structure of the Helsinki rabbit population before and after the RHD epidemic, and to compare the results to similar preceding rabbit population genetic studies. Studies using DNA microsatellite markers have previously found that rabbit populations can recover from major population crashes without a notable loss in genetic diversity. The recent RHD epidemic in Helsinki provided an opportunity to study whether a rabbit population can recover from a population crash without losing genetic diversity even in a harsher environment. To conduct the genetic analysis, fourteen DNA microsatellite loci were genotyped from individuals caught during two distinct time periods, in 2008-2009 (n=130) and in 2019-2020 (n=59). Population structure was observed in both temporal rabbit populations with small but significant FST values. The 2019-2020 population was more diverse than the 2008-2009 population in terms of allele numbers and expected heterozygosity.
This result was unexpected considering the recent RHD epidemic but could be explained by gene flow from newly escaped rabbits. Compared to other wild rabbit populations around the world, the Helsinki area rabbits exhibit significantly lower genetic diversity. Bottleneck tests showed a significant signal separately in both temporal populations, but the RHD bottleneck cannot be distinguished based on the tests. The results could be biased by new gene flow, or by the initial bottleneck caused by the founder effect of only a few pet rabbits. The rabbits have demonstrated their adaptation and survival skills in the cold climate of Helsinki. The population has significantly lower genetic diversity compared to other wild populations, yet it recovered from a major RHD epidemic without a reduction in genetic diversity under these more extreme environmental conditions. This shows once again that the rabbit is a thriving invasive species.
  • Lappalainen, Siiri (2023)
    Progressive retinal atrophy or PRA is a collective term for a group of hereditary degenerative retinal diseases in dogs. PRA affects the photoreceptor cells of the eye, ultimately progressing into complete vision loss. Documented in over 100 breeds, it is the most common type of canine retinal disease. PRA is considered a homologous disease to human retinitis pigmentosa, thus providing a large animal model for studying retinal biology and the genetic aetiology of its diseases. The objective of this thesis was to study the genetic cause of a novel form of PRA in young Finnish Lapphunds. The analysis built upon a combination of gene mapping methods and analysis of next-generation sequencing data. Gene mapping was performed with two analysis methods, genome-wide association study and homozygosity mapping, utilising single nucleotide polymorphism microarray based genotype data. Identifying a clinical phenotype from the canine biobank at the University of Helsinki resulted in a study cohort of six case and ten control dogs. Combined with pedigree information, this early-onset PRA was most likely a new autosomal recessive condition in the breed. Genome-wide analyses resulted in the discovery of a disease-associated locus on chromosome 27. Findings of single nucleotide variant filtering of one whole-genome sequenced affected dog led to the prioritisation of an intronic substitution variant (T > C) in the SOX5 gene as a potential cause of PRA. Genetic validation of the variant with 23 dogs showed promising results. Four out of five affected dogs were homozygous for the variant, while controls were either wild-type or heterozygotes. As a result, a previously unknown disease locus was successfully identified, suggesting a possible new spontaneous canine model of retinitis pigmentosa. By better understanding the pathophysiological processes of disease, improved diagnostics and marker-based testing as well as novel therapies can be developed for both dogs and humans.
However, further studies are needed to understand the underlying molecular mechanism of the candidate disease variant.
  • Nihtilä, Julia (2021)
    Henoch-Schönlein purpura (HSP) is a vasculitis of small vessels and its characteristics include abnormal accumulation of IgA immunocomplexes on vessel walls as well as abnormal glycosylation patterns of IgA. HSP is an autoimmune disease like inflammatory bowel diseases (IBD). The genetic background of HSP has not been studied in the Finnish population before, and only one genome-wide association study has previously been conducted for HSP. Therefore, investigating the Finnish genetic associations of HSP on a genome-wide level is of value. In this study the genetic background of HSP is studied with genome-wide association analyses performed on 424,041 genotyped SNPs passing quality control, HLA alleles imputed from the SNPs, and their allele-level HLA protein sequences, with the aim of replicating previous HSP associations in a Finnish cohort. There were 46 HSP individuals and 18,757 controls (216 bone marrow donors and 18,541 blood donors) passing quality control and included in the study. The R package HIBAG was used for HLA imputation, and the SPAtest package was used for the association analyses. In the association analyses, a region in chromosome 6 passed genome-wide significance (SNP with the smallest p-value: p = 6.57 × 10⁻¹⁰, OR 0.14 [0.1-0.2]) and the region contained both predisposing and protective associations. Of the HLA alleles, DQB1*05:01, DQA1*01:01 and DRB1*01:01 surpassed the genome-wide significance level (p-values 4.99 × 10⁻⁹, 1.04 × 10⁻⁸ and 2.37 × 10⁻⁸, respectively) and were positively associated with HSP. Five amino acid positions were significantly associated with HSP (p-values 3.9 × 10⁻¹⁰, 7.37 × 10⁻⁹, 1.26 × 10⁻⁸, 1.69 × 10⁻⁸ and 2.41 × 10⁻⁸), being both protective and predisposing to HSP. In addition, the genetic background of HSP was compared with that of IBD by comparing their GWAS results of genotyped SNPs, HLA alleles and their protein sequences.
There were 49 IBD patients after quality control, and the same controls as for HSP (18,541 individuals) were included in the association analyses of IBD. The diseases seem to share some of their genetic background. According to the results, HSP seems to associate primarily with HLA class II, which is compatible with previous studies linking HSP to this region. The results also replicate previous GWAS findings in HLA class II. Accordingly, it is likely that the same HLA alleles are notable genetic factors in both Finnish and Spanish populations. The connection between HSP and IBD could potentially have to do with intestinal microbes aiding the onset of autoimmune diseases in genetically susceptible hosts.
  • Högel, Caroline (2022)
    The aim of this project is to set up a high-content imaging pipeline for phenotypic analysis of single cells in peripheral blood mononuclear cell (PBMC) samples from healthy blood donors. The blood donors selected for the optimization experiments are known to carry specific allele variants of interest, based on an earlier FinnGen study. The main question is whether these genetic differences result in phenotypic changes in the PBMCs that can be identified by microscopic imaging and AI-guided image analysis. In this Pro Gradu work, I have optimized the pipeline of PBMC sample handling, immunostaining, and phenotypic imaging. PBMCs were gathered from healthy donors at the Blood Service Biobank. The frozen PBMC samples were thawed, and cells were plated on 384-well plates prior to immediate fixation with paraformaldehyde. The cells were then stained with fluorescent cell markers based on the Cell Painting assay (Bray 2016), followed by wide-field and confocal imaging with an Opera Phenix high-content confocal microscope (FIMM High Content Imaging and Analysis unit). Novel deep learning methods are now being developed (Pitkänen group) to automatically learn phenotypes from the collected imaging data and associate them with the donors' genotypes. We also used in-house tools for cell segmentation and further analysis as well as quality control (Paavolainen group). Initial results based on the features extracted from the acquired images showed promising cell-type- and donor-specific clustering.
  • Turku, Teemu (2024)
    Distal myopathies are a group of rare progressive genetic muscle disorders that are extremely varied both genetically and clinically. Typical symptoms include weakness and atrophy limited to the skeletal muscles of the distal extremities in hands and legs. The age of onset ranges from early childhood to late adulthood depending on the disease. Currently around 30 genes have been associated with distal myopathies, most of them causing a dominant disease. The objective of the thesis was to identify the disease-causing variant in a family affected by autosomal dominant distal myopathy with early adulthood onset. Affected family members exhibited weakness and atrophy in muscles of both hands and legs. To narrow down the chromosomal location of the disease-causing variant, linkage analysis was conducted with genome-wide single nucleotide polymorphism data of family members. Because of the progressive nature of the disease and the uncertain disease status of one family member, linkage analysis had to be repeated several times with different settings. Both disease statuses and pedigree size were altered to account for the possibility of presymptomatic carriers or incomplete penetrance. Analyses with different parameters led to the discovery of multiple possible co-segregating regions. Rare co-segregating small-scale and structural variants as well as repeat expansions in these regions were examined from next-generation sequencing data with multiple bioinformatic detection tools. The segregation of possible candidate variants was validated with Sanger sequencing and PCR. Ultimately, no rare co-segregating variant of any type that was likely to cause a disease such as distal myopathy was identified by any of the detection methods used.
The lack of a candidate disease-causing variant could be due to incomplete penetrance of the variant, or to the variant lying in a non-coding region, such as a deep intronic splicing variant in a gene not currently known to be connected to muscles.
  • Müller, Linda Helena (2022)
    Puberty initiation is a crucial physiological process in human development. A group of hypothalamic neurons secreting the gonadotropin-releasing hormone (GnRH) and expressing the kisspeptin receptor (KISS1R) plays a key role in launching puberty. Furthermore, cellular KISS1R signaling has been shown to regulate GnRH expression and secretion. Although the in vitro differentiation of human pluripotent stem cells into GnRH-secreting neurons has been successful, it is of high interest to generate KISS1R-expressing GnRH neurons. By utilizing the CRISPR activation technology, this study aimed to establish a conditional KISS1R-activation cell line using H9 human embryonic stem cells. Through controlling dCas9VP192 abundance using the Tet-On system combined with the dihydrofolate reductase destabilizing domain, the transcriptional activation of KISS1R was temporally regulated by the addition of two antibiotic drugs: doxycycline and trimethoprim. KISS1R expression was primarily assessed by qPCR and verified by immunocytochemistry and the use of a KISS1R-GFP reporter cell line. The main finding of this study is the achievement of a 6217 ± 2286 fold change in KISS1R transcription by introducing two guide RNAs (N = 3). Nevertheless, leaky gene activation was observed without drug treatment (fold change of 63 ± 51). In conclusion, this study successfully led to the generation of a KISS1R-activation cell line. After further characterization and refinement of the activation protocol, the established cell line will make it possible to investigate whether KISS1R upregulation modulates in vitro GnRH neuron differentiation, electrophysiology, hormone expression, and secretion. The outcomes may lead to advances in understanding and treating pubertal disorders.
  • Wei, Xiaodong (2022)
    The composition and dynamics of the early life gut microbiota play a major role in establishing neonatal immunity and are suggested to have multiple impacts on the child's long-term health. Meanwhile, the composition of the infant gut microbiome has been shown to be affected by the birth mode, infant health and diet. However, the characterization of the infant gut microbiome and its impact on the host's health is still challenging, as the contribution and importance of multiple co-factors on the early microbiome during infant growth are still poorly understood and characterized. The Health and Early-life microbiota (HELMi) is a cohort of more than 1000 healthy Finnish infants currently followed from birth to 4-5 years old. By now, the HELMi dataset comprises more than 400 whole genome shotgun metagenomes obtained from stool samples from 80 infants and parents, as well as an in-depth characterization of the families' lifestyle, environment, health and nutrition, allowing for a precise and cutting-edge characterization of the early gut microbiota. Based on the datasets from HELMi, this project used MetaPhlAn3, Kraken and Bracken to determine the best computational approach for the taxonomic profiling of the metagenomic reads. Then a PERMANOVA test was performed to evaluate and determine the factors significantly associated with the compositional microbiota variation within the infant gut metagenomes. This study first identified technical factors introducing bias in taxonomic profiling (e.g., DNA extraction batch), which served as confounders in the analysis of environmental and host variables. The investigation of these biological factors indicates that pre-natal and peri-natal variables such as the mode of delivery significantly impact the infant gut microbiota, while we did not identify any significant impact of breastfeeding habits and medication exposures in this study.
  • Uriona Egia, Garazi (2023)
    The ends of eukaryotic chromosomes are formed by a special heterochromatic structure, the telomere, which is essential to guarantee chromosome stability. Telomeres protect chromosome ends from DNA degradation, repair, and recombination events. However, they are difficult to replicate due to their repetitive and heterochromatic nature, which hinders DNA replication fork progression. In yeast, Mph1 helicase promotes replication fork regression, cross-over suppression during homologous recombination (HR), and telomere maintenance. Moreover, Mte1 is a D-loop binding protein involved in the response to DNA damage and maintenance of telomere length, which interacts with Mph1, thereby stimulating its helicase and fork regression activities. Thus, the Mte1-Mph1 complex is recruited to stressed telomeres. Mte1 also shares a domain of unknown function, DUF2439, with Rad51 and Rdh54. Additionally, Esc2 protein is involved in the regulation of DNA damage through template switch (TS) recombination, preventing HR events caused by Mph1. This thesis aimed to uncover the potential roles and interactions of proteins involved in telomere maintenance, such as Mph1, Mte1, Esc2 and Rdh54, for which two main assays were conducted: (1) a Telomere Stability assay, consisting of a Tus/Ter barrier based on the high-affinity binding of the E. coli protein Tus to a specific DNA sequence called Ter; (2) a Template Switching assay, focused on the capability of the proteins to reconstruct a functional LYS2 gene by TS. The obtained results demonstrated that (1) the absence of Rdh54 enhances replication fork regression, (2) Mte1 and Esc2 show opposite roles in telomere maintenance, (3) the interaction between Mte1 and Rad51 plays a crucial role in ensuring telomere stability and nuclear foci formation, (4) Mph1 and Mte1 promote cell survival through the break-induced replication (BIR) pathway.
Further studies should assess the plausible interaction between Mph1 and Rdh54 proteins and characterize the function and interplay of the proteins involved in TS.
  • Perkiö, Anna (2021)
    Long interspersed nuclear element 1 (LINE-1 or L1) belongs to a class of retrotransposons. In other words, it is a DNA element that can copy and paste itself around the genome. There are approximately 500,000 copies present in humans, but only around 5,000 are expected to remain transcriptionally competent. The activity of L1s is generally strongly repressed in normal human tissues, but in many cancers, these elements are reactivated. Both L1 transposition and transcription can have significant effects on cellular function, making it an interesting topic of research from a pathological point of view. By studying and understanding more about this transposon, it could be possible to find novel screening methods or even therapeutics for different cancers. One of these cancer types is high-grade serous ovarian carcinoma (HGSOC), which is known for exhibiting L1 upregulation. However, the quantification of L1 transcription has proven very challenging, mostly due to alignment issues caused by the repetitive nature of the element. In addition, a large proportion of L1s reside within genes, meaning that L1 sequence-containing transcripts frequently do not originate from the L1's own promoter. This thesis aimed to tackle these challenges; I quantified L1 expression at the single-locus level in 11 pre- and post-chemo HGSOC sample pairs, as well as in 5 samples from healthy women, based on single-cell RNA-sequencing. In addition to comparing L1 activity in different sample and cell types, I investigated whether L1 activity was associated with any changes in gene expression. The poly(A) site of an L1 is relatively weak, meaning that L1 transcription frequently extends over it. Based on this fact, the approach was to quantify L1 expression based on reads mapping to the 1 kilobase downstream window of each L1 locus, thus minimizing the alignment issues of repetitive elements.
Thereafter, the features of the detected loci were carefully assessed to separate false-positive L1s from those with evidence supporting genuine activity, such as tumor-sample-enriched expression, lack of correlation with the host gene, and detection with bulk RNA-sequencing. The activity of the latter loci was then further analyzed to search for differences in L1 expression between pre- and post-chemo samples. In addition, the association between L1 activity and gene expression was examined based on regression models at both the individual gene and the molecular signature gene set level. It was found that L1 expression data is filled with spuriously active loci, highlighting the importance of careful analysis and wet lab validations when studying transposon activity. However, despite the issues arising from a sparse and unreliable dataset, I showed that L1 activity was negatively associated with the expression of MYC target genes. MYC has been previously shown to be a transcriptional repressor of L1, indicating that the obtained results are plausible. Even though the results obtained from this study appear to be biologically justifiable, they would require further validation to ensure their authenticity. In addition, in the future it would be essential to enhance the sensitivity of the workflow to minimize the sparsity of the data, so that the statistical analyses performed would become more reliable. Nevertheless, it was shown that assessing L1 expression at the single-cell level using RNA-sequencing is feasible.
  • Dreilinger, Olivia (2023)
    Animal coloration is as striking as it is diverse; however, the transcriptional basis of coloration is not deeply understood. Cichlid fishes are a tractable system for studying coloration as they exhibit a wide range of phenotypic diversity while remaining genetically similar. This facilitates the study of genotype-phenotype correlations and the identification of causative genes. RNA sequencing is a powerful approach to investigate the genes which characterize chromatophores. However, RNA-seq results can be plagued by the high abundance of rRNA in cells. This thesis aims to investigate differential gene expression between differently pigmented regions as well as explore the effects of tissue treatments and rRNA depletion on gene expression. Gene sets acquired with polyA selection, riboPOOL probes optimized for zebrafish, and zebrafish probes complemented with newly designed riboPOOL cichlid probes were compared to assess the functionality of these different rRNA depletion strategies. The use of zebrafish probes complemented with newly designed cichlid probes captured the greatest diversity of genes, many transcripts of which were missing from the other gene sets. Furthermore, as experiments such as scRNA-seq rely on a dissociation step, the effect of dissociation on gene expression was examined and found to promote the expression of stress response genes. The results of this upstream optimization were applied in the analysis of differential gene expression between the vertical stripes of the cichlid Pseudotropheus demasoni to better understand the molecular basis of vertical striping in fish. The dark stripes exhibited upregulation of melanic marker genes and the light, iridescent stripes showed an increase in iridophore marker gene expression. These findings were corroborated with cell count data from FACS to link transcriptional profiles and cell type quantifications. 
Overall, the study provides insight into the transcriptional basis of coloration in cichlid fishes and underscores the importance of optimizing methods in order to draw meaningful conclusions.
  • Iacoviello, Francesco (2022)
    Neurodevelopmental disorders (NDDs) are disabilities in which the formation and development of the central nervous system is altered. NDDs severely impact the quality of life of the individuals affected by them; however, little is known about the causes or the molecular mechanisms behind their onset. For this reason, being able to model them is pivotal, since by understanding the mechanisms underlying such disorders, we could develop possible treatments. Previous research has suggested that disturbances in early neuronal development could underlie NDD onset. Therefore, in this work, I have modeled neuronal differentiation in Kabuki syndrome (KS), a known NDD, assaying the expression of key early neurodevelopmental markers at four specific timepoints, using induced pluripotent stem cell (iPSC) technology. By concurrently differentiating three KS patient-derived and three control iPSC lines to neural precursor cells (NPCs) and profiling them with immunocytochemistry (ICC) and quantitative real-time PCR (RT-qPCR), I was able to identify differences in the early developmental trajectories of NPCs between the two conditions. The ICC data suggested that differentiating KS cell lines undergo precocious differentiation when compared to control cell lines, suggesting that the disease-causing mutations could lead to accelerated neuronal maturation of early NPCs. However, RT-qPCR analysis of the expression patterns of key neurogenesis markers was unable to statistically confirm the observed trend between the two phenotypes, likely due to limitations in statistical power. Despite this, the expression of four out of seven NPC markers was higher in early KS cells than in control cell lines, supporting the hypothesis of accelerated neuronal maturation.
Taken together, this work highlighted some of the challenges related to iPSC-based disease modelling studies, and the need to further confirm the inferred mechanisms of asynchronous neuronal development observed in this work.
  • Sundaresh, Adithi (2022)
    Human induced pluripotent stem cells (iPSCs) are an important in vitro model of disease and development. iPSCs can be differentiated in culture into cell types which are difficult to access from patients, such as neurons. Applying iPSC-derived cellular models to disease studies requires a thorough characterization of the derived cell types, as well as assessing reproducibility across cell lines or differentiation batches. With the aim of providing such a comprehensive molecular characterization at an early stage of cortical neuronal differentiation in vitro, six iPSC lines from four donors were differentiated to cortical neural progenitors using a modification of an established protocol (Shi et al., 2012a). The protocol successfully produced neural progenitors, with over 75% of the differentiated cells aligning with a cortical identity, as confirmed via qPCR and immunocytochemistry of established markers such as PAX6, NES and SOX1. To further classify the cell types produced as well as identify potential differences between cell lines, gene expression of the obtained cells was profiled with single cell RNA sequencing of ~22,000 cells, which uncovered the heterogeneity of the neural progenitors produced. Further, although two differentiation batches produced similar cell-type compositions on the whole, a fraction of the lines showed inter-individual differences in cell type composition, which correlated with expression variability of known marker genes. Additionally, the cell types produced in vitro were compared to those produced in vivo by mapping our dataset to a reference fetal brain dataset (Polioudakis et al., 2019). It was observed that the in vitro dataset represented a subset of the cell types present at mid-gestation. Overall, the single cell characterization of differentiated cells allowed greater resolution in understanding cell-type heterogeneity of cortical neurogenesis, which is of key relevance for future applications such as disease modeling.
  • Patrikainen, Linda (2023)
    Breast cancer is globally the leading cause of cancer death in women. ER positive, HER2 negative breast cancer is the most common subgroup, covering two thirds of all breast cancer cases. The different isoforms of ERα, ERα66 and ERα36, are responsible for genomic and non-genomic ER signaling, respectively. Tamoxifen is one of the most used drugs in ERα+ breast cancer. As a SERM, tamoxifen blocks the activity of ERα66 but acts as an agonist for ERα36, which is associated with tamoxifen resistance. Tamoxifen resistance affects more than 25% of patients with ERα+ breast cancer, but the molecular mechanisms that lead to the development of resistant disease remain unknown. Thus, the aim of this thesis was to reveal how two different ERα isoforms are used and regulated in tamoxifen resistance in two commonly used ERα+ breast cancer cell lines, MCF7 and T47D. We studied the effect of hormones on tamoxifen sensitivity and on the utilization of ERα isoforms. Additionally, we compared the transcriptomics of resistant and parental cells in both cell lines and tested how inhibition of key regulators affects the sensitivity to tamoxifen. In this thesis we report that the MCF7 and T47D cell lines acquire different mechanisms of tamoxifen resistance, and that the development of tamoxifen resistance proceeds in parallel with the cell identity switch from luminal to basal. EZH2 is involved in maintaining the luminal progenitor type of mammary cells, whereas c-Myc is highly expressed in the resistant cell lines. Hence, EZH2 and c-Myc are key players in the development of tamoxifen resistance and could be considered as therapy targets in ERα+ breast cancers.
  • Vakkari, Eeva (2021)
    The wide distribution of Scots pine (Pinus sylvestris L.) in boreal forests and the outstanding properties of its wood have made it an economically significant resource in the forest sector. The highly valued chemical and mechanical properties of Scots pine wood are related to heartwood, a specialized tissue forming the innermost part of a mature trunk. Decay resistance of Scots pine wood is largely defined by heartwood extractives, of which the stilbene pinosylvin has the highest quality trait breeding interest. Pinosylvin concentration is a high-heritability trait that positively correlates with heartwood decay resistance. The pinosylvin biosynthesis pathway is upregulated both developmentally, at the mature tree transition zone between sapwood and heartwood, and as a stress response in various tissues of young trees. Identification of the regulators of pinosylvin synthase could speed up quality trait breeding, providing a basis for variant screening in natural populations and for analysing functional properties of the variants. Early genotyping would enable selection of individuals of the desired quality before the start of developmental pinosylvin production and significantly accelerate breeding programs. Scots pine pinosylvin synthase PST-1 is proposed to be both stress-induced and developmentally regulated. Previous studies have identified several MYELOBLASTOSIS (MYB) domain transcription factors (TFs) that co-regulate with stilbene pathway transcripts under pinosylvin production inducing conditions or that have promising homologs in other species. In this study, eight Scots pine MYB TFs were examined in PST-1 promoter interaction studies using a quantitative luciferase assay and a yeast one-hybrid assay.
This study aimed to clone the MYB coding sequences and confirm the integrity and MYB character of the proteins they encode, and to verify whether any of the MYB TFs are direct regulators of PST-1, and to characterize the regulatory functions of the MYB TFs as activators or repressors. This study identified one MYB TF as a direct regulator of PST-1 whereas the other studied MYB TFs did not bind the most promising MYB target elements in the promoter. The discovery of a direct regulator of pinosylvin synthase provides a potential marker for early selection making the finding highly valuable for quality trait breeding efforts. Additionally, another MYB TF was detected as a potential indirect regulator of pinosylvin biosynthetic pathway or as a regulator of neighbouring pathways suggesting that it would also be an interesting target for further studies. The MYB TFs were successfully cloned and seven out of eight MYB TFs were classified into MYB subfamilies. Tentative characterizations for the MYB TFs were presented based on the sequence analysis. The Gateway compatible vectors generated in this study will facilitate future experiments. The MYB coding sequences were incorporated in the verified entry clones ready-to-use in generation of other types of expression vectors. The MYB TF plant vectors could be directly used in Arabidopsis, as well. Two multisite Gateway compatible entry clones for N-terminal fusions to VP16 and SRDX transcriptional regulatory domains were generated for the plant expression vectors. The protocol developed for the 3’ fusion entry clones comprises of sequential polymerase chain reactions easily applicable for other cloning purposes. The yeast one-hybrid prey vectors could be utilized not only in another one-hybrid but also in two-hybrid studies. Several of the MYB TFs, including the PST-1 direct regulator, were hypothesized to interact with other types of TFs. 
Protein-protein interaction studies would detect possible co-factors involved in the MYB TF-mediated regulation of Scots pine pinosylvin synthase. Identification of each member of the regulatory complexes would enable the quality trait breeding efforts to be targeted most effectively.
  • Olkkonen, Emmi (2021)
    Long non-coding RNAs (lncRNAs) are RNA molecules over 200 bp long that are not translated into protein. LncRNAs can regulate the expression of protein coding genes, and studies have indicated their role in stress response. Stress response has also been associated with differences in the structure of the myelin sheaths in the mouse brain cortex. Myelin is produced by mature oligodendrocytes (OLGs), and therefore, OLGs are likely to play a role in stress response. The aim of this thesis was to find lncRNAs differentially expressed in the oligodendrocytes and myelin of the medial prefrontal cortex of stressed mice in comparison to controls. Mice of strains C57BL/6NCrl and DBA/2NCrl, which differ in stress response, were exposed to chronic social defeat stress. After the stress paradigm, the mice were classified as stress-susceptible or stress-resilient, with the susceptible mice exhibiting anxiety-like behavior. RNA from OLGs and myelin from the medial prefrontal cortex of the mice was sequenced, and I compared the lncRNA expression levels between stressed and control mice and between stress-susceptible and resilient mice using bioinformatic methods. I also assessed modules formed by lncRNAs and protein coding genes with correlated expression in both datasets. I used RT-qPCR to investigate whether the results for two differentially expressed lncRNAs, Gm37885 and Neat1, replicated in a stress hormone-treated oligodendrocyte cell line. In total, 370 lncRNAs were differentially expressed between stressed mice and controls or between stress-susceptible and resilient mice in the OLG dataset and 132 in the myelin dataset. Of these, 200 and 87 overlapped with a protein coding gene in the OLG and myelin datasets, respectively. Sixty-one percent of the differentially expressed lncRNAs were specific to a single comparison in the OLG dataset and 73 % in the myelin dataset, whereas 39 % of the differentially expressed lncRNAs in the OLG dataset and 27 % in the myelin dataset were shared between comparisons. 
No module of genes with correlated expression levels was associated with stress, but the expression levels of two correlation modules from each dataset differed between strains. The results for one of the differentially expressed lncRNAs, Gm37885, replicated in stressed Oli-neu cells in RT-qPCR. The results of my thesis indicate that multiple lncRNAs are involved in the mouse stress response, as many were differentially expressed and shared between phenotype comparisons. Additionally, significant gene expression differences were observed between strains, which could contribute to the previously reported strain differences in stress susceptibility. The results also suggest a specific role for Gm37885 in the glucocorticoid receptor (GR)-mediated stress response. However, the function of Gm37885 remains unknown, and further studies of Gm37885 and the other differentially expressed lncRNAs are needed to draw conclusions about their contribution to the OLG-mediated stress response.
  • Tommila, Jenni (2021)
    Bacteraemia, the presence of bacteria in the bloodstream, may lead to severe and costly health issues. Sepsis, a serious complication of bacteraemia, is one of the top causes of mortality globally. Early and specific diagnostics as well as prompt action are essential for successful treatment. However, current diagnosis relies mainly on time-consuming blood culturing and on clinical symptoms, which are not specific to the causative agent. With advancing technology and decreasing costs, state-of-the-art next-generation sequencing (NGS) methods provide a new way to investigate the bacteria present. Metagenomics, which means sequencing and studying all DNA extracted from a microbial community sample, is widely used, but it only describes the genetic potential of a community and does not differentiate live from dead microbes. Metatranscriptomics, in which essentially all RNA from a sample is sequenced, provides information about expression and activity together with identification of viable bacteria. However, the high amounts of host cells and host RNA complicate the detection of bacterial transcripts in complex host-microbe samples. In this thesis, I investigated solutions for the efficient isolation and enrichment of bacterial RNA from whole blood for use in sequencing and metatranscriptomic analysis. Firstly, I tested the bacterial cell lysis capability of two commercial blood sampling tubes with Escherichia coli and Staphylococcus epidermidis suspensions. Both tubes, Tempus and RNAgard, were able to lyse gram-negative E. coli cells, and good-quality RNA was extracted in measurable quantities with their respective RNA extraction methods. With Tempus tubes, the RNA yield was clearly higher. With gram-positive S. epidermidis, RNA quantities from both extractions were below the measurement limits, indicating insufficient lysis and a need for further optimization. 
Secondly, I investigated the depletion of polyadenylated (poly-A) transcripts in order to reduce the host transcripts and thus enrich the bacterial transcripts prior to the costly sequencing step. I evaluated the performance of a previously designed in-house protocol, based on the capture of poly-A transcripts with oligo-dT beads, and tested different parameters to see whether the depletion efficiency could be enhanced. Most significantly, the amount of oligo-dT bead suspension was reduced to half of that in the original protocol. The in-house protocols were also compared to a commercial solution, which they clearly outperformed. Depletion performance was tested with RT-qPCR and a dot blot assay, which I designed during this thesis work. Finally, to make the poly-A depletion better suited for blood samples, which are dominated by globin transcripts (representing up to 80% of all poly-A transcripts extracted from whole blood), I tested the leading commercial method for depleting globin transcripts and successfully combined it with the in-house poly-A depletion protocol. The optimized sample preparation protocol provides a platform for further bloodstream infection and sepsis studies. The next steps of the process, such as sequencing and testing with clinical samples, are already ongoing with promising preliminary results. In the future, the metatranscriptomics approach can be utilized for fast and specific identification of pathogens and their antibiotic susceptibilities. In addition, infection mechanisms and host-pathogen interactions may be studied, possibly providing novel insights for sepsis diagnostics and treatment.
  • Talka, Markus (2022)
    Acute leukemia is a life-threatening disease of the blood and bone marrow, which is caused by malignant transformation of immature white blood cells. These malignant white blood cells take up space in the bone marrow, decreasing its ability to produce normal blood cells and, without treatment, eventually leading to death within weeks of diagnosis. Acute leukemia can be broadly divided into lymphoblastic and myeloid forms, based on the affected cell lineage. Furthermore, acute leukemias can be classified based on different genomic features, such as gene fusions. Fusion genes are strong drivers in various cancers such as acute leukemias, and they are formed when two or more original genes join together, forming a novel hybrid gene. If the novel hybrid gene is transcribed, it can lead to the translation of an abnormal fusion protein with altered function. The detection of gene fusions is very important, since it affects the diagnosis and treatment of the patient. Various techniques can be used for fusion gene detection, of which RNA sequencing is the method of choice, due to its ability to provide an unbiased identification of all known and novel gene fusions in a sample in a single experiment. In this thesis, the overarching aim was to develop an optimal sampling protocol for fusion gene detection using RNA sequencing for acute leukemia diagnostics. First, whole blood samples were collected in EDTA tubes from acute leukemia patients based on the findings from routine diagnostics. Next, RNA was extracted at three different timepoints (0h, 8h, and 32h). The samples were stored at 4°C between the extractions. Finally, the RNA sequencing libraries were constructed, and the RNA sequencing was performed. After the sequencing, the data was analyzed using the FusionCatcher algorithm for fusion gene detection and the edgeR package for differential expression analysis. 
FusionCatcher detected the same gene fusions as routine diagnostics in all four fusion gene-positive patients. However, FusionCatcher failed to recognize the gene fusion in some of the samples with a very low number of fusion breakpoint-spanning reads. These reads were visualized with IGV, suggesting that the detection failure resulted from the very low number of breakpoint-spanning reads. Furthermore, sample storage did not affect gene fusion detection. In addition, FusionCatcher detected a PIK3AP::BLNK gene fusion in one of the fusion gene-negative patients, suggesting the possibility that the patient was in fact fusion gene-positive. The differential expression analysis revealed changes in gene expression between the different timepoints. The results showed changes in various pathways related, for example, to cell death and protein biosynthesis, but also in pathways related to cancer. The results showed that prolonged sample storage alters the gene expression profile, thus affecting the results of a gene expression study.
  • Owusu, Rafaela (2022)
    High-throughput sequencing techniques make it possible to identify DNA variants at a reasonable cost, representing a first-tier diagnostic test for rare Mendelian diseases. However, a substantial number of the variants identified through the analysis of sequencing data are classified as variants of uncertain significance (VUS). Accordingly, only 30–60% of individuals receive a conclusive molecular diagnosis, depending on the clinical phenotype. Recently developed and improved analysis methodologies and more robust bioinformatic pipelines have encouraged the reanalysis of older sequencing data to enhance variant interpretation and raise the diagnostic/detection rate. This study focused on reanalyzing data from MYOcap, a targeted gene panel for patients with neuromuscular disorders. The aim was to find elusive (i.e., previously undetected or misinterpreted) variants in patients still lacking a molecular diagnosis by using novel bioinformatic tools, focusing on pathogenic and likely pathogenic variants (according to ACMG guidelines) in VarSome as well as on variants predicted by SpliceAI to affect splicing. With this setting, the detection rate of solved cases increased by 2.7% in the first cohort and 0.5% in the third. This study suggests that additional data, such as segregation data or transcriptomic and proteomic data, are essential for reducing the number of VUS and increasing the detection rate. Notably, this study represents an essential first step of a larger reanalysis project, which aims to provide a diagnosis to an increasing number of myopathy patients.
  • Hiltunen, Antti Olavi (2022)
    Triple-negative breast cancer (TNBC) accounts for 10-15% of all breast cancer cases and has the worst clinical outcome. The characteristic features of TNBC are high recurrence and mortality rates and the absence of three commonly targetable breast cancer biomarkers, estrogen receptor, progesterone receptor, and HER2, which limits the number of targeted therapy options. Cytotoxic CD8-positive T cells play a crucial role in the anticancer immune response and act as a major component of successful cancer immunotherapies. However, cancer cells can evade T cell-mediated killing by overexpressing programmed death-ligand 1 (PD-L1), which interacts with programmed death protein 1 (PD-1) and results in T cell exhaustion and a limited immune response. Systemic anti-PD-L1/PD-1 therapies aim to prevent this immunosuppressive mechanism, but they are burdened with potentially life-threatening autoimmunity-type adverse effects. Therefore, cancer cell-specific targets for downregulating PD-L1 could offer efficacious and less harmful ways to overcome PD-L1/PD-1-mediated immunosuppression. The serine protease hepsin is commonly overexpressed in many solid tumors, where it is responsible for the activation of the HGF/MET signaling pathway as well as the degradation of desmosomes and hemidesmosomes, leading to the loss of epithelial integrity, invasion, and metastasis. Earlier studies have linked a hyperactive HGF/MET pathway to the upregulation of the immune checkpoint molecule PD-L1. In this thesis, I show how pharmacological inhibition of hepsin leads to decreased MET activity and downregulation of PD-L1 in a panel of TNBC cell lines. My results demonstrate the potential of hepsin-mediated regulation of PD-L1 in tumor immunosuppression, and hint at the potential of hepsin as a therapeutic avenue towards safe and efficacious immunotherapy in the future. These results are part of a larger study addressing the role of hepsin as a regulator of PD-L1 in breast cancer.
  • Begum, Sakina (2021)
    Bartonella species are facultative intracellular bacteria that cause a variety of diseases in humans and infect endothelial cells and erythrocytes. Some Bartonella species utilize the VirB/VirD4 type IV secretion system (T4SS) to secrete Bartonella effector protein A (BepA), which supports infection of endothelial host cells by inhibiting apoptosis. In contrast, the enterotoxin homolog in Bartonella gene A (EhbA) and the enterotoxin homolog in Bartonella gene B (EhbB) are found in the non-BepA Bartonella strains. In my Master’s thesis, I study the host cell binding activity of EhbB in Bartonella and aim to identify its host cell surface receptor. The cell adhesion of multimeric B proteins of the enterotoxin homologue in Bartonella (Ehb) was analyzed with a cell adhesion assay using HEK293T, HeLa 229, Ea.hy926, and CHO-K1 cells. The assay was conducted with EhbB1 and EhbB 1-1C proteins from Bartonella bovis strain Bermond and Bartonella sp. strain 1-1C, and the experiment indicated cell adhesion activity for both EhbB proteins compared to the controls used in the experiment. Moreover, the binding activity of EhbB1 with Ea.hy926 was studied at several incubation time points: 30 min, 2 hours, 4 hours, 6 hours, and 8 hours. Varying the incubation period of EhbB1 and EhbB 1-1C with Ea.hy926 cells did not enhance cell surface adhesion, as the absorbance relative to the controls remained the same. The interaction of EhbB1 with the HEK293T cell membrane was studied by western blot on a cell membrane preparation from Ea.hy926 cells, which was used to identify a possible protein receptor of EhbB1. The experiment suggests that EhbB1 binds to receptors present on the cell membrane of HEK293T, which could be proteins. The cell adhesion activity of the HEK293T cell membrane with EhbB1 was analyzed by an inhibition assay. This experiment indicated that EhbB1 protein attached to cell surface receptors present on the HEK293T cell membrane, which inhibited the attachment of EhbB1 protein to Ea.hy926 cells. 
This also indicates that the cell surface receptor for EhbB1 could be a protein, but this requires further study.