
Browsing by Title


  • Huttunen, Heli (2023)
    Eye plaque radiotherapy is a treatment method for ocular tumors: a sealed radiation source is temporarily placed on the surface of the eye in day surgery. Compared to externally delivered conventional radiation treatments, more precisely targeted brachytherapy allows a higher dose in the target tissue while keeping the dose to healthy tissue relatively low. In Finland, all eye plaque treatments are centralised in Helsinki, and brachytherapy of the eye is performed annually on approximately 70 patients. Taking patient-specific anatomy into account means determining the specific location and shape of the tumor with respect to the radiobiologically critical structures of the eye. Until now, this has not been systematically modeled in the dose calculation of eye plaque brachytherapy at HUS. The new version of Plaque Simulator, a 3D treatment simulation and modeling package for I-125, Pd-103, Ir-192 and Ru-106 plaque therapy of ocular tumors, enables importation and digitisation of patient imaging data (fundus imaging, CT and MRI), which consequently allows systematically accurate estimation of the dose distribution not only in the tumor but also in the surrounding healthy tissues. The aim of this Master’s thesis is to prepare the new version of the Plaque Simulator simulation and modeling package for clinical use in patient dose calculation at HUS. A comparison is made between the dose calculation methods of the old and the new version of Plaque Simulator, and the dose calculation parameters as well as the plaque modeling parameters are reviewed. The image-based dose calculation method is also tested with an anonymised patient treated for a tumor of a more unusual shape. The absorbed dose to water on the central axis of the radiation source is measured experimentally for two individual I-125 seeds along with Ru-106 CCB, I-125 CCB, and two I-125 COB plaques. The experimental results are compared with the results obtained from Plaque Simulator. An individual I-125 seed is used to calibrate the detector at a distance of 10 mm, yielding a calibration factor of 0.808. The use of the gold parameter in the dose calculation is justified, and the dosimetry modifier of Plaque Simulator is found to be 1.226 for I-125 plaques. The Ru-106 plaque measurements are not calibrated, making them only relative. However, an excellent correspondence is observed between the Ru-106 plaque dose calculations in Plaque Simulator and the manufacturer’s certificate. The measurements are normalized to the manufacturer’s certificate with a normalisation factor of 1.117.
  • Soukainen, Arttu (2023)
    Insect pests substantially impact global agriculture, and pest control is essential for global food production. However, some pest control measures, such as intensive insecticide use, can have adverse ecological and economic effects. Consequently, there is a growing need for advanced pest management tools that can be integrated into intelligent farming strategies and precision agriculture. This study explores the potential of a machine learning tool to automatically identify and quantify fruit fly pests from images in the context of Ghanaian mango orchards in West Africa. Fruit flies pose a special challenge for computer vision-based deep learning due to their small size and taxonomic diversity. Insects were captured using sticky traps together with attractant pheromones. The traps were then photographed in the field using regular smartphone cameras. The image data contained 1434 examples of the targeted pests, and it was used to train a convolutional neural network (CNN) model for counting and classifying the fruit flies into two genera: Bactrocera and Ceratitis. High-resolution images were used to train the YOLOv7 object detection algorithm. The training involved manual hyperparameter optimization emphasizing pre-selected hyperparameters, with a focus on employing appropriate evaluation metrics during model training. The final model had a mean average precision (mAP) of 0.746 and was able to identify 82% of the Ceratitis and 70% of the Bactrocera examples in the validation data. The results demonstrate the advantages of a computer vision-based solution for automated multi-class insect identification and counting. Low-effort data collection using smartphones is sufficient to train a modern CNN model efficiently, even with a limited number of field images. Further research is needed to effectively integrate this technology into decision-making systems for precision agriculture in tropical Africa. Nevertheless, this work serves as a proof of concept, showcasing the serious potential of computer vision-based models in automated or semi-automated pest monitoring. Such models can enable new strategies for monitoring pest populations and targeting pest control methods. The same technology has potential not only in agriculture but in insect monitoring in general.
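    A minimal sketch of how detection output could be turned into per-trap counts and per-genus recall for the two classes above; the detections, ground-truth counts and the 0.5 confidence threshold are placeholders rather than the thesis’s YOLOv7 outputs, and the recall here is a crude approximation that skips proper box matching by IoU:
```python
# Hypothetical detections for one trap image: (class name, confidence score).
from collections import Counter

CLASSES = ("Bactrocera", "Ceratitis")
detections = [("Ceratitis", 0.91), ("Bactrocera", 0.64), ("Ceratitis", 0.42), ("Bactrocera", 0.81)]
ground_truth = Counter({"Bactrocera": 3, "Ceratitis": 2})

# Keep only detections above the (assumed) confidence threshold and count per genus.
counts = Counter(name for name, score in detections if score >= 0.5)
print("counts above threshold:", dict(counts))

# Crude per-class recall: detected examples capped at the ground-truth count,
# divided by all true examples (a stand-in for IoU-based matching).
for cls in CLASSES:
    recall = min(counts[cls], ground_truth[cls]) / ground_truth[cls]
    print(f"{cls}: recall ~= {recall:.2f}")
```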
  • Seshadri, Sangita (2020)
    Blurring is a common form of degradation in image formation, caused by factors such as motion between the camera and the object, atmospheric turbulence, or the camera failing to keep the object in focus. Each pixel then interacts with its neighbours, and the captured image is blurry as a result. This interaction with the neighboring pixels is the 'spread', which is represented by the point spread function. Image deblurring has many applications, for example in astronomy and medical imaging, where extracting the exact image required might not be possible due to various limiting factors, and what we get is a deformed image. In such cases, it is necessary to use an apt deblurring algorithm, keeping necessary factors like performance and time in mind. This thesis analyzes the performance of learning-based and analytical methods for image deblurring. Inverse problems are discussed first, and why ill-posed inverse problems like image deblurring cannot be tackled by naive deconvolution. This is followed by the need for regularization, and how it is necessary to control the fluctuations resulting from extreme sensitivity to noise. The image reconstruction problem takes the form of a convex variational problem, with prior knowledge acting as inequality constraints that create a feasible region for the optimal solution; interior point methods iterate within this feasible region. This thesis uses the iRestNet method, which is based on a forward-backward iterative approach, as the machine learning algorithm, and a total variation approach implemented using the FlexBox tool, which uses a primal-dual scheme, as the analytical method. The performance is measured using SSIM indices for a range of kernels, and the SSIM map is also analyzed to compare deblurring efficiency.
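    As a small illustration of why naive deconvolution fails and regularization helps, the sketch below blurs a test image, inverts it naively in the Fourier domain, applies a regularized (Wiener) deconvolution, and compares both against the original with SSIM; it uses SciPy and scikit-image as stand-ins, not the thesis’s iRestNet or FlexBox implementations, and the kernel size and noise level are illustrative choices:
```python
import numpy as np
from scipy.signal import convolve2d
from skimage import data, restoration
from skimage.metrics import structural_similarity

image = data.camera().astype(float) / 255.0
psf = np.ones((5, 5)) / 25.0                      # simple box-blur point spread function
blurred = convolve2d(image, psf, mode="same", boundary="wrap")
blurred += 0.01 * np.random.default_rng(0).standard_normal(blurred.shape)

# Naive deconvolution: division in the Fourier domain amplifies noise badly.
H = np.fft.fft2(psf, s=image.shape)
naive = np.real(np.fft.ifft2(np.fft.fft2(blurred) / (H + 1e-12)))

# Regularized deconvolution (Wiener) controls the noise amplification.
regularized = restoration.wiener(blurred, psf, balance=0.1)

for name, result in [("naive", naive), ("regularized", regularized)]:
    score = structural_similarity(image, result, data_range=1.0)
    print(f"{name:12s} SSIM = {score:.3f}")
```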
  • Kurki, Lauri (2021)
    Atomic force microscopy (AFM) is a widely utilized characterization method capable of capturing atomic-level detail in individual organic molecules. However, an AFM image contains relatively little information about the deeper atoms in a molecule, and thus the interpretation of AFM images of non-planar molecules offers significant challenges for human experts. An end-to-end solution starting from an AFM imaging system and ending in an automated image interpreter would be a valuable asset for all research utilizing AFM. Machine learning has become a ubiquitous tool in all areas of science. Artificial neural networks (ANNs), a specific machine learning tool, have also become a popular method in many fields, including medical imaging, self-driving cars and facial recognition systems. In recent years, progress towards interpreting AFM images from more complicated samples has been made utilizing ANNs. In this thesis, we aim to predict sample structures from AFM images by modeling the molecule as a graph and using a generative model to build the molecular structure atom-by-atom and bond-by-bond. The generative model uses two types of ANNs: a convolutional attention mechanism to process the AFM images and a graph neural network to process the generated molecule. The model is trained and tested using simulated AFM images. The results of the thesis show that the model has the capability to learn even slight details from complicated AFM images, especially when the model only adds a single atom to the molecule. However, there are challenges to overcome in the generative model before it can become part of a fully capable end-to-end AFM process.
  • Vesalainen, Ari (2022)
    Digitization has changed history research. The materials are available, and online archives make it easier to find the correct information and speed up searches. The remaining challenge is how to use modern digital methods to analyze the text of historical documents in more detail. This is an active research topic in digital humanities and computer science. Document layout analysis is where computer vision object detection methods can be applied to historical documents to identify the objects present on the document pages (i.e., page elements). The recent development in deep learning based computer vision provides excellent tools for this purpose. However, most reviewed systems focus on coarse-grained methods, where only the high-level page elements are detected (e.g., text, figures, tables). Fine-grained detection methods are required to analyze texts on a more detailed level; for example, footnotes and marginalia must be distinguished from the body text to enable proper analysis. The thesis studies how image segmentation techniques can be used for fine-grained OCR document layout analysis: how can fine-grained page segmentation and region classification systems be implemented in practice, and what are the accuracy and the main challenges of such a system? The thesis includes implementing a layout analysis model that uses an instance segmentation method (Mask R-CNN). This implementation is compared against another existing layout analysis system using a semantic segmentation method (a U-net based P2PaLA implementation).
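    A minimal sketch of instance-segmentation inference in the spirit of the Mask R-CNN layout model; the COCO-pretrained torchvision weights, the 0.5 confidence threshold and the page image path are placeholder assumptions, not the thesis’s trained model or data:
```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# COCO-pretrained Mask R-CNN as a stand-in; a layout model would be fine-tuned
# so that labels correspond to page elements (body text, footnotes, marginalia, ...).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

page = to_tensor(Image.open("page.jpg").convert("RGB"))   # hypothetical scanned page
with torch.no_grad():
    prediction = model([page])[0]

# Each detected instance comes with a box, a label, a confidence score and a mask.
for box, label, score in zip(prediction["boxes"], prediction["labels"], prediction["scores"]):
    if score > 0.5:
        print(label.item(), round(score.item(), 2), [round(v, 1) for v in box.tolist()])
```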
  • Toivettula, Karolina (2021)
    Around the world, cities are using branding as a discursive and strategic practice to adjust to intensified, ongoing competition for tourists, investments, events and skilled labour. Simultaneously, in an era of societal transition, sustainability issues have become a global topic, and cities have begun to brand themselves as ‘pioneers’ in sustainability. Gradually, place branding’s potential as a strategic instrument of urban development and change has been understood, and it is therefore increasingly applied in urban governance. This thesis focuses on this change in place branding and explores the relationship between place branding and sustainable development in the context of Helsinki’s branding. More specifically, I study how place branding can be harnessed as a transformative and strategic tool to further sustainable urban development. The theoretical foundation is built on place branding literature that takes into consideration the diverse and transformative role of place branding. I reinforce the place branding theory with the concept of imaginaries, which are visions of the future utilised to steer decision-making and further policies. Imaginaries can act as technologies of governance, through which cities delegate responsibility to citizens in order to guide them towards a specific aim, for instance ‘Sustainable Helsinki’. My research data consist of strategies and a website produced by the City of Helsinki. The material addresses sustainable development, and the City’s branding runs through all of the content. I analyse the content through frame analysis to find out how Helsinki frames itself in terms of sustainable development and whether any imaginaries attempt to steer citizens to take responsibility for their sustainability actions. My research findings confirm the increasingly common perception in place brand research according to which place branding can be used as a comprehensive strategic tool in urban development. In Helsinki, place branding has moved from mere city marketing towards a governance strategy whose objective is both to manage perceptions about places and to shape the place according to the city’s strategies and policies. What also stood out was the emphasis on economic sustainability, which was visible even in sections that addressed the other two dimensions, environmental or social. This finding highlights how Helsinki’s branding is heavily influenced by the common narratives of economic success and international competition. Central findings of my research were that Helsinki uses both competitive and cooperative ways of portraying itself in sustainable development and in succeeding in global competition. In both of these frames, Helsinki uses imaginaries of ‘Sustainable Helsinki’, but in different ways. In the competitive tone of voice, the delegation of responsibility is more implicit and indirect, since the focus is on the objective, not the process. In the cooperative framing, the imaginaries assign responsibility to people and businesses more directly. My research shows that there are several ways to guide people through place branding, but in Helsinki’s case, the city appeals to the freedom and independence of its locals.
  • Horkka, Kaisa (2016)
    I. Meyer-Schuster rearrangement: The Meyer-Schuster rearrangement is an atom-economical reaction in which α,β-unsaturated carbonyl compounds are formed from propargylic alcohols and their derivatives by formal migration of the carbonyl group. Brønsted acids have traditionally been used as stoichiometric catalysts, but more recently several Lewis acids, both salts and metal complexes, have become popular due to their selectivity and the mild conditions applied. Oxo-metal complexes are also often used despite the high reaction temperatures needed. In this thesis, various approaches and catalysts for the Meyer-Schuster rearrangement are reviewed. The reactivity of different substrates and the practicality of several types of catalysts are evaluated. In addition, different reaction mechanisms are discussed, as they are highly substrate- and catalyst-dependent. Propargylic alcohols have two functional groups to which the catalysts can coordinate according to their characteristics. Especially gold compounds, which are soft Lewis acids, have gained interest due to their specific coordination ability. II. Synthesis of potential FtsZ inhibitors: FtsZ, a bacterial homologue of tubulin, is essential in cell division. It polymerizes into a dynamic ring structure, which constricts to separate the new daughter cells. As FtsZ-directed compounds, 3-methoxybenzamide derivatives have been observed to inhibit the growth of certain bacteria. In this work, a set of molecules was synthesized based on PC190723, which is a 3-methoxybenzamide derivative. In order to study interactions between the protein and the inhibitors, fluorescent groups were attached. Polymer sedimentation tests and fluorescence spectroscopy were used to briefly study the biological behavior of the products. It was discovered that one of the products stabilizes the polymer and thus inhibits FtsZ activity.
  • Miettinen, Nina (2019)
    This master’s thesis is a case study of the perspectives that mobile youth in the Nairobi region have on their roles in the changing world. The study begins with the assumption that the hegemonic narrative of the modern nation-state is being challenged in the globalizing world. It focuses on the worldviews and identities of youth who have previously been awarded a scholarship to the United States via the Kennedy-Lugar Youth Exchange and Study (YES) program and who are currently active in the Nairobi region. Place and citizenship are used as the central theoretical concepts in the study. A multimethod design of qualitative approaches was used to conduct the empirical part of the research. The primary data were collected from the Nairobi region YES alumni by semi-structured interviews, observation and a focus group session. In addition, an expert interview was conducted, and public reports regarding the subject of the research were used as secondary data. The data were analyzed with the Atlas.ti software, combining coding and qualitative content analysis. The main findings of the study state that the YES alumni are globally oriented, mobile and flexible citizens who identify with multiple groups and places. They sense belonging to Kenya but also identify as global citizens. Values that emerged during the research are especially related to learning and experiencing, benevolence and being successful. The youth aim to develop skills that respond to the challenges presented by globalization. In addition, the participants describe the exchange experience in the United States as an important factor that has changed the course of their lives in one way or another. In conclusion, the Nairobi region YES alumni are in a position where adjusting to the changing world is possible, especially due to tertiary education and skills that may be applied in transnational expert careers. At the end of the research, it is suggested that the groups who do not adjust to the changes as flexibly, or whose worldviews and identities do not match the narratives of globalization and mobility, should be acknowledged next.
  • Roiha, Maikki (2024)
    The use of immersive technology is becoming more common in teaching and education. Teaching solutions that make use of it are also increasingly being developed and studied in science education. The aim of this study was to develop an immersive learning environment for a virtual study visit and to use it to examine the opportunities and challenges of synchronous 360° video in chemistry education. The study was carried out as design-based research, guided by the research question “what opportunities and challenges does synchronous, immersive 360° video have in implementing a virtual study visit in chemistry.” The theoretical problem analysis of the thesis examines the opportunities, challenges and established good practices of 360° videos. The background theory is reflected against the technological pedagogical content knowledge model, on the basis of which a chemistry study visit utilising 360° video was developed and tested with upper secondary school students. In the final phase of the development, an empirical problem analysis was carried out as a case study. Seven people participated in the study, five of whom took part in a thematic interview and two of whom were observed while working with the developed product. The interview data were analysed qualitatively using theory-guided content analysis. According to the results, the opportunities of using immersive and synchronous 360° video in teaching include the motivating nature of the teaching method, support for learning, increased accessibility of teaching, visual presentation of chemistry content and support for the relevance of chemistry. The immersive learning environment is perceived to support concentration on teaching and to strengthen the feeling of presence in the teaching situation. The perceived challenges include possible technical problems, a lack of authenticity in interaction, the limitations of virtuality in chemistry education, the technology-centredness of teaching, and the resources required to adopt the technology. The study can be used in developing learning environments based on 360° video and in researching the educational use of similar technologies. For meaningful classroom integration of immersive 360° video, the motivation for the educational use of the technology must come from the content being taught. For this, teachers need to practise their technological pedagogical content knowledge. Further research is needed on the learning effects of immersion, on the effect of 360° video in supporting the relevance of chemistry, and on the development of easily approachable teaching materials. The study was carried out as commissioned research for Nokia Oyj.
  • Manner, Helmiina (2013)
    Small molecules (haptens) are not normally antigenic. Antibodies can be raised against them by attaching them to a large carrier molecule, so that the body’s immune system recognizes them. Antibodies can be used to identify molecules at the cellular level (immunocytochemistry). Lignin is the second most abundant wood polymer in nature; it supports and seals plant cell walls. It is a phenolic polymer formed mainly from three precursors, whose structure is irregular and difficult to study. The structure of lignin can be studied immunocytochemically with antibodies. The literature review covers the structure of wood and lignin, general synthesis methods for hapten conjugates, lignin structures against which antibodies have been raised, and the use of antibodies to localize different lignin structures in the cell walls of different plant species. Lignin structures have been observed in different plant species, especially in the secondary cell wall. The experimental part of the thesis focuses on the synthesis of protein conjugates of four dimeric lignin model compounds (haptens). The selected haptens represent the most common structural units of lignin (β-O-4', β-β', β-5'). The arylglycerol-β-aryl ethers were synthesized using the so-called Nakatsubo method. The resinol structure (β-β') was synthesized enzymatically. An attempt was also made to synthesize the β-5' structure enzymatically. The haptens were attached to the protein via a carboxylic acid linker through an active ester. The finished hapten-protein conjugates were analyzed by MALDI-TOF mass spectrometry. Three hapten-protein conjugates were synthesized successfully. The synthesis of the protein conjugate of the phenylcoumaran structure (β-5') did not succeed. In the future it could be attempted, for example, via the Sonogashira reaction.
  • Heikinheimo, Vuokko (2015)
    Land use change refers to the modification of the Earth’s surface by humans. Land use/land cover change (in short, land change), especially the clearing of tree cover, is a major source of increased carbon dioxide (CO2) emissions contributing to anthropogenic climate change. In this study, carbon densities and changes in aboveground tree carbon (AGC) across different land cover types were mapped in the Taita Hills, Kenya, using field measurements, airborne laser scanning (ALS) data and classified satellite imagery. An existing biomass map was used for retrieving carbon densities for a part of the study area. For the lowland area, another biomass map was created with a regression model based on field measurements of 415 trees on 61 plots and metrics calculated from discrete-return ALS data. The explanatory variables in the linear regression model were the standard deviation and the 60% height percentile of return elevations. The carbon fraction was taken as 47% of aboveground biomass. Eleven land cover classes were classified from a satellite image with an object-based approach. The overall classification accuracy was 71.1%, with most confusion between the cropland and shrubland classes and between shrubland and thicket. Based on the biomass maps, carbon densities were calculated for the different land cover classes. Mean carbon densities were 89.0 Mg C ha-1 for indigenous broadleaved forests, 29.0 Mg C ha-1 for plantation forests, 15.6 Mg C ha-1 for woodland, 5.5 Mg C ha-1 for thicket, 3.2 Mg C ha-1 for shrubland, 8.1 Mg C ha-1 for croplands above 1220 meters above sea level (m a.s.l.) and 2.3 Mg C ha-1 for croplands below 1220 m a.s.l. Land cover maps from 1987, 1992, 2003 and 2011 were used for studying the impact of land change on aboveground carbon stocks. A reduction in carbon storage was observed between the years 1987, 1992 and 2003. An increase in total carbon stocks from 2003 to 2011 was observed as a result of an increased proportion of woodland, plantation forest and broadleaved forest. These changes should be further verified in a spatially explicit way. More detailed data should be used in order to understand the full complexity of the dynamics between land change and carbon stocks in the heterogeneous landscape of the Taita Hills.
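    A minimal sketch of the plot-level model described above: aboveground biomass regressed on two ALS metrics (standard deviation and 60th height percentile of return elevations), with carbon taken as 47% of biomass; the arrays are illustrative placeholders, not the 61-plot field data:
```python
import numpy as np
from numpy.linalg import lstsq

# Hypothetical plot-level predictors: [std of return heights, 60th height percentile] (m).
X = np.array([[3.1, 6.2], [5.4, 11.0], [1.2, 2.5], [7.8, 15.3]])
y = np.array([45.0, 120.0, 12.0, 210.0])          # aboveground biomass, Mg/ha

A = np.column_stack([np.ones(len(X)), X])          # add intercept term
coeffs, *_ = lstsq(A, y, rcond=None)               # ordinary least squares fit

biomass_pred = A @ coeffs                          # predicted biomass, Mg/ha
carbon_pred = 0.47 * biomass_pred                  # carbon as 47 % of biomass, Mg C/ha
print(np.round(carbon_pred, 1))
```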
  • Köhler, Daniel (2023)
    Numerical weather prediction models are the backbone of modern weather forecasting. They discretise and approximate the continuous multi-scale atmosphere into computable chunks. Thus, small-scale and complex processes must be parametrised rather than explicitly calculated. This introduces parameters that are estimated by empirical methods to best fit the observed behaviour of nature. However, changing the parameters also changes the properties of the model itself. This work quantifies the impact parameter optimisation has on ensemble forecasts. OpenEPS allows running automated ensemble forecasts in a scientific setting. Here, it uses the OpenIFS model at T255L91 resolution with a 20 min timestep to create 10-day forecasts, which are initialised every week in the period from 1.12.2016 to 30.11.2017. Four different experiments are devised to study the impact on the forecast. The experiments differ only in the parameter values supplied to OpenIFS; all other boundary conditions are held constant. The parameters for the experiments are obtained using the EPPES optimisation tool with different goals. The first experiment minimises the cost function by supplying knowledge of the ensemble initial perturbations. The second experiment takes a set of parameters with a worse cost function value. Experiments three and four replicate experiments one and two, with the difference that the ensemble initial perturbations are not provided to EPPES. The quality of an ensemble forecast is quantified with a series of metrics. Root mean squared error, spread, and the continuous ranked probability score are used with ERA5 reanalysis data as the reference, while the filter likelihood score provides a direct comparison with observations. The results are summarised in comprehensive scorecards. This work shows that optimising the parameters decreases the root mean squared error and the continuous ranked probability score of the ensemble forecast. However, if the initial perturbations are included in the optimisation, the spread of the ensemble is strongly limited. It could also be shown that this effect is reversed if the parameters are tuned with a worse cost function. Nonetheless, when the initial perturbations are excluded from the optimisation process, a better model can be achieved without sacrificing the ensemble spread.
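    A minimal sketch of the verification metrics mentioned above (RMSE of the ensemble mean, ensemble spread, and an empirical CRPS), computed with NumPy; the arrays are synthetic placeholders, not OpenIFS output or ERA5 data:
```python
import numpy as np

rng = np.random.default_rng(1)
members = rng.normal(loc=15.0, scale=1.5, size=(20, 1000))   # 20 ensemble members x 1000 points
reference = rng.normal(loc=15.0, scale=1.0, size=1000)       # stand-in "analysis" values

ens_mean = members.mean(axis=0)
rmse = np.sqrt(np.mean((ens_mean - reference) ** 2))          # RMSE of the ensemble mean
spread = np.mean(members.std(axis=0, ddof=1))                 # mean ensemble standard deviation

# Empirical CRPS: E|X - y| - 0.5 * E|X - X'|, averaged over verification points.
term1 = np.mean(np.abs(members - reference), axis=0)
term2 = 0.5 * np.mean(np.abs(members[:, None, :] - members[None, :, :]), axis=(0, 1))
crps = np.mean(term1 - term2)

print(f"RMSE={rmse:.3f}  spread={spread:.3f}  CRPS={crps:.3f}")
```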
  • Besel, Vitus (2020)
    We investigated the impact of various parameters on new particle formation rates predicted for the sulfuric acid - ammonia system using cluster distribution dynamics simulations, in our case ACDC (Atmospheric Cluster Dynamics Code). The predicted particle formation rates increase significantly if the rotational symmetry numbers of the monomers (sulfuric acid and ammonia molecules, and bisulfate and ammonium ions) are considered in the simulation. On the other hand, inclusion of the rotational symmetry numbers of the clusters changes the results only slightly, and only in conditions where charged clusters dominate the particle formation rate, because most of the clusters stable enough to participate in new particle formation display no symmetry and therefore have a rotational symmetry number of one; the few exceptions to this rule are positively charged. Further, we tested the influence of applying a quasi-harmonic correction for low-frequency vibrational modes. Generally, this decreases the predicted new particle formation rates and significantly alters the shape of the formation rate curve plotted against the sulfuric acid concentration. We found that the impact of the maximum size of the clusters explicitly included in the simulations depends on the simulated conditions: the errors due to the limited set of simulated clusters generally increase with temperature and decrease with vapor concentrations. The boundary conditions for clusters that are counted as formed particles (outgrowing clusters) have only a small influence on the results, provided that the definition is chemically reasonable and the set of simulated clusters is sufficiently large. We compared the predicted particle formation rates with experimental data measured at the CLOUD (Cosmics Leaving OUtdoor Droplets) chamber. The cluster distribution dynamics model shows improved agreement with the experiments when using our new input data and the proposed combination of symmetry and quasi-harmonic corrections, compared to an earlier study based on older quantum chemical data.
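    For context, the monomer symmetry numbers matter because the rotational symmetry number σ enters the rigid-rotor rotational partition function as a 1/σ factor and thus shifts the Gibbs free energy of cluster formation; a minimal sketch of the standard statistical-mechanics argument (not necessarily the exact bookkeeping used in ACDC) is:
```latex
% Symmetry-number contribution to the cluster formation free energy
% (rigid-rotor convention; \sigma_i are the monomer symmetry numbers).
q_{\mathrm{rot}} \propto \frac{1}{\sigma}
\quad\Longrightarrow\quad
\Delta G_{\mathrm{sym}}
  = k_{\mathrm{B}} T \Bigl( \ln \sigma_{\mathrm{cluster}}
    - \sum_i \ln \sigma_i \Bigr)
```
    With typical values such as σ = 2 for sulfuric acid, σ = 3 for ammonia and σ = 1 for most stable clusters, the correction lowers the formation free energies and hence raises the predicted formation rates, consistent with the behaviour reported above.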
  • Deng, Huining (2015)
    This thesis is about learning the globally optimal Bayesian network structure from a fully observed dataset by using score-based methods. This structure learning problem is NP-hard and has attracted the attention of many researchers. We first introduce the necessary background of the problem, then review various score-based methods and algorithms proposed for solving it. Parallelization has come under the spotlight during recent years, as it can utilize the shared memory and computing power of multi-core supercomputers or computer clusters. We implemented a parallel algorithm, Para-OS, which is based on dynamic programming. Experiments were performed in order to evaluate the algorithm. We also propose an improved version of Para-OS, which separates the scoring phase completely from the learning phase, performs score pruning by using a Sparse Parent Graph, and largely reduces the communication between processors. Empirical results show that the new version saves memory compared to Para-OS and provides good runtime with multi-threading.
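    A minimal sketch of the subset dynamic programming that exact score-based structure learning builds on (and that algorithms such as Para-OS parallelize): for every variable subset, pick the best "last" variable together with its best parent set from the remaining variables; the local score below is a stand-in, not the scoring function used in the thesis:
```python
from itertools import combinations

variables = (0, 1, 2, 3)

def local_score(v, parents):
    # Placeholder local score; a real implementation would use e.g. BDeu or BIC
    # computed from the data.
    return -len(parents) * 0.1 - abs(v - sum(parents, 0)) * 0.01

def best_parents(v, candidates):
    # Best-scoring parent set for v among all subsets of the candidate variables.
    return max(
        (frozenset(p) for r in range(len(candidates) + 1)
         for p in combinations(candidates, r)),
        key=lambda p: local_score(v, p),
    )

# DP over subsets: score*(W) = max_{v in W} [ bestScore(v, W \ {v}) + score*(W \ {v}) ].
best = {frozenset(): 0.0}
for size in range(1, len(variables) + 1):
    for subset in map(frozenset, combinations(variables, size)):
        best[subset] = max(
            best[subset - {v}] + local_score(v, best_parents(v, subset - {v}))
            for v in subset
        )

print(best[frozenset(variables)])   # score of the globally optimal network
```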
  • Kivivuori, Eve (2023)
    In this thesis, we discuss the Relative Lempel-Ziv (RLZ) lossless compression algorithm, our implementation of it, and the performance of RLZ in comparison to more traditional lossless compression programs such as gzip. Like the LZ77 compression algorithm, the RLZ algorithm compresses its input by parsing it into a series of phrases, each of which is encoded as a position+length number pair describing the location of the phrase within a text. Unlike ordinary LZ77, where these pairs refer to earlier points in the same text and thus decompression must happen sequentially, in RLZ the pairs point to an external text called the dictionary. The benefit of this approach is faster random access to the original input given its compressed form: with RLZ, we can rapidly (in linear time with respect to the compressed length of the text) begin decompression from anywhere. With non-repetitive data, such as the text of a single book, website, or one version of a program's source code, RLZ tends to perform worse than traditional compression methods, both in terms of compression ratio and in terms of runtime. However, with very similar or highly repetitive data, such as the entire version history of a Wikipedia article or many versions of a genome sequence assembly, RLZ can compress data better than gzip and approximately as well as xz. Dictionary selection requires care, though, as compression performance relies entirely on it.
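    A minimal sketch of the RLZ idea: greedily parse the input into the longest phrases occurring in an external dictionary and store each phrase as a (position, length) pair into the dictionary, with a literal fallback for characters absent from it; the naive matching loop below stands in for the suffix-array or index-based matching a real implementation would use:
```python
def rlz_compress(text, dictionary):
    phrases, i = [], 0
    while i < len(text):
        best_pos, best_len = -1, 0
        for j in range(len(dictionary)):          # naive search for the longest match
            k = 0
            while (i + k < len(text) and j + k < len(dictionary)
                   and text[i + k] == dictionary[j + k]):
                k += 1
            if k > best_len:
                best_pos, best_len = j, k
        if best_len == 0:                          # literal: character not in the dictionary
            phrases.append((text[i], 0))
            i += 1
        else:
            phrases.append((best_pos, best_len))
            i += best_len
    return phrases

def rlz_decompress(phrases, dictionary):
    out = []
    for pos, length in phrases:
        out.append(pos if length == 0 else dictionary[pos:pos + length])
    return "".join(out)

dictionary = "abracadabra"
text = "abracadabra_abracadabra"
pairs = rlz_compress(text, dictionary)
assert rlz_decompress(pairs, dictionary) == text
```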
  • Blomgren, Roger Arne (2022)
    The cloud computing paradigm has risen, during the last 20 years, to the task of bringing powerful computational services to the masses. Centralizing the computer hardware to a few large data centers has brought large monetary savings, but at the cost of a greater geographical distance between the server and the client. As a new generation of thin clients has emerged, e.g. smartphones and IoT devices, the larger latencies induced by these greater distances can limit the applications that could benefit from using the vast resources available in cloud computing. Not long after the explosive growth of cloud computing, a new paradigm, edge computing, has risen. Edge computing aims at bringing the resources generally found in cloud computing closer to the edge, where many of the end-users, clients and data producers reside. In this thesis, I present the edge computing concept as well as the technologies enabling it. Furthermore, I present a few edge computing concepts and architectures, including multi-access edge computing (MEC), fog computing and intelligent containers (ICON). Finally, I also present a new edge orchestrator, the ICON Python Orchestrator (IPO), that enables intelligent containers to migrate closer to the users. The ICON Python Orchestrator tests the feasibility of the ICON concept and provides performance measurements that can be compared to other contemporary edge computing implementations. In this thesis, I present the IPO architecture design, including challenges encountered during the implementation phase and solutions to specific problems. I also show the testing and validation setup. Using the artificial testing and validation network, client migration speeds were measured in three different cases: redirection, cache-hot ICON migration and cache-cold ICON migration. While there is room for improvement, the migration speeds measured are on par with other edge computing implementations.
  • Rintaniemi, Ari-Heikki (2024)
    In this thesis, a Retrieval-Augmented Generation (RAG) based Question Answering (QA) system is implemented. The RAG framework is composed of three components: a data storage, a retriever and a generator. To evaluate the performance of the system, a QA dataset is created from Prime Minister Orpo's Government Programme. The QA pairs are created both by a human and by transformer-based language models. Experiments are conducted using the created QA dataset to evaluate the performance of different options for implementing the retriever (both traditional algorithmic methods and transformer-based language models) and the generator (transformer-based language models). The language model options used in the generator component are the same that were used for generating QA pairs for the QA dataset. Mean reciprocal rank (MRR) and semantic answer similarity (SAS) are used to measure the performance of the retriever and the generator component, respectively. The SAS metric turns out to be useful for providing an aggregate-level view of the performance of the QA system, but it is not an optimal evaluation metric for every scenario identified in the results of the experiments. The inference costs of the system are also analysed, as commercial language models are included in the evaluation. Analysis of the created QA dataset shows that the language models generate questions that tend to reveal information from the underlying paragraphs, or the questions do not provide enough context, making them difficult for the QA system to answer. The human-created questions are diverse and thus more difficult to answer than the language model generated questions. The QA pair source affects the results: the language models used in the generator component receive on average high-scoring answers to QA pairs which they had themselves generated. In order to create a high-quality QA dataset for QA system evaluation, human effort is needed for creating the QA pairs, but prompt engineering could also provide a way to generate more usable QA pairs. Evaluation approaches for the generator component need further research in order to find alternatives that would provide an unbiased view of the performance of the QA system.
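    A minimal sketch of the MRR computation used for retriever evaluation: each question scores the reciprocal of the rank of the first relevant passage, averaged over questions; the ranked lists and relevance sets below are placeholders, not outputs of the evaluated retrievers:
```python
def mean_reciprocal_rank(ranked_ids, relevant_ids):
    """Average of 1/rank of the first relevant document per query."""
    total = 0.0
    for ranking, relevant in zip(ranked_ids, relevant_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_ids)

# Hypothetical retrieval results for three questions and their relevant passages.
ranked = [["p3", "p7", "p1"], ["p2", "p9", "p4"], ["p5", "p6", "p8"]]
relevant = [{"p7"}, {"p2"}, {"p8"}]
print(mean_reciprocal_rank(ranked, relevant))   # (1/2 + 1 + 1/3) / 3
```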
  • Kuronen, Arttu (2023)
    Background: Continuous practices are common in today’s software development, and the terms DevOps, continuous integration, continuous delivery and continuous deployment are frequently used. While each of these practices helps make agile development more agile, using them requires a lot of effort from the development team, as they are not only about automating tasks but also about how development should be done. Of the three continuous practices mentioned above, continuous delivery and deployment focus on the deployability of the application. Implementing continuous delivery or deployment is a difficult task, especially for legacy software that can set limitations on how these practices can be taken into use. Aims: The aim of this study is to design and implement a continuous delivery process in a case project that does not have any type of automation regarding deployments. Method: Challenges of the current manual deployment process were identified, and based on the identified challenges, a model continuous delivery process was designed. The identified challenges were also compared to the academic literature on the topic, and solutions were taken into consideration when the model was designed. Based on the design, a prototype was created that automates the deployment. The model and the prototype were then evaluated to see how they address the previously identified challenges. Results: The model provides a more robust deployment process, and the prototype automates most of the bigger tasks in deployment and provides valuable information about the deployments. However, due to the limitations of the architecture, only some of the tasks could be automated. Conclusions: Taking continuous delivery or deployment into use in legacy software is a difficult task, as the existing software sets a lot of limitations on what can realistically be done. However, the results of this study show that continuous delivery is achievable to some degree even without larger changes to the software itself.
  • Vuorenkoski, Lauri (2024)
    There are two primary types of quantum computers: quantum annealers and circuit model computers. Quantum annealers are specifically designed to tackle particular problems, as opposed to circuit model computers, which can be viewed as universal quantum computers. Substantial efforts are underway to develop quantum-based algorithms for various classical computational problems. The objective of this thesis is to implement algorithms for solving graph problems using quantum annealer computers and to analyse these implementations. The aim is to contribute to the ongoing development of algorithms tailored for this type of machine. Three distinct types of graph problems were selected: all-pairs shortest path, graph isomorphism, and community detection. These problems were chosen to represent varying levels of computational complexity. The algorithms were tested using the D-Wave quantum annealer Advantage system 4.1, equipped with 5760 qubits. D-Wave provides a cloud platform called Leap and a Python library, Ocean tools, through which quantum algorithms can be designed and run using local simulators or real quantum computers in the cloud. Formulating the graph problems to be solved on quantum annealers was relatively straightforward, as the literature already contains formulations of these problems. However, running these algorithms on existing quantum annealer machines proved to be challenging. Even though quantum annealers currently boast thousands of qubits, the algorithms performed satisfactorily only on small graphs. The bottleneck was not the number of qubits but rather the limitations imposed by topology and noise. D-Wave also provides hybrid solvers that utilise both the Quantum Processing Unit (QPU) and the CPU to solve problems, which proved to be much more reliable than using a pure quantum solver.
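    A minimal sketch of how a graph problem is posed as a QUBO and solved with D-Wave’s Ocean tools, here a max-cut-style two-way split of a small graph as a simplified illustration rather than one of the thesis’s exact formulations; dimod’s ExactSolver runs locally, and the commented lines show how the same BQM would be sent to an annealer through Leap:
```python
import dimod

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# Max-cut QUBO: for each edge (u, v), minimize -x_u - x_v + 2*x_u*x_v.
Q = {}
for u, v in edges:
    Q[(u, u)] = Q.get((u, u), 0) - 1
    Q[(v, v)] = Q.get((v, v), 0) - 1
    Q[(u, v)] = Q.get((u, v), 0) + 2

bqm = dimod.BinaryQuadraticModel.from_qubo(Q)
sampleset = dimod.ExactSolver().sample(bqm)       # brute-force reference solver, runs locally
print(sampleset.first.sample, sampleset.first.energy)

# On hardware (requires a Leap account and API token):
# from dwave.system import DWaveSampler, EmbeddingComposite
# sampleset = EmbeddingComposite(DWaveSampler()).sample(bqm, num_reads=100)
```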
  • Leino, Henrik (2022)
    Low-level wind shear is a significant aviation hazard. A sudden reduction in the headwind along an aircraft’s flight path can induce a loss of lift, from which the aircraft may not be able to recover when it is close to the ground. Airports therefore use low-level wind shear alert systems to monitor wind velocities within the airport terminal area and to alert to any detected hazardous wind shear. There exist three ground-based sensor systems capable of independently observing low-level wind shear: a Doppler weather radar-based, a Doppler wind lidar-based, and an anemometer-based system. However, as no single sensor system is capable of all-weather wind shear observations, multiple alert systems are used simultaneously, and observations from each system are integrated to produce one set of integrated wind shear alerts. Algorithms for integrating Doppler weather radar and anemometer wind shear observations were originally developed in the early 1990s. However, the addition of the Doppler wind lidar-based alert system in more recent years warrants updates to the existing radar/anemometer integration algorithms. This thesis presents four different replacement candidates for the original radar/anemometer integration algorithms. A grid-based integration approach, where observations from the different sensor systems are mapped onto a common grid and integrated, is found to best accommodate the central integration considerations and is recommended as the replacement for the original radar/anemometer algorithms in operational use. The grid-based approach is discussed in further detail, and a first possible implementation of the algorithm is presented. In addition, ways of validating the algorithm and adopting it for operational use are outlined.
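    A minimal sketch of the grid-based integration idea: shear observations from the different sensor systems are mapped onto a common grid over the terminal area, and each cell keeps the most severe value reported by any sensor; the grid size, coordinates, values and the 15 kt alert threshold are illustrative assumptions, not the thesis’s operational algorithm:
```python
import numpy as np

grid = np.full((10, 10), np.nan)                 # common analysis grid (headwind change, kt)

def add_observations(grid, observations):
    """Map (row, col, value) observations onto the grid, keeping the most severe value per cell."""
    for row, col, value in observations:
        if np.isnan(grid[row, col]) or abs(value) > abs(grid[row, col]):
            grid[row, col] = value
    return grid

radar_obs = [(2, 3, -18.0), (2, 4, -22.0)]       # hypothetical radar-derived shear values
lidar_obs = [(2, 3, -25.0), (5, 5, -12.0)]       # hypothetical lidar-derived shear values
anemometer_obs = [(5, 5, -16.0)]

for obs in (radar_obs, lidar_obs, anemometer_obs):
    add_observations(grid, obs)

# Alert if any cell along a (hypothetical) runway corridor exceeds the assumed 15 kt threshold.
corridor = grid[2, :]
print("wind shear alert:", np.nanmax(np.abs(corridor)) >= 15)
```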