
Browsing by department "Department of Computer Science"


  • Cervera Taboada, Alejandra (2012)
    High-throughput technologies have had a profound impact on transcriptomics. Prior to microarrays, measuring gene expression was not possible in a massively parallel way. Lately, deep RNA sequencing has been steadily gaining ground on microarrays in transcriptomics analysis. RNA-Seq promises several advantages over microarray technologies, but it also comes with its own set of challenges. Different approaches exist to tackle each of the required processing steps of RNA-Seq data. The proposed solutions need to be carefully evaluated to find the best methods depending on the particularities of the datasets and the specific research questions being addressed. In this thesis I propose a computational framework that allows the efficient analysis of RNA-Seq datasets. The parallelization of tasks and organization of the data files was handled by the Anduril framework, on which the workflow was implemented. Particular emphasis was placed on the quality control of the RNA-Seq files. Several measures were taken to prune the data of low-quality bases and reads that hamper the alignment step. Furthermore, various existing processing algorithms for transcript assembly and abundance estimation were tested. The best methods have been coupled together into an automated pipeline that takes the raw reads and delivers expression matrices at isoform and gene level. Additionally, a module for obtaining sets of differentially expressed genes under different conditions, or when measuring an experiment across a time course, is included.
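    The final step of such a pipeline delivers gene-level expression matrices. As a rough sketch of that idea (not the Anduril workflow itself), the snippet below merges hypothetical per-sample abundance files in a tab-separated "gene_id, count" format into one matrix; the file names and format are assumptions for illustration.
        # Sketch only: merge hypothetical per-sample abundance files into a
        # gene-level expression matrix (gene_id -> {sample: value}).
        import csv
        from collections import defaultdict

        def build_expression_matrix(sample_files):
            matrix = defaultdict(dict)
            for sample, path in sample_files.items():
                with open(path, newline="") as fh:
                    for gene_id, value in csv.reader(fh, delimiter="\t"):
                        matrix[gene_id][sample] = float(value)
            return matrix

        samples = {"ctrl_1": "ctrl_1.tsv", "case_1": "case_1.tsv"}  # invented inputs
        # expression = build_expression_matrix(samples)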
  • Laakso, Marko (University of Helsinki, 2007)
    This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotype arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references. The objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and their properties. The programs have been designed according to a pipeline architecture that allows their integration with other programs, such as biological databases and copy number analysis tools. The integration of the tools is crucial, as the genome-wide analysis of cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data is visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours, and a scoring scheme is given for these regions. The detection of homozygous regions, the cohort comparisons and the result annotations are all subject to assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples with different resolutions can be balanced with the genotype estimates of their haplotypes, and they can be used within the same study.
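    The core of such an analysis is the detection of runs of homozygous genotype calls. The sketch below illustrates that general idea only; it is not CohortComparator, and the genotype encoding ("AA", "AB", "BB") and minimum run length are illustrative assumptions.
        # Report runs of at least min_snps consecutive homozygous calls on one
        # chromosome; calls is a list of (position, genotype) sorted by position.
        def homozygous_regions(calls, min_snps=25):
            regions, run = [], []
            for pos, genotype in calls:
                if genotype in ("AA", "BB"):          # homozygous call extends the run
                    run.append(pos)
                else:                                  # heterozygous call ends the run
                    if len(run) >= min_snps:
                        regions.append((run[0], run[-1]))
                    run = []
            if len(run) >= min_snps:
                regions.append((run[0], run[-1]))
            return regions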
  • Ni, Shuai (2015)
    Gene expression programs driven by transcription factors (TFs) play pivotal roles in both normal cell differentiation and tumorigenesis. However, the links between genomic mutations and transcriptional dysregulation in gene expression profiles are largely unknown. Alterations in the DNA-binding affinity of transcription factors caused by single nucleotide polymorphisms (SNPs) are likely to be a major factor in many quantitative trait conditions, including familial predisposition to various types of diseases such as cancer. Therefore, identifying SNPs that cause gene dysregulation by changing TF DNA-binding specificity may reveal potential therapeutic targets in cancer. Several transcription factors from the ETS (E twenty-six) family, such as ERG, ETV1 and FLI1, are known for their oncogenic functions in prostate cancer and play pivotal roles in malignant transformation and tumor progression. Therefore, in this work we conducted a comparative genomic study between human and mouse, focusing on elements regulated by ETS family transcription factors and identifying SNPs that can potentially affect ETS family TF binding to enhancer elements, which, in turn, may initiate aberrant expression of the genes under their regulation. By combining this analysis with ChIP-seq data of ETS family members and known prostate cancer risk loci, we define a novel ETS-regulated enhancer element that may confer prostate cancer risk. Through targeted resequencing of an 8449 bp region in 184 cases and 188 controls, we identify several novel SNPs that may impact gene regulation underpinning prostate cancer susceptibility.
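    The general idea of scoring how a SNP changes TF binding affinity can be illustrated with a position weight matrix (PWM): score the reference and alternate allele sequences and compare. The sketch below is only a toy example; the PWM values and allele sequences are made up (GGAA is the canonical ETS core motif), and the thesis pipeline is not reproduced here.
        # Toy PWM scoring of a reference vs. alternate allele (illustrative numbers).
        PWM = {  # per-position log-odds scores for A, C, G, T
            0: {"A": -1.0, "C": -1.2, "G": 1.3, "T": -1.5},
            1: {"A": -1.4, "C": -1.0, "G": 1.2, "T": -1.6},
            2: {"A": 1.1, "C": -1.3, "G": -0.9, "T": -1.2},
            3: {"A": 1.0, "C": -1.1, "G": -1.0, "T": -1.3},
        }

        def pwm_score(seq):
            return sum(PWM[i][base] for i, base in enumerate(seq))

        ref, alt = "GGAA", "GGTA"                   # hypothetical ref/alt alleles
        delta = pwm_score(alt) - pwm_score(ref)     # negative delta = weaker binding
        print(f"binding score change: {delta:.2f}")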
  • Kruglaia, Anna (2016)
    Game design is a complicated, multifaceted creative process. While there are tools for developing computer games, tools that could assist with the more abstract creative parts of the process are underrepresented in the domain. One such part is the generation of game ideas. Ideation (idea generation) is researched by the computational creativity community in the contexts of design, story and poetry generation, music, and others. Some of the existing techniques can be applied to ideation for games. The process of generating ideas was investigated by applying said techniques to actual themes from game jams. The possibility of using metaphors produced by Metaphor Magnet together with ConceptNet was tested, and the results are presented as well.
  • Jin, Jiawei (2014)
    In user studies of human-computer interaction, experiments on new devices and techniques are often run on experiment software that is developed separately for each device and technique. A systematic experimental platform, capable of running experiments on a number of technologies, would facilitate the design and implementation of such experiments. To this end, a configurable framework was created that allows relative pointing and absolute pointing input to be enhanced with the adaptive pointing and smoothed pointing techniques. This thesis discusses both the internals of the framework and how a platform is developed based on it. Additionally, two calibration modules were designed to transform relative pointing input into absolute pointing and to obtain the parameters needed for smoothed pointing. As part of the deployment, an experiment module was built to provide a platform on which the enhanced pointing experience can be evaluated, generating the appropriate output according to the results of the experiment task. One key achievement presented in this thesis is that relative pointing devices can be integrated with adaptive pointing and smoothed pointing, which in general support absolute pointing devices. Another key result is that the experimental platform based on the configurable framework provides functions that meet the demands of professional pointing evaluation. ACM Computing Classification System (CCS): I.4.1 [Digitization and Image Capture]: Camera calibration, I.4.3 [Enhancement]: Smoothing, I.4.8 [Scene Analysis]: Tracking
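    As a hedged illustration of what smoothing pointer input can look like, the sketch below applies a fixed-parameter exponential filter to noisy (x, y) coordinates; the actual smoothed pointing technique in the thesis adapts its behaviour, which this toy filter does not.
        # Exponential smoothing of pointer coordinates (toy version, fixed alpha).
        def smooth(points, alpha=0.3):
            out, (sx, sy) = [], points[0]
            for x, y in points:
                sx = alpha * x + (1 - alpha) * sx     # blend new sample with history
                sy = alpha * y + (1 - alpha) * sy
                out.append((sx, sy))
            return out

        print(smooth([(100, 100), (130, 96), (92, 110)]))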
  • Wang, Lei (2017)
    Nowadays many hosts have more than one network interface. For example, mobile smartphones are generally equipped with both cellular and WiFi network interfaces. If users can utilize all available network interfaces simultaneously, there is a potential improvement in network redundancy and performance. To use multiple network interfaces simultaneously, many multipath communication protocols have been proposed. At the transport layer, Multipath TCP (MPTCP) is a protocol for multipath communication, which in fact is an extension to regular TCP. MPTCP allows traffic to be spread onto several TCP subflows that take different network paths. Its advantages are the ability to balance load, improve connection resilience in case of path failure and maximize connection throughput. Congestion control and packet scheduling are two important components of MPTCP design. Congestion control is in charge of controlling the induced network load, while packet scheduling is responsible for the distribution of data over multiple paths; improper scheduling decisions might introduce higher delay. Therefore, congestion control algorithms and packet schedulers are two components that greatly impact the performance of MPTCP. Four MPTCP congestion control algorithms, namely the Cubic congestion control algorithm, the linked increases algorithm (LIA), the opportunistic linked increases algorithm (OLIA) and the wVegas algorithm, and two schedulers, namely the round robin (RR) scheduler and the lowest RTT first (LowestRTT) scheduler, are studied in depth in this thesis. Furthermore, we design experiments to evaluate and compare the different MPTCP congestion control algorithms and packet schedulers. The experimental evaluation shows that the Cubic congestion control algorithm achieves the highest aggregate throughput of all, but is less fair to competing flows than the others. OLIA achieves aggregate throughput similar to LIA, but is fairer, more stable and more responsive than LIA, while wVegas is unstable in terms of responsiveness to network changes. In terms of fairness to competing TCP flows, wVegas behaves similarly to OLIA and better than LIA. As for the two MPTCP schedulers, the results show that the RR scheduler has lower delay jitter than the LowestRTT scheduler in the rate-limited traffic mode. Moreover, LIA and OLIA have lower delay jitter with the LowestRTT scheduler than wVegas has.
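    To make the scheduling component concrete, the sketch below shows a simplified lowest-RTT-first decision: among subflows whose congestion window still has room, pick the one with the smallest smoothed RTT. Real MPTCP schedulers run in the kernel and track far more state; the fields and numbers here are assumptions for illustration.
        # Simplified lowest-RTT-first subflow selection.
        from dataclasses import dataclass

        @dataclass
        class Subflow:
            name: str
            srtt_ms: float      # smoothed round-trip time
            cwnd: int           # congestion window (packets)
            in_flight: int      # unacknowledged packets

        def lowest_rtt_first(subflows):
            available = [s for s in subflows if s.in_flight < s.cwnd]
            return min(available, key=lambda s: s.srtt_ms) if available else None

        paths = [Subflow("wifi", 18.0, 10, 4), Subflow("lte", 45.0, 20, 2)]
        print(lowest_rtt_first(paths).name)   # -> "wifi" while its window has room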
  • Waltari, Otto Kustaa (2013)
    Advanced low-cost wireless technologies have enabled a huge variety of real-life applications in the past years. Wireless sensor technologies have emerged in almost every application field imaginable. Smartphones equipped with Internet connectivity and home electronics with networking capability have made their way into everyday life. The Internet of Things (IoT) is a novel paradigm that has risen to frame the idea of a large-scale sensing ecosystem, in which all possible devices could contribute. The definition of a thing in this context is very vague. It can be anything from passive RFID tags on retail packaging to intelligent transducers observing the surrounding world. The number of connected devices in such a worldwide sensing network would be enormous. This is ultimately challenging for the current Internet architecture, which is several decades old and based on host-to-host connectivity. The current Internet addresses content by location. It is based on point-to-point connections, which eventually means that every connected device has to be uniquely addressable through a hostname or an IP address. This paradigm was originally designed for sharing resources rather than data. Today the majority of Internet usage consists of sharing data, which is not what it was originally designed for. Various patchy improvements have come and gone, but a thorough architectural redesign is required sooner or later. Information-Centric Networking (ICN) is a new networking paradigm that addresses content by name instead of location. Its goal is to replace the current where with what, since the location of most content on the Internet is irrelevant to the end user. Several ICN architecture proposals have emerged from the research community, of which Content-Centric Networking (CCN) is the most significant one in the context of this thesis. We have come up with the idea of combining CCN with the concept of IoT. In this thesis we look at different ways to make use of hierarchical CCN content naming, in-network caching and other information-centric networking characteristics in a sensor environment. As a proof of concept we implemented a presentation bridge for a home automation system that provides services to the network through CCN.
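    A minimal illustration of hierarchical, location-independent content naming: a sensor reading is published under a human-readable name path rather than a host address, so consumers request the name and any cache along the way can answer. All name components below are invented for the sketch.
        # Build a hierarchical CCN-style content name for a sensor reading.
        def ccn_name(house, room, sensor, timestamp):
            return "/" + "/".join(["home", house, room, sensor, str(timestamp)])

        print(ccn_name("apartment42", "livingroom", "temperature", 1371200000))
        # -> /home/apartment42/livingroom/temperature/1371200000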
  • Tilli, Tuomo (2012)
    BitTorrent is one of the most used file sharing protocols on the Internet today. Its efficiency is based on the fact that when users download a part of a file, they simultaneously upload other parts of the file to other users. This allows users to efficiently distribute large files to each other, without the need for a centralized server. The most popular torrent site is the Pirate Bay, with more than 5,700,000 registered users. The motivation for this research is to find information about the use of BitTorrent, especially on the Pirate Bay website, which will be helpful for system administrators and researchers. We collected data on all of the torrents uploaded to the Pirate Bay from 25 December 2010 to 28 October 2011. Using this data we found that a small percentage of users is responsible for a large portion of the uploaded torrents. There are over 81,000 distinct users, but the top nine publishers have published more than 16% of the torrents. We examined the publishing behaviour of the top publishers. The top usernames were publishing so much content that it became obvious that there are groups of people behind the usernames. Most of the published content consists of video files, with a 52% share. We found that torrents are uploaded to the Pirate Bay website at a fast rate: about 92% of consecutive uploads have happened within 100 seconds of each other. However, the publishing activity varies a lot. These deviations in the publishing activity may be caused by downtime of the Pirate Bay website, fluctuations in the publishing activity of the top publishers, national holidays or weekdays. One would think that the publishing activity with so many independent users would be quite level, but surprisingly this is not the case. About 85% of the files of the torrents are less than 1.5 GB in size. We also discovered that torrents of popular feature films were uploaded to the Pirate Bay very soon after their release, and the top publishers appear to be competing over who releases the torrents first. The impact of the top publishers thus seems quite significant in the publishing of torrents.
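    The inter-upload statistic above (about 92% of consecutive uploads within 100 seconds) boils down to measuring the gaps between sorted upload timestamps. The sketch below shows that computation on invented timestamps, not the actual Pirate Bay data.
        # Fraction of consecutive upload gaps at or below a threshold (seconds).
        def fraction_within(timestamps, threshold_s=100):
            ts = sorted(timestamps)
            gaps = [b - a for a, b in zip(ts, ts[1:])]
            return sum(g <= threshold_s for g in gaps) / len(gaps)

        uploads = [0, 40, 95, 180, 210, 600]          # illustrative timestamps only
        print(f"{fraction_within(uploads):.0%} of gaps are 100 s or less")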
  • Hemminki, Samuli (2012)
    In this thesis we present and evaluate a novel approach to energy-efficient and continuous transportation behavior monitoring for smartphones. Our work builds on a novel adaptive hierarchical sensor management scheme (HASMET), which decomposes the classification task into smaller subtasks. In comparison to previous work, our approach improves the task of transportation behavior monitoring in three respects. First, by employing only the minimal set of sensors necessary for each subtask, we are able to significantly reduce the power consumption of the detection task. Second, using the hierarchical decomposition, we are able to tailor features and classifiers for each subtask, improving the accuracy and robustness of the detection task. Third, we are able to extend the detectable motorised modalities to cover the most common public transportation vehicles. All of these attributes are highly desirable for real-time transportation behavior monitoring and serve as important steps toward implementing the first truly practical transportation behavior monitoring on mobile phones. In the course of the research, we developed an Android application for sensor data collection and used it to collect over 200 hours of transportation data, along with 2.5 hours of energy consumption data from the sensors. We apply our method to the data to demonstrate that, compared to the current state of the art, our method offers higher detection accuracy, provides more robust transportation behavior monitoring and achieves a significant reduction in power consumption. For evaluating the results with respect to the continuous nature of transportation behavior monitoring, we use the event- and frame-based metrics presented by Ward et al.
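    A loose, purely illustrative sketch of the hierarchical decomposition idea: a cheap accelerometer-only stage decides whether the device is moving and whether the movement looks motorised, and only then are costlier sensors switched on for the finer-grained decision. The feature names, thresholds and labels are placeholders, not the HASMET classifiers.
        # Two-stage cascade: cheap features first, extra sensors only when needed.
        def classify_transportation(accel_features, read_extra_sensors):
            if accel_features["movement_energy"] < 0.1:
                return "stationary"                  # stage 1: accelerometer only
            if not accel_features["looks_motorised"]:
                return "walking_or_cycling"          # still accelerometer only
            extra = read_extra_sensors()             # stage 2: activate more sensors
            return "rail" if extra["low_frequency_vibration"] else "road_vehicle"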
  • Wang, Sai (2015)
    Robustness testing is an important aspect of web service testing. It focuses on a service's ability to deal with invalid input; therefore, robustness test cases aim at good coverage of input conditions. The behaviours of the participating services are described in a BPEL contract, and the services communicate with each other by sending SOAP messages. A BPEL process can be seen as a graph whose nodes and edges stand for activities and messages. Owing to the nature of business processes, we extend the traditional definition of robustness to web services in SOA ecosystems. Robustness test case generation focuses on generating test paths, or message sequences, and on generating the test data carried in SOAP messages. A web service contract contains the information related to test case generation. In this thesis, we divide contracts into three levels: document-level, model-level and implementation-level contracts. The model-level contract provides the information for test case generation: the BPEL contract supports test path generation, and the WSDL contract supports test data generation. By analysing the contents of the contracts, test cases can be generated. Petri nets and graph-based methods are chosen for message sequence generation, and data perturbation techniques are used for generating invalid test data.
  • Lagus, Jarkko (2016)
    In computer science, an introductory programming course is one of the very first courses taken. It sets the base for more advanced courses, where programming ability is usually assumed. Finding the students who are likely to fail the course allows early intervention and more focused help, which can potentially lower the risk of dropping out of later studies because of a lack of fundamental skills. One measure of programming ability is the outcome of a course, and predicting these outcomes is also the focus of this thesis. In the educational context, differences between courses pose huge challenges for traditional machine learning methods, as they assume that all data follow an identical distribution. Data collected from different courses can have very different distributions, since many factors can change even between consecutive courses, such as grading, contents, and platform. To address this challenge, transfer learning methods can be used, as they make no such assumption about the distribution. In this thesis, one specific transfer learning algorithm, TrAdaBoost, is evaluated against a selection of traditional machine learning algorithms. The methods are evaluated using real-life data from two different introductory programming courses, where contents, participants and grading differ. The main focus is on how these methods perform in the first weeks of the course, which are educationally the most critical moments.
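    A heavily simplified sketch of TrAdaBoost's central idea (the full algorithm derives the target-side multiplier from each round's weighted error, which is omitted here): after a boosting round, misclassified source-course examples are down-weighted because they look unlike the target course, while misclassified target-course examples are up-weighted as in ordinary AdaBoost. The constants and arrays below are assumptions for illustration.
        # One TrAdaBoost-style weight update over combined source + target data.
        import numpy as np

        def update_weights(w, is_source, misclassified, beta_source=0.8, beta_target=1.5):
            w = w.copy()
            w[is_source & misclassified] *= beta_source    # trust source example less
            w[~is_source & misclassified] *= beta_target   # focus on hard target cases
            return w / w.sum()                             # renormalise

        w = np.full(6, 1 / 6)
        is_source = np.array([True, True, True, False, False, False])
        errors = np.array([True, False, True, False, True, False])
        print(update_weights(w, is_source, errors))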
  • Rasooli Mavini, Zinat (2014)
    Massive improvements in public cloud services provide many opportunities for online users. One of the most valuable of these services is the infrastructure for storing data in distributed storages. Public cloud storages let organizations and enterprises benefit from highly available data in a cost-efficient way and with a lowered maintenance burden. However, utilizing the large-scale capacity of (public) cloud storages is not yet a mature trend among individual customers, businesses, and organizations. From a trust and privacy perspective, cloud storages are still unreliable places for sensitive and confidential information or back-up copies. Hence, some public and private organizations, universities, as well as ordinary citizens avoid uploading their critical files to the cloud. This thesis suggests the idea of customer-oriented data storages as a solution to the shortcomings of public cloud storages. The idea offers a new way to customize cloud storages that involves the customer more closely in management, addressing the current distrust of cloud-based storages and encouraging different types of customers to use them. Furthermore, the thesis evaluates the feasibility of the proposed customer-oriented cloud storage architecture based on scenarios inspired by the Architecture Tradeoff Analysis Method (ATAM) evaluation approach. The results of this evaluation, regarding how the proposed solution boosts trust in cloud storages and gives cloud storage customers more control, are presented.
  • Hinkka, Atte (2018)
    In this thesis we use statistical n-gram language models and the perplexity measure for language typology tasks. We interpret the perplexity of a language model as a distance measure when the model is applied to a phonetic transcript of a language the model was not originally trained on. We use these distance measures for detecting language families, detecting closely related languages, and for language family tree reproduction. We also study the sample sizes required to train the language models and estimate how large corpora are needed for the successful use of these methods. We find that trigram language models trained from automatically transcribed phonetic transcripts, together with the perplexity measure, can be used both for detecting language families and for detecting closely related languages.
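    The core computation can be sketched as follows: train a character trigram model on one transcript and use its perplexity on another transcript as a distance (lower perplexity meaning more similar). Add-one smoothing, the vocabulary size and the toy strings are simplifying assumptions, not the thesis setup.
        # Character trigram model with add-one smoothing and perplexity as distance.
        import math
        from collections import Counter

        def train_trigrams(text):
            tri = Counter(text[i:i + 3] for i in range(len(text) - 2))
            bi = Counter(text[i:i + 2] for i in range(len(text) - 2))  # trigram contexts
            return tri, bi

        def perplexity(model, text, vocab_size=50):
            tri, bi = model
            log_prob, n = 0.0, 0
            for i in range(len(text) - 2):
                t, b = text[i:i + 3], text[i:i + 2]
                p = (tri[t] + 1) / (bi[b] + vocab_size)   # add-one smoothed P(c3 | c1 c2)
                log_prob += math.log(p)
                n += 1
            return math.exp(-log_prob / n)

        model = train_trigrams("talo talossa talolla")     # toy "training corpus"
        print(perplexity(model, "talojen taloissa"))        # lower = more similar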
  • Ray, Debarshi (2012)
    Pervasive longitudinal studies in people's intimate surroundings involve gathering data about how people behave in their various places of presence. It is hard to be fully pervasive, as this has traditionally required sophisticated instrumentation that may be difficult to acquire and prohibitively expensive. Moreover, setting up such an experiment is laborious. We present a system, in the form of its requirements, design and implementation, that is primarily aimed at collecting data from people's homes. It aims to be as pervasive as possible, and can collect data about a family in the form of audio and video feeds from microphones and cameras, network logs and home appliance (e.g., TV) usage patterns. The data is then transported over the Internet to a server placed in close proximity to the researcher, while protecting it from unauthorised access. Instead of instrumenting the test subjects' existing devices, we build our own integrated appliance which is to be placed inside their houses and has all the necessary features for data collection and transportation. We build the system using cheap off-the-shelf commodity hardware and free and open source software, and evaluate different hardware and software configurations to see how well they can be integrated and how performant and reliable they are in real-life scenarios. Finally, we demonstrate a few simple techniques that can be used to analyze the data to gain some insights into the behaviour of the participants.
  • Tulilaulu, Aurora (2017)
    In this Master's thesis I present data-driven automatic composition, or data musicalization. Data musicalization is about making variables found in the data audible in automatically composed music. The intention is that the music would work like a visualization intended for the ears, illustrating selected attributes of the data. In the thesis I survey different ways in which sonification and automatic or computer-assisted composition have been done previously, and what kinds of applications they have. I go through the most commonly used ways of generating music, such as the most typical stochastic methods, grammars, and methods based on machine learning. I also briefly describe sonification, that is, mapping data directly to an audio signal without a musical element, and I comment on the strengths and weaknesses of the different methods. I also briefly discuss how far automated composition has come at its best, and how credible its results are in the eyes of human evaluators, using a few recognized composing programs as examples. I discuss two different musicalization programs that I have written. The first generates songs by condensing data collected from one night of the user's sleep into a piece lasting four to eight minutes. The second produces music in real time based on adjustable parameters, so it can be connected to another program that analyses data and changes the parameters. In the example discussed, the music is generated from a chat log, and for instance the tone and pace of the conversation affect the music. I go through the principles by which my programs generate music, and the reasons for the design decisions, using the basics of music theory and composition. I explain the principles by which the underlying data is, or can be made, audible in the music, that is, how musicalization differs from ordinary machine composition and from sonification, and how it is positioned at the boundary of these two existing research fields. Finally, I present the results of user experiments in which users were asked to evaluate how well the musicalization of chat logs works, and, based on these results and the current state of the field, I discuss possible applications of musicalization and possible future research on the topic.
  • Althermeler, Nicole (2016)
    Metagenomics promises to shed light on the functioning of microbial communities and their surrounding ecosystem. In metagenomic studies the genomic sequences of a collection of microorganisms are directly extracted from a specific environment. Up to 99% of microbes cannot be cultivated in the lab; thus, traditional analysis techniques have very limited applicability in this challenging setting. By directly extracting the sequences from the environment, metagenomic studies circumvent this dilemma. Thus, metagenomics has become a powerful tool in the analysis of the diversity and metabolic capability of environmental microbes. However, metagenomic studies have challenges of their own. In this thesis we investigate several aspects of metagenomic data set analysis, focusing on means of (1) verifying the adequacy of taxonomic unit and enzyme representation and annotation in the sample, (2) highlighting similarities between samples by principal component analysis, (3) visualizing metabolic pathways with manually drawn metabolic maps from the Kyoto Encyclopedia of Genes and Genomes, and (4) estimating taxonomic distributions of pathways with a novel strategy. A case study of deep bedrock groundwater metagenomic samples illustrates these methods. Water samples from boreholes up to 2500 meters deep, from two different sites in Finland, demonstrate the applicability and limitations of the aforementioned methods. In addition, publicly available metagenomic and genomic samples serve as baseline references. Our analysis resulted in a taxonomic and metabolic characterization of the samples. We were able to adequately retrieve and annotate the metabolic content of the deep bedrock samples. The visualization provided a tool for further investigation. The microbial community distribution could be characterized on higher levels of abstraction. Previously suspected similarities to fungi or archaea were not verified. The first promising results were observed with the novel strategy for estimating taxonomic distributions of pathways. Further results can be found at: http://www.cs.helsinki.fi/group/urenzyme/deepfun/
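    Step (2) above, highlighting similarities between samples, is a standard principal component analysis of a samples-by-taxa abundance matrix. The sketch below shows that step on invented numbers; the real input would come from the annotation pipeline, and the taxon columns are unnamed placeholders.
        # PCA of a samples-by-taxa relative-abundance matrix (illustrative values).
        import numpy as np
        from sklearn.decomposition import PCA

        abundances = np.array([
            [0.40, 0.30, 0.20, 0.10],    # sample 1
            [0.38, 0.32, 0.18, 0.12],    # sample 2 (similar to sample 1)
            [0.05, 0.10, 0.60, 0.25],    # sample 3 (a different community)
        ])

        pca = PCA(n_components=2)
        coords = pca.fit_transform(abundances)
        print(coords)                               # similar samples land close together
        print(pca.explained_variance_ratio_)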
  • Qian, Yuchen (2013)
    Collaborative applications like online markets, social network communities, and P2P file sharing sites are in popular use these days. However, in an environment where entities have had no prior interaction, it is difficult to make trust decisions about a new transacting partner. To predict the quality of further interactions, we need a reputation system to help establish trust relationships. Meanwhile, motivated by financial profit or personal gain, attackers emerge to manipulate the reputation systems. An attacker may aim to slander others, promote themselves, or undermine the whole reputation system. Vulnerable components in a reputation system might introduce potential threats that attackers can take advantage of. In order to give accurate reputation estimates and better user satisfaction, a reputation system should properly reflect the behavior of the participants and should be difficult to manipulate. To resist attacks, there are various defense mechanisms, for example, mitigating the generation and spreading of false rumors, reasonably assigning an initial reputation value to newcomers, and gradually discounting old behavior. However, each defense mechanism has limitations; there is no perfect defense mechanism that can resist all attacks in all environments without any trade-offs. As a result, to make a reputation system more robust, we need to analyze its vulnerabilities and limitations, and then apply the corresponding defense mechanisms to it. This thesis conducts a literature survey on reputation systems, their inherent vulnerabilities, different kinds of attack scenarios, and defense mechanisms. It discusses the evolution of attacks and defense mechanisms, evaluates various defense mechanisms, and offers suggestions on how to incorporate defense mechanisms into reputation systems.
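    One of the defense mechanisms mentioned above, gradually discounting old behavior, can be sketched as an exponentially time-weighted average of ratings, so that recent behavior dominates the estimate. The half-life, the rating scale and the neutral prior for newcomers are illustrative assumptions, not a mechanism evaluated in the thesis.
        # Time-discounted reputation: older ratings get exponentially less weight.
        import math

        def reputation(ratings, now, half_life=30.0):
            # ratings: list of (timestamp_days, score in [0, 1])
            num = den = 0.0
            for t, score in ratings:
                weight = math.exp(-math.log(2) * (now - t) / half_life)
                num += weight * score
                den += weight
            return num / den if den else 0.5     # neutral prior for newcomers

        print(reputation([(0, 1.0), (25, 1.0), (29, 0.0)], now=30))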
  • Keskioja, Sanna (University of Helsinki, 2007)
    Requirements engineering is an important phase in software development where the customer's needs and expectations are transformed into a software requirements specification. The requirements specification can be considered an agreement between the customer and the developer in which both parties agree on the expected system features and behaviour. However, requirements engineers must deal with a variety of issues that complicate the requirements process. The communication gap between the customer and the developers is among the typical reasons for unsatisfactory requirements. In this thesis we study how the use case technique could be used in requirements engineering to bridge the communication gap between the customer and the development team. We also discuss how use case descriptions can be used as a basis for acceptance test cases.
  • Bakharzy, Mohammad (2014)
    In the new era of the digital economy, agility and the ability to adapt to market changes and customers' needs are crucial for sustainable competitiveness. It is vital to identify and consider customers' and users' needs in order to make fact-driven decisions and to evaluate assumptions and hypotheses before actually allocating resources to them. Understanding the customers' needs and delivering valuable products or services based on deep customer insight demands continuous experimentation. Continuous experimentation refers to constantly collecting customers' and users' feedback to understand the real value of products and services, and to testing new ideas and hypotheses as early as possible with minimum resource allocation. Experimentation requires a technical infrastructure including tools, methods, processes, interfaces and APIs to collect, store, visualize and analyze the data. This thesis analyses the state of the practice and the state of the art regarding current tools with functionalities that support or might support continuous experimentation. The results of this analysis are a set of problems identified in current tools, as well as a set of requirements to be fulfilled to tackle those problems. Among the problems, the customizability of the tools to meet the needs of different companies and scenarios is of utmost importance. The lack of customizability in current tools has led companies to allocate their resources to developing their own proprietary tools tailored to their custom needs. Based on requirements that support better customizability, a prototype tool that supports continuous experimentation has been designed and implemented. The tool is evaluated in a real-world scenario with respect to the requirements and the customizability issue.
  • Hore, Sayantan (2015)
    Content-based image retrieval (CBIR) systems have become the state-of-the-art image retrieval technique over the past few years. They have shown commendable retrieval performance compared to traditional annotation-based retrieval. CBIR systems use relevance feedback as the input query. CBIR systems developed so far have not put much effort into suitable user interfaces for accepting relevance feedback efficiently, i.e. interfaces that place less cognitive load on the user and provide a higher amount of exploration in a limited amount of time. In this study we propose a new interface, 'FutureView', which allows peeking into the future, providing access to more images in less time than traditional interfaces. This idea helps the user to choose more appropriate images without getting diverted. We used the Gaussian process upper confidence bound algorithm for recommending images. We compared this algorithm with the Random and Exploitation algorithms, with positive results.
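    A rough sketch of the Gaussian process upper confidence bound idea used for recommending images: fit a GP to the relevance feedback gathered so far and show next the candidates with the highest mean plus beta times standard deviation. The image feature vectors, the feedback values and beta are invented for illustration; this is not the FutureView implementation.
        # GP-UCB selection of the next image to recommend.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor

        rated_features = np.array([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]])   # seen images
        feedback = np.array([1.0, 0.0, 0.6])                              # relevance scores
        candidates = np.array([[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]])       # unseen pool

        gp = GaussianProcessRegressor().fit(rated_features, feedback)
        mean, std = gp.predict(candidates, return_std=True)
        ucb = mean + 2.0 * std                    # beta = 2.0 weights exploration
        print(candidates[np.argmax(ucb)])         # feature vector of the next image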