Browsing by discipline "Computer science"

  • Ruottu, Toni (University of Helsinki, 2011)
    As the virtual world grows more complex, finding a standard way of storing data becomes increasingly important. Ideally, each data item would be brought into the computer system only once. References to data items need to be cryptographically verifiable, so that the data can maintain its identity while being passed around. This way there will be only one copy of the user's family photo album, while the user can use multiple tools to show or manipulate the album. Copies of the user's data could be stored on some of his family members' computers and some of his own computers, but also at some online services which he uses. When all actors operate on one replicated copy of the data, the system automatically avoids a single point of failure. Thus the data will not disappear when one computer breaks or one service provider goes out of business. One shared copy also makes it possible to delete a piece of data from all systems at once, at the user's request. In our research we tried to find a model that would make data manageable for users and make it possible to have the same data stored at various locations. We studied three systems, Persona, Freenet, and GNUnet, that suggest different models for protecting user data. The main application areas of the studied systems include securing online social networks, providing anonymous web access, and preventing censorship in file sharing. Each of the studied systems stores user data on machines belonging to third parties. The systems differ in the measures they take to protect their users from data loss, forged information, censorship, and being monitored. All of the systems use cryptography to secure the names used for the content and to protect the data from outsiders. Based on the gained knowledge, we built a prototype platform called Peerscape, which stores user data in a synchronized, protected database. Data items themselves are protected with cryptography against forgery, but not encrypted, as the focus has been on disseminating the data directly among family and friends instead of letting third parties store the information. We turned the synchronizing database into a peer-to-peer web by exposing its contents through an integrated HTTP server. The REST-like HTTP API supports development of applications in JavaScript. To evaluate the platform's suitability for application development we wrote some simple applications, including a public chat room, a BitTorrent site, and a flower-growing game. During our early tests we came to the conclusion that using the platform for simple applications works well. As web standards develop further, writing applications for the platform should become easier. Any system this complex will have its problems, and we are not expecting our platform to replace the existing web, but we are fairly impressed with the results and consider our work important from the perspective of managing user data.
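    The REST-like HTTP API mentioned above can be illustrated with a short sketch. The endpoint path, port, and JSON payload below are hypothetical placeholders rather than the actual Peerscape interface; the sketch only shows the general pattern of reading and writing data items over a local HTTP server.

        import json
        import urllib.request

        BASE = "http://localhost:8080/data"  # hypothetical local Peerscape-style endpoint

        def put_item(item_id, payload):
            """Store a data item by sending a JSON body with an HTTP PUT."""
            req = urllib.request.Request(
                f"{BASE}/{item_id}",
                data=json.dumps(payload).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="PUT",
            )
            with urllib.request.urlopen(req) as resp:
                return resp.status

        def get_item(item_id):
            """Fetch a stored data item and decode the JSON response."""
            with urllib.request.urlopen(f"{BASE}/{item_id}") as resp:
                return json.loads(resp.read().decode("utf-8"))

        # Example usage (requires a server listening on the hypothetical endpoint):
        # put_item("album-1", {"title": "Family photos", "photos": []})
        # print(get_item("album-1"))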
  • Markova, Laura (2015)
    The topic of this Master's thesis is digital rights management (DRM) in collaboration networks, more specifically the traceability of ownership. The thesis is a comparative literature review of DRM techniques suitable for use in large collaboration networks. It seeks to answer the question of which techniques exist for managing the rights of digital content and which of them are suitable for different kinds of collaboration networks. The thesis concludes that the best results are achieved by combining different DRM techniques. Traditional encryption, watermarks and fingerprints, as well as security features added to hardware, can be exploited together to obtain the best outcome. All of these protection techniques are under continuous development, and they are being made ever more secure and robust by using different encryption keys, encryption styles, and even multiple rounds of encryption. Versatile use of these protection techniques thus leads to the most reliable solution when digital content is to be shared in collaboration networks.
  • Piela, Riitta (2018)
    The number of software-intensive products is growing continuously. Exact statistics on the share of software-intensive products or on the growth of the sector are, however, difficult to produce, because an organization's primary industry determines the sector to which any resulting economic growth is attributed. A common estimate, nevertheless, is that an increasing share of the products and services on the market contain digital components. This thesis examines digitalization as a societal phenomenon whose effects appear to organizations as sometimes very abrupt changes in customer needs. In the competitive situation accelerated by digitalization, services and products must respond to customer needs and preferences better and faster than before. The purpose of this thesis is to identify, in the form of a literature review, the key factors that are central to the competitiveness of organizations. The software process plays a central role when organizations compete for markets. The thesis also discusses the role of the IT organization in developing new innovations and business models. In this thesis, an IT organization is either the IT unit of a larger organization or an independent IT provider whose services another organization uses. An organization refers to a private or public sector actor. Horizontal management, the organization's capability for change, and a model of continuous learning for personnel form the foundation of an organization's ability to respond to competition. It is also important for competitiveness that the software process operates as part of the development of the organization's business. This enables the creation of new innovations and, through them, entirely new business models: digital transformation. For the software process it is important to be able to quickly deliver products and services that please customers. Better identification of the customer and activating the customer as part of the software process are also important factors in getting successful products and services to market. ACM Computing Classification System (CCS): [Software and its engineering~Software creation and management] [Applied computing~Business-IT alignment] [Applied computing~IT governance]
  • Singh, Maninder Pal (2016)
    Research in the healthcare domain is primarily focused on diseases based on the physiological changes of an individual. Physiological changes are often linked to multiple streams originating from different biological systems of a person. The streams from various biological systems together form attributes for the evaluation of symptoms or diseases. The interconnected nature of different biological systems encourages the use of an aggregated approach to understand symptoms and predict diseases. These streams, or physiological signals, obtained from healthcare systems contribute a vast amount of vital information to healthcare data. Advances in technology make it possible to capture physiological signals over long periods, but most of the data acquired from patients is observed only momentarily or remains underutilized. The continuous nature of physiological signals demands context-aware real-time analysis. These research aspects are addressed in this thesis using a large-scale data processing solution. We have developed a general-purpose distributed pipeline for cumulative analysis of physiological signals in medical telemetry. The pipeline is built on top of a framework which performs computation on a cluster in a distributed environment. The emphasis is on the creation of a unified pipeline for processing streaming and non-streaming physiological time series signals. The pipeline provides fault-tolerance guarantees for the processing of signals and scales to multiple cluster nodes. In addition, the pipeline enables indexing of physiological time series signals and provides visualization of real-time and archived time series signals. The pipeline provides interfaces that allow physicians or researchers to use distributed computing for low-latency and high-throughput signal analysis in medical telemetry.
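    As an illustration of the kind of windowed processing such a pipeline performs on continuous signals, the following is a minimal single-machine sketch; it is not the distributed implementation described in the thesis, and the window length and alarm threshold are made-up parameters.

        from collections import deque

        def sliding_mean(samples, window=250, threshold=120.0):
            """Yield (index, window mean, alarm flag) over a stream of samples,
            flagging windows whose mean exceeds a made-up threshold."""
            buf = deque(maxlen=window)
            for i, value in enumerate(samples):
                buf.append(value)
                if len(buf) == window:
                    mean = sum(buf) / window
                    yield i, mean, mean > threshold

        # Example: simulated heart-rate samples.
        stream = [70 + (i % 7) for i in range(1000)]
        for idx, mean, alarm in sliding_mean(stream, window=100, threshold=75.0):
            if alarm:
                print(f"window ending at {idx}: mean {mean:.1f} above threshold")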
  • Laukkanen, Janne Johannes (2018)
    The vast amount of data created in the world today requires an unprecedented amount of processing power to be turned into valuable information. Importantly, more and more of this data is created on the edges of the Internet, where small computers, capable of sensing and controlling their environments, are producing it. Traditionally these so-called Internet of Things (IoT) devices have been utilized as sources of data or as control devices, and their rising computing capabilities have not yet been harnessed for data processing. Also, the middleware systems that are created to manage these IoT resources have heterogeneous APIs, and thus cannot communicate with each other in a standardized way. To address these issues, the IoT Hub framework was created. It provides a RESTful API for standardized communication and includes an execution engine for distributed task processing on the IoT resources. A thorough experimental evaluation shows that the IoT Hub platform can considerably lower the execution time of a task in a distributed IoT environment with resource-constrained devices. When compared to theoretical benchmark values, the platform scales well and can effectively utilize dozens of IoT resources for parallel processing.
  • Mäki, Jussi Olavi Aleksis (2013)
    With the increasing growth of data traffic in mobile networks there is an ever-growing demand from operators for a more scalable and cost-efficient network core. Recent successes in deploying Software-Defined Networking (SDN) in data centers and large network backbones have given it credibility as a viable solution for meeting the requirements of even large core networks. Software-Defined Networking is a novel paradigm where the control logic of the network is separated from the network elements into logically centralized controllers. This separation of concerns offers more flexibility in network control and makes writing new management applications, such as routing protocols, easier, faster and more manageable. This thesis is an empirical experiment in designing and implementing a scalable and fault-tolerant distributed SDN controller and management application for managing the GPRS Tunneling Protocol flows that carry the user data traffic within the Evolved Packet Core. The experimental implementation is built using modern open-source distributed system tools such as the Apache ZooKeeper distributed coordination service and Basho's Riak distributed key-value database. In addition to the design, a prototype implementation is presented and its performance is evaluated.
  • Mäkinen, Simo (2012)
    Test-driven development is a software development method where programmers compose program code by first implementing a set of small-scale tests which help in the design of the system and in the verification of the associated code sections. The reversed design and implementation process is unique: traditionally there is no attempt to verify program code that does not yet exist. Applying the practices of test-driven design to a software development process (a generally complex activity involving distinct individuals working in an organization) might have an impact not only on the process itself but on the outcome of the process as well. In order to assess whether test-driven development has perceivable effects on elements of software development, a qualitative literature survey, based on empirical studies and experiments in industry and academia, was performed. The aggregated results, extracted from the studies and experiments on eleven different internal and external process, product and resource quality attributes, indicate that there are positive, neutral and negative effects. Empirical evidence from industry implies that test-driven development has a positive, reducing effect on the number of defects detected in a program. There is also a chance that the code products are smaller, simpler and less complex than equivalent code products implemented without test-driven practices. While additional research is needed, it would seem that test-driven code is also easier for developers to maintain later; on average, maintenance duties took less time and the developers felt more comfortable with the code. The effects on the product attributes of coupling and cohesion, which describe the relationships between program code components, are neutral. Increased quality occasionally results in better impressions of the product when the test-driven products conform better to end-user tests, but there are times when end-users cannot discern the differences in quality between products made with different development methods. The small, unit-level tests written by the developers increase the overall size of the code products, since most of the program code statements are covered by the tests if a test-driven process is followed. Writing tests takes time, and the negative effects are associated with the effort required in the process. Industrial case studies report negative implications for productivity due to the extra effort, but student experiments have not always been able to replicate similar results under controlled conditions.
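    To make the test-first workflow concrete, the following is a minimal sketch using Python's unittest module: the test is written before the production code and initially fails, and the small function is then written to make it pass. The function and its behaviour are illustrative only, not taken from the studies surveyed in the thesis.

        import unittest

        # Step 2: the minimal production code written to make the tests pass.
        def normalize_username(raw):
            """Trim surrounding whitespace and lower-case a username."""
            return raw.strip().lower()

        # Step 1: the tests are written first and run before the implementation
        # exists (failing), after which the function above is added until they pass.
        class TestNormalizeUsername(unittest.TestCase):
            def test_strips_and_lowercases(self):
                self.assertEqual(normalize_username("  Alice "), "alice")

            def test_empty_input_stays_empty(self):
                self.assertEqual(normalize_username(""), "")

        if __name__ == "__main__":
            unittest.main()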
  • Kostiainen, Nikke (2017)
    Peliäänet ovat pelaajalle tarkoitettuja ääniä, joita peliohjelmisto toistaa suorituksen aikana. Työssä tarkastellaan ja arvioidaan erilaisten peliäänien kehityksen piirteitä ja vertaillaan mahdollisia ratkaisumenetelmiä keskenään. Työssä hyödynnettiin kirjallisia lähteitä ja työn osana toteutettiin yksinkertainen peliäänimoottori. Moottorin käytön helppoutta eri skenaarioissa arvioitiin sanallisesti. Vertailu toteutusmentelmien välillä suoritettiin käyttäen lähteistä löytyvää tietoa, sekä kokemuspohjaa olemassa olevien ratkaisujen käytöstä ja peliäänimoottorin toteutuksesta. Vertailussa todettiin eri ratkaisumenetelmien sopivan tiettyyn pelinkehitystilanteisiin paremmin ja toisiin huonommin. Itse kehitettävät pelimoottoriratkaisut sopivat hyvin tilanteisiin, joissa kehitettävä alusta on suorituskyvyltään rajattu tai peliäänimoottorin vaatimukset edellyttävät toimintoja, joita olemassa olevissa ratkaisuissa ei ole. Vastaavasti olemassaolevat ratkaisut sopivat hyvin suurempiin projekteihin, joissa peliäänien avulla tavoitellaan realistisuutta
  • Nyman, Thomas (2014)
    Operating system-level virtualization is virtualization technology based on running multiple isolated userspace instances, commonly referred to as containers, on top of a single operating system kernel. The fundamental difference compared to traditional virtualization is that the targets of virtualization in OS-level virtualization are kernel resources, not hardware. OS-level virtualization is used to implement Bring Your Own Device (BYOD) policies on contemporary mobile platforms. Current commercial BYOD solutions, however, do not allow applications to be containerized dynamically upon user request. The ability to do so would greatly improve the flexibility and usability of such schemes. In this work we study whether existing OS-level virtualization features in the Linux kernel can meet the needs of use cases reliant on such dynamic isolation. We present the design and implementation of a prototype which allows applications in dynamic isolated domains to be migrated from one device to another. Our design fits together with security features in the Linux kernel, allowing the security policy influenced by user decisions to be migrated along with the application. The deployability of the design is improved by basing the solution on functionality already available in the mainline Linux kernel. Our evaluation shows that the OS-level virtualization features in the Linux kernel indeed allow applications to be isolated in a dynamic fashion, although known gaps in the compartmentalization of kernel resources require trade-offs between security and interoperability to be made in the design of such containers.
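    As a rough illustration of the kernel isolation primitives that OS-level virtualization builds on, the sketch below launches a command inside new PID and mount namespaces using the util-linux unshare tool. It is only a hedged example of namespace isolation in general, not the prototype's actual containerization or migration mechanism.

        import subprocess

        def run_isolated(command):
            """Run a command in fresh PID and mount namespaces (requires root or
            equivalent privileges). This shows only the isolation primitive; a real
            container would also separate network, IPC, user and other resources."""
            return subprocess.run(
                ["unshare", "--pid", "--mount-proc", "--fork"] + command,
                check=True,
            )

        # Example usage: processes inside the namespace see their own PID numbering.
        # run_isolated(["ps", "ax"])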
  • Martinmäki, Petri (2013)
    The main purpose of this master's thesis is to present experiences of test automation in an industrial case and to make recommendations of best practices for the case. The recommendations are based on successful test automation stories found in the existing literature. The main issues hindering test automation seem to be similar in the case example and in the literature. The cost of implementation and maintenance, combined with unrealistic expectations, is perceived in almost every project. However, in the most successful projects a lot of effort has been put into planning and implementing maintainable sets of automated tests. In conclusion, the evidence from the literature shows that successful test automation needs investment, especially at the beginning of a project. A few specific best practices are adapted to the case project and presented in a form in which they could be applied.
  • Lübbers, Henning (2012)
    General-purpose lossless data compression continues to be an important aspect of the daily use of computers, and a multitude of methods and corresponding compression programs have emerged since information theory was established as a distinct area of research by C.E. Shannon shortly after World War II. Shannon and others discovered several theoretical bounds of data compression, and it is, for instance, nowadays known that there can be neither a compressor that compresses all possible inputs, nor any mechanism that compresses at least some inputs while preserving the lengths of the incompressible ones. Although it is therefore indisputable that any compressor must necessarily expand some inputs, it is nonetheless possible to limit the expansion of any input to a constant number of bits in the worst case. In order to determine how the established theoretical bounds relate to existing compression programs, I examined two popular compressors, GZip and BZip2, and concluded that their behaviour is not optimal in all respects, as they may expand inputs in the worst case more than theoretically necessary. On the other hand, the examined programs provide very good compression for most realistic inputs rather swiftly, a characteristic that is most likely appreciated by most computer users. Motivated by a review of the most fundamental bounds of data compression, i.e. Kolmogorov complexity, entropy and the Minimum Description Length principle, and further encouraged by the analysis of GZip and BZip2, I propose a generic, pipelined architecture in this thesis that can (at least in theory) be employed to achieve optimal compression in two passes over the input to be compressed. I subsequently put the proposed architecture directly to the test, and use it to substantiate my claim that the performance of compression (boosting) methods can be improved if they are configured for each input anew with a dynamically discovered set of optimal parameters. In a simple empirical study I use Huffman coding, a classic entropy-based compression method, as well as Move-To-Front coding (MTF), a compression boosting method designed to exploit locality among source symbols, to demonstrate that the choice of implied source alphabet influences the achieved compression ratio and that different test inputs require different source alphabets to achieve optimal compression.
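    Move-To-Front coding, mentioned above as a compression booster that exploits locality, is simple enough to sketch directly. The alphabet below is the full byte range; this is a generic textbook formulation rather than the exact configuration used in the thesis experiments.

        def mtf_encode(data: bytes):
            """Replace each byte by its current index in a recency list and move
            that byte to the front, so recently seen bytes get small codes."""
            alphabet = list(range(256))
            out = []
            for b in data:
                idx = alphabet.index(b)
                out.append(idx)
                alphabet.pop(idx)
                alphabet.insert(0, b)
            return out

        def mtf_decode(codes):
            """Invert the transform by replaying the same list updates."""
            alphabet = list(range(256))
            out = bytearray()
            for idx in codes:
                b = alphabet.pop(idx)
                out.append(b)
                alphabet.insert(0, b)
            return bytes(out)

        text = b"abracadabra"
        assert mtf_decode(mtf_encode(text)) == text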
  • Raatikainen, Marko (2013)
    Regulations for medical device software require that some of the tools used in development are validated. This thesis looks at how tool validation is done at GE Healthcare. Guidelines for more efficient tool validation are presented and tested in practice. The final version of the guidelines is evaluated by nine experts. According to the experts, the initial validation done by following the guidelines will be somewhat faster, and subsequent validations will be significantly faster. However, the guidelines rely on automation, including the automation of tests. More time is required if the tools are unfamiliar or if the tool being validated is not easy to test automatically.
  • Saarikoski, Kasperi (2016)
    Network-intensive smartphone applications are becoming increasingly popular. Examples of such trending applications are social applications like Facebook that rely on always-on connectivity, as well as multimedia streaming applications like YouTube. While the computing power of smartphones is constantly growing, the capacity of smartphone batteries is lagging behind. This imbalance has created an imperative for energy-efficient smartphone applications. One approach to increasing the energy efficiency of smartphone applications is to optimize their network connections via traffic shaping. Many existing proposals for shaping smartphone network traffic depend on modifications to the smartphone OS, the applications, or both. However, most modern smartphone OSes support establishing Virtual Private Networks (VPNs) from user-space applications. Our novel approach to traffic shaping takes advantage of this. We modified the OpenVPN tunneling software to perform traffic shaping by altering TCP flow control on tunneled packets. Subjecting heterogeneous network connections to traffic shaping without insight into traffic patterns causes serious problems for certain applications, one example being multimedia streaming applications. We developed a traffic identification feature which creates a mapping between Android applications and their network connections. We leverage this feature to selectively opt out of shaping traffic that is sensitive to shaping. We demonstrate this by selectively shaping background traffic in the presence of multimedia traffic. The purpose of the developed traffic shaper is to enhance the energy efficiency of smartphone applications. We evaluate the traffic shaper by collecting network traffic traces and assessing them with an RRC simulator. The four experiments cover multimedia streaming traffic, simulated background traffic, and concurrent multimedia and background traffic produced by simulation applications. We are able to enhance the energy efficiency of network transmissions across all experiments.
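    Traffic shaping itself can be illustrated with a generic token-bucket sketch: packets are released only when enough tokens have accumulated, which caps the average sending rate. This is a standard textbook shaper, not the TCP-flow-control mechanism built into the modified OpenVPN tunnel, and the rate and burst values are arbitrary.

        import time

        class TokenBucket:
            """Generic token-bucket shaper: allow roughly `rate` bytes per second
            with bursts of up to `burst` bytes."""
            def __init__(self, rate, burst):
                self.rate = rate
                self.burst = burst
                self.tokens = burst
                self.last = time.monotonic()

            def wait_for(self, nbytes):
                """Block until nbytes may be sent, then consume that many tokens."""
                while True:
                    now = time.monotonic()
                    self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
                    self.last = now
                    if self.tokens >= nbytes:
                        self.tokens -= nbytes
                        return
                    time.sleep((nbytes - self.tokens) / self.rate)

        # Example: shape simulated 1500-byte packets to roughly 100 kB/s.
        bucket = TokenBucket(rate=100_000, burst=20_000)
        for _ in range(5):
            bucket.wait_for(1500)
            # the actual packet send would go here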
  • Niskanen, Andreas (2017)
    Computational aspects of argumentation are a central research topic of modern artificial intelligence. A core formal model for argumentation, where the inner structure of arguments is abstracted away, was provided by Dung in the form of abstract argumentation frameworks (AFs). Syntactically, AFs are directed graphs with the nodes representing arguments and the edges representing attacks between them. Given an AF, sets of jointly acceptable arguments, or extensions, are defined via different semantics. The computational complexity of and algorithmic solutions to so-called static problems, such as the enumeration of extensions, are a well-studied topic. Since argumentation is a dynamic process, understanding the dynamic aspects of AFs is also important. However, computational aspects of dynamic problems have not been studied thoroughly. This work concentrates on different forms of enforcement, which is a core dynamic problem in the area of abstract argumentation. In this case, given an AF, one wants to modify it by adding and removing attacks in such a way that a given set of arguments becomes an extension (extension enforcement) or that given arguments are credulously or skeptically accepted (status enforcement). In this thesis, the enforcement problem is viewed as a constrained optimization task where the change to the attack structure is minimized. The computational complexity of the extension and status enforcement problems is analyzed, showing that they are in the general case NP-hard optimization problems. Motivated by this, algorithms are presented based on the Boolean optimization paradigm of maximum satisfiability (MaxSAT) for the NP-complete variants, and on counterexample-guided abstraction refinement (CEGAR) procedures, in which an interplay between MaxSAT and Boolean satisfiability (SAT) solvers is utilized, for problems beyond NP. The algorithms are implemented in the open-source software system Pakota, which is empirically evaluated on randomly generated enforcement instances.
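    For readers unfamiliar with AF semantics, the sketch below spells out two of the basic notions referred to above (conflict-freeness and admissibility) on a toy framework in plain Python; the enforcement algorithms themselves rely on MaxSAT encodings and are not reproduced here.

        def conflict_free(S, attacks):
            """S is conflict-free if no argument in S attacks another member of S."""
            return not any((a, b) in attacks for a in S for b in S)

        def defends(S, a, attacks, arguments):
            """S defends a if every attacker of a is attacked by some member of S."""
            return all(any((c, b) in attacks for c in S)
                       for b in arguments if (b, a) in attacks)

        def admissible(S, attacks, arguments):
            """S is admissible if it is conflict-free and defends all its members."""
            return conflict_free(S, attacks) and all(
                defends(S, a, attacks, arguments) for a in S)

        # Toy AF: a attacks b, b attacks c.
        arguments = {"a", "b", "c"}
        attacks = {("a", "b"), ("b", "c")}
        print(admissible({"a", "c"}, attacks, arguments))  # True: a defends c against b
        print(admissible({"b"}, attacks, arguments))       # False: b is attacked by a, undefended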
  • Repo, Laura (2015)
    This thesis carried out a preliminary study of the usability of the daycare site in the City of Oulu's electronic OmaOulu portal and of its suitability for its users' needs. The features of the old and the new daycare site were investigated through individual interviews with children's parents. The everyday life of the daycare was explored by shadowing one daycare employee, and usage situations related to the site were collected by means of contextual interviews. The study sought feedback on the usability of the new daycare site relative to the old one. Background knowledge of the daycare's operations, the usage situations, and the user interface problems revealed by the inquiries help in planning further usability studies of the daycare site. The thesis presents some observation and interview methods that can be used to identify usage situations and user needs. Shadowing and the contextual interview represent different observation methods. Shadowing a user refers to a situation in which the researcher learns about the user's work by following their actions. In a contextual interview, the researcher and the user explore the user's work together through discussion. Interviews can be conducted as questionnaires, ethnographic interviews, or focus groups. Questionnaires are form-based interviews that respondents answer independently. An ethnographic interview, in turn, is an oral individual interview carried out in the interviewee's work environment. In focus groups, several people belonging to the target group participate in the interview at the same time. The reliability of the results was affected by the fact that during the contextual interviews the users did not perform enough genuine work tasks. The interview situation also did not correspond to a genuine work situation, because in reality the staff use the daycare site only for short periods at a time: a day with the children is packed and intensive. For these reasons the contextual interview may not have been the method best suited to the situation. The interview results show that the majority of the daycare staff and slightly over half of the parents have a somewhat accepting attitude towards the new daycare site. It is, however, difficult to assess whether the staff's workload has increased because of maintaining the site. Since most of the features aimed at parents involve viewing the content of the daycare site, such as monthly bulletins and the event calendar, the possibility of sending the necessary information by email should be investigated. Some of the functions parents need, such as reporting absences, will still have to be handled electronically in the future. Going forward, more shadowing studies and interviews should be carried out to see what benefits using the site brings to its users.
  • Wallenius, Otto (2017)
    Lempel-Ziv coding is a string compression method in which a string is represented by replacing repeatedly occurring substrings with pointers to an earlier occurrence of the substring. Lempel-Ziv coding has been studied extensively, and it is used in several compression programs such as gzip, 7-zip and Zstandard. This thesis presents different ways of representing the pointers, surveying the related literature and code implementations. The use of distance-repeat and distance-repeat-difference symbols as a complementary representation was studied experimentally. It was found to reduce the entropy of the distance alphabet and to slightly improve the compression ratio of Lempel-Ziv coding. The improvement in compression ratio obtained with the Lempel-Ziv coder implemented for the experiments varied considerably between input strings, being at most about one percentage point. ACM CCS 2012: Information systems ~ Data compression
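    To show what the pointers discussed above look like, the following is a naive greedy LZ77-style parser producing literal and (distance, length) match tokens; it is only an illustrative baseline and does not implement the distance-repeat symbols studied in the thesis.

        def lz77_parse(data: bytes, window: int = 4096, min_len: int = 3):
            """Greedy LZ77 parse: emit ('literal', byte) or ('match', distance, length).
            Matches are restricted to non-overlapping copies for simplicity."""
            i, out = 0, []
            while i < len(data):
                best_len, best_dist = 0, 0
                start = max(0, i - window)
                for j in range(start, i):
                    length = 0
                    while (j + length < i
                           and i + length < len(data)
                           and data[j + length] == data[i + length]):
                        length += 1
                    if length > best_len:
                        best_len, best_dist = length, i - j
                if best_len >= min_len:
                    out.append(("match", best_dist, best_len))
                    i += best_len
                else:
                    out.append(("literal", data[i]))
                    i += 1
            return out

        print(lz77_parse(b"abcabcabcabx"))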
  • Hou, Jian (2014)
    Pichia pastoris and Saccharomyces cerevisiae are two fungi that are important in both research and industrial applications of protein production and genetic engineering due to their inherent abilities. For example, S. cerevisiae can produce important proteins from a wide range of sugars, from ligno-cellulose to methanol. Accurate genome-scale metabolic networks (GMNs) of the two fungi can improve biotechnological production efficiency, drug discovery and cancer research. Comparing the metabolic networks of the fungi provides a new way to study the evolutionary relationship between them. There are two basic steps in modeling metabolic networks. The first step is to construct a draft model from an existing model or with software tools such as the Pathway Tools software and InterProScan. The second step is model simulation in order to construct a gapless metabolic network. There are two main methods for genome-wide metabolic network reconstruction: constraint-based methods and graph-theoretical pathway finding methods. Constraint-based methods use linear equations to simulate growth of the model under different constraints. Graph-theoretical pathway finding methods use a graph-based approach to construct a gapless model, so that each metabolite can be obtained either from the nutrients or from the products of other gapless reactions. In this thesis, a new method designed by Pitkänen [PJH+14], CoReCo, is used to reconstruct the metabolic networks of Pichia pastoris and Saccharomyces cerevisiae. Five experiments were designed to evaluate the accuracy of the CoReCo method. The first experiment analyzed the quality of the GMNs of Pichia pastoris and Saccharomyces cerevisiae by comparing them with the existing models. The second and third experiments tested the stability of the networks CoReCo constructed under random mutation and random deletion of the protein sequences, simulating noisy input data. The next two experiments considered different numbers of phylogenetic neighbors in the phylogenetic tree. The last experiment tested the effect of the two main parameters (the acceptance and rejection thresholds) used when CoReCo fills the reaction gaps in its final step.
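    The gaplessness criterion described above (every metabolite must be reachable from the nutrients through reactions whose own substrates are reachable) can be sketched as a simple fixed-point computation. The reactions and metabolite names below are made-up placeholders, and this is not the CoReCo algorithm itself.

        def reachable_metabolites(nutrients, reactions):
            """Iteratively fire reactions whose substrates are all reachable,
            adding their products, until a fixed point is reached."""
            reachable = set(nutrients)
            changed = True
            while changed:
                changed = False
                for substrates, products in reactions:
                    if set(substrates) <= reachable and not set(products) <= reachable:
                        reachable |= set(products)
                        changed = True
            return reachable

        # Made-up toy network; the metabolite names are placeholders.
        reactions = [
            ({"glucose"}, {"g6p"}),
            ({"g6p"}, {"pyruvate"}),
            ({"pyruvate", "o2"}, {"co2"}),
        ]
        print(reachable_metabolites({"glucose"}, reactions))
        # {'glucose', 'g6p', 'pyruvate'} -- co2 needs o2, which is not supplied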
  • Xiao, Han (2016)
    We study the problem of detecting the top-k events from digital interaction records (e.g., emails, tweets). We first introduce the interaction meta-graph, which connects associated interactions. Then, we define an event to be a subset of interactions that (i) are topically and temporally close and (ii) correspond to a tree capturing information flow. Finding the best event leads to a variant of the prize-collecting Steiner tree problem, for which three methods are proposed. Finding the top-k events maps to the maximum k-coverage problem. Evaluation on real datasets shows that our methods detect meaningful events.
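    The reduction to maximum k-coverage mentioned above admits the classic greedy approximation: repeatedly pick the candidate event that covers the most not-yet-covered interactions. The sketch below illustrates that general technique on made-up data and is not the exact procedure of the thesis.

        def greedy_k_coverage(candidates, k):
            """Pick up to k sets maximizing the number of covered elements;
            the greedy rule gives the classic (1 - 1/e) approximation."""
            covered, chosen = set(), []
            for _ in range(k):
                best = max(candidates, key=lambda s: len(s - covered), default=None)
                if best is None or not best - covered:
                    break
                chosen.append(best)
                covered |= best
            return chosen, covered

        # Made-up candidate events, each a set of interaction ids.
        events = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {7, 8}]
        chosen, covered = greedy_k_coverage(events, k=2)
        print(chosen, covered)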
  • Laster, Zachary Howell (2014)
    Artificial agents are commonly used in games to simulate human opponents. This allows players to enjoy games without requiring them to play online or with other players locally. Basic approaches tend to suffer from being unable to adapt their strategies and often perform tasks in ways very few human players could ever achieve. This detracts from the immersion or realism of the gameplay. In order to achieve more human-like play, more advanced approaches are employed to either adapt to the player's ability level or to make the agent play more like a human player can or would. Utilizing artificial neural networks evolved using the NEAT methodology, we attempt to produce agents to play an FPS-style game. The goal is to see if the approach produces well-playing agents with potentially human-like behaviors. We provide a large number of sensors and motors to the neural networks of a small population learning through co-evolution. Ultimately we find that the approach has limitations and is generally too slow for practical application, but it holds promise for future developments. Many extensions are presented which could improve the results and reduce training times. The agents learned to perform some basic tasks at a very rough level of skill, but were not competitive at even a beginner level.
  • Sorkhei, Amin (2016)
    With the fast-growing number of scientific papers produced every year, browsing through the scientific literature can be a difficult task: formulating a precise query is often not possible if one is a novice in a given research field, and different terms are often used to describe the same concept. To tackle some of these issues, we build a system based on topic models for browsing the arXiv repository. Through visualizing the relationships between keyphrases, documents and authors, the system allows the user to explore the document search space better than traditional systems based solely on query search. In this paper, we describe the design principles and the functionality supported by this system, as well as report on a short user study.
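    As a hedged illustration of the topic-modelling backbone of such a browser, the sketch below fits a small LDA model with scikit-learn on a handful of made-up abstracts and prints the top words per topic. The real system indexes the arXiv repository and relates keyphrases, documents and authors, which this toy example does not attempt.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.decomposition import LatentDirichletAllocation

        # Made-up stand-ins for paper abstracts.
        docs = [
            "neural networks for image classification and deep learning",
            "convex optimization and gradient descent convergence analysis",
            "deep learning models trained with stochastic gradient descent",
            "graph algorithms for shortest paths and network flows",
        ]

        vectorizer = CountVectorizer(stop_words="english")
        X = vectorizer.fit_transform(docs)

        lda = LatentDirichletAllocation(n_components=2, random_state=0)
        lda.fit(X)

        terms = vectorizer.get_feature_names_out()
        for t, weights in enumerate(lda.components_):
            top = weights.argsort()[::-1][:5]
            print(f"topic {t}:", ", ".join(terms[i] for i in top))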