Browsing by department "Tietojenkäsittelytieteen laitos"

3D-animaation toteutus peliohjelmoinnissa

Tuulos, Natalia (2014)

Animation techniques offer efficient methods for animating characters and rigid bodies. This thesis describes the most common 3D animation techniques without overlooking the techniques that preceded them, introduces the graphics pipeline and the skeletal model, and reviews matters related to the management, creation, processing, and storage of game assets. The thesis also examines the structure of an animation system from a software architecture perspective and presents the Unreal and Unity game engines and their animation systems. The constructive part of the thesis describes example applications implemented in both game engines and the experiences gained from implementing them, and gives a comparative analysis based on those experiences.
3GPP LTE Release 9 and 10 requirement analysis to physical layer UE testing

Johansson, Tomi (2013)

The purpose of this thesis was to analyze the testing requirements for physical layer features used in the LTE Release 9 and 10 timeframe. The aim of the analysis was to define test case requirements for new features from the physical layer point of view. This analysis can then be used to design and implement test cases using commercial eNB simulators. The analysis was carried out by studying the 3GPP specifications and by investigating the integration and system level testing requirements. Different feature-specific parameters were evaluated and different testing aspects were studied in order to verify the functionality and performance of the UE. Different conformance test case scenarios and field testing aspects were also investigated in order to improve test case planning in the integration and system testing phase. The analysis showed that in Rel-9 two main features have a great impact on physical layer testing: dual-layer beamforming and UE positioning, which is done with the OTDOA and E-CID methods. The requirements for downlink dual-layer beamforming were found to focus on the TDD side, and the test plan must in particular contain throughput performance testing in the integration and system testing phase. The OTDOA and E-CID methods, on the other hand, need test plans that concentrate on positioning accuracy. For Rel-10, the analysis showed that there are plenty of new physical layer features to ensure the transition from LTE to LTE-Advanced. The main requirements were assigned to the carrier aggregation (CA) feature, which has testing activities especially on the UE feedback operations. Different kinds of CA deployment scenarios were also analyzed to evaluate more closely the initial CA testing scenarios in integration and system testing. The analysis continued with downlink multi-layer beamforming, where the requirements were seen to concentrate on new CSI-RS aspects and on throughput performance testing. 
Uplink MIMO aspects were analyzed last, and the studies showed that this feature may have a minor role in the Rel-10 timeframe and therefore has no important testing requirements that should be taken into account in test plans.
A botnet survey

Sairanen, Samuli (2013)

The term 'botnet' has surfaced in the media in recent years, highlighting the problems that arise when massive numbers of computers are manipulated. These zombie computers provide a platform for many illegal or otherwise shady activities, such as spam mailing and denial-of-service attacks. The power of such networks is noticeable on a global scale, but what is really known about them? Why do they exist in the first place? How are botnets created today, and how do they work on a technical level? How is the emergence of mobile internet computing affecting botnet creation, and what kind of possibilities does it offer? The goal of this work is to illustrate the history of botnets and to understand their structure, how they are built, and how they affect internet culture as a whole. Methods for fighting the threat of botnets are also considered.
Accountable De-anonymization in V2X Communication

Silvennoinen, Aku (2018)

De-anonymization is an important requirement in real-world V2X systems (e.g., to enable effective law enforcement). In de-anonymization, a pseudonymous identity is linked to a long-term identity in a process known as pseudonym resolution. For de-anonymization to be acceptable from political, social and legislative points of view, it has to be accountable. A system is accountable if no action by it or using it can be taken without some entity being responsible for the action. Being responsible for an action means that the responsible entity cannot afterwards deny its responsibility for, or relation to, the action. The main research question is: How can we achieve accountable pseudonym resolution in V2X communication systems? One possible answer is to develop an accountable de-anonymization service that is compatible with existing V2X pseudonym schemes. Accountability can be achieved by making some entities accountable for the de-anonymization. This thesis proposes a system design that enables i) fine-grained pseudonym resolution; ii) the possibility to inform the subject of the resolution after a suitable time delay; and iii) the possibility for the public to audit the aggregate number of pseudonym resolutions. A trusted execution environment (TEE) is used to ensure these accountability properties. The security properties of this design are verified using symbolic protocol analysis.
A Comparative Study on Large-scale Multi-label Text Classification of Social Media

Huang, Biyun (2018)

Text classification, also known as text categorization, is the task of classifying documents into predefined sets. With the growth of social networks, the volume of unstructured text being generated is increasing exponentially. Because of its limited length, extreme imbalance, high dimensionality, and multi-label nature, social media text needs special processing before being fed to machine learning classifiers. Many statistical, machine learning, and natural language processing approaches address the problem, and two trends in machine learning algorithms represent the state of the art. One is large-scale linear classification, which deals with large sparse data, especially short social media text; the other is deep learning, currently an active area, which takes advantage of word order. This thesis provides an end-to-end solution for large-scale, multi-label and extremely imbalanced text data, compares the two trends, and discusses the effect of balanced learning. The results show that deep learning does not necessarily work well in this context. Well-designed large linear classifiers can achieve the best scores. Also, when the data is large enough, the simpler classifiers may perform better.
A comparison of two SPLE tools : Pure::Variants and Clafer tools

Oksanen, Miika (2018)

In software product line engineering (SPLE), parts of the developed software are made variable so that a whole range of software products can be built at the same time. This is widely known to have a number of potential benefits, such as cost savings when the product line is large enough. However, managing variability in software introduces challenges that are not well addressed by the tools used in conventional software engineering, so specialized tools are needed. Research questions: 1) What are the most important requirements for SPLE tools for a small-to-medium-sized organisation aiming to experiment with SPLE? 2) How well are those requirements met in two specific SPLE tools, Pure::Variants and Clafer tools? 3) How do the studied tools compare with each other in their suitability for the chosen context (a digital board game platform)? 4) How can common requirements for SPL tools be generalized so that they apply to both graphical and text-based tools? A list of requirements is first obtained from the literature and then used as the basis for an experiment in which support for each requirement is tried out with both tools. Then part of an example product line is developed with both tools, and the experiences are reported. Both tools were found to support the list of requirements quite well, although there were some usability problems and not everything could be tested because of technical issues. Based on developing the example, both tools were found to have their own strengths and weaknesses, probably resulting in part from one being GUI-based and the other textual. ACM Computing Classification System (CCS): (1) CCS → Software and its engineering → Software creation and management → Software development techniques → Reusability → Software product lines (2) CCS → Software and its engineering → Software notations and tools → Software configuration management and version control systems
A Cuckoo Search Algorithm for Bayesian Network Structure Learning

Mubarok, Mohamad Syahrul (2017)

A Bayesian network (BN) is a graphical model that applies probability and Bayes' rule for inference. A BN consists of a structure, which is a directed acyclic graph (DAG), and parameters. The structure can be learned from data. Finding an optimal BN structure is an NP-hard problem. If an ordering is given, the problem becomes simpler. An ordering is the order of the variables (nodes) used for building the structure. One structure learning algorithm that takes a variable ordering as input is the K2 algorithm. The ordering determines the quality of the resulting network. In this work, we apply the Cuckoo Search (CS) algorithm to find a good node ordering. Each node ordering is evaluated by the K2 algorithm. Cuckoo Search is a nature-inspired metaheuristic algorithm that mimics the aggressive breeding behavior of cuckoo birds with several simplifications. It has outperformed Genetic Algorithms and Particle Swarm Optimization in finding optimal solutions to continuous problems, e.g., the Michalewicz, Rosenbrock, Schwefel, Ackley, Rastrigin, and Griewank functions. We conducted experiments on 35 datasets to compare the performance of Cuckoo Search with that of GOBNILP, a Bayesian network learning algorithm based on integer linear programming that is widely used as a benchmark. We compared the quality of the obtained structures and the running times. In general, CS can find good networks, although not all of the obtained networks are the best. However, it sometimes finds only low-scoring networks, and its running times are not always very fast. The results mostly show that GOBNILP is consistently faster and finds networks of better quality than CS. Based on the experimental results, we conclude that the approach cannot guarantee an optimal Bayesian network structure. 
Other heuristic search algorithms that we have not compared with our work may be better suited to learning Bayesian network structures, for example the ordering-search algorithm by Teyssier and Koller [41], which combines greedy local hill-climbing with random restarts, a tabu list, caching of computations, and a heuristic pruning procedure.
A Distributed Publish/Subscribe Architecture for Telecommunications Network Management

Havukainen, Heikki (2015)

Managing a telecommunications network requires collecting and processing a large amount of data from the base stations. The current method used by the infrastructure providers is hierarchical and it has significant performance problems. As the amount of traffic within telecommunications networks is expected to continue increasing rapidly in the foreseeable future, these performance problems will become more and more severe. This thesis outlines a distributed publish/subscribe solution that is designed to replace the current method used by the infrastructure providers. In this thesis, we propose an intermediate layer between the base stations and the network management applications which will be built on top of Apache Kafka. The solution will be qualitatively evaluated from different aspects. ACM Computing Classification System (CCS): Networks -> Network management Networks -> Network architectures
Advances in Streamlining Software Delivery on the Web and its Relations to Embedded Systems

Hirvikoski, Kasper (2015)

Software delivery has evolved notably over the years, starting from plan-driven methodologies and lately moving to principles and practices shaped by Agile and Lean ideologies. The emphasis has moved from thoroughly documenting software requirements to a more people-oriented approach of building software in collaboration with users and experimenting with different approaches. Customers are directly integrated into the process. Users cannot always identify software needs before interacting with actual implementations. Building software is not only about building products in the right way, but also about building the right products. Developers need to experiment with different approaches, directly and indirectly. Not only do users value practical software, but the development process must also emphasise the quality of the product or service. Development processes have formed to support these ideologies. To enable a short feedback cycle, features are deployed to production often. Software is primarily delivered through a pipeline consisting of three stages: development, staging and production. Developers develop features by writing code, verify them by writing related tests, interact with and test the software in a production-like 'staging' environment, and finally deploy the features to production. Many practices have formed to support this deployment pipeline, notably Continuous Integration, Deployment and Experimentation. These practices focus on improving the flow of how software is developed, tested, deployed and experimented with. The Internet has provided a thriving environment for using new practices. Due to the distributed nature of the web, features can be deployed without any interaction from users. Users might not even notice the change. Obviously, there are other environments where many of these practices are much harder to achieve. 
Embedded systems, which have a dedicated function within a larger mechanical or electrical system, require hardware to accompany the software. Related processes and environments have their limitations. Hardware development can only be iterative to a certain degree. Producing hardware takes up-front design and time. Experimentation is more expensive. Many stringent contexts require processes with assurances and transparency, usually provided by documentation and long testing phases. In this thesis, I explore how advances in streamlining software delivery on the web have influenced the development of embedded systems. I conducted six interviews with people working on embedded systems to get their views and prompt discussion about the development of embedded systems. Though many concerns and obstacles are presented, the field is struggling with the same issues that Agile and Lean development are trying to resolve. Plan-driven approaches are still used, but distinct features of iterative development can be observed. On the leading edge, organisations are actively working on streamlining software and hardware delivery for embedded systems. Many of the advances are based on how Agile and Lean development are being used for user-focused software, particularly on the web.
A feasible telemedicine monitoring solution

Niu, Yimeng (2016)

Health is the basis of our lives, and at times we need to visit doctors or hospitals. In doing so, patients may face inequalities, for example because of their distance from healthcare resources. With the development of telecommunications and the Internet of Things, telemedicine may help in such cases, saving travel time and cost. This thesis suggests a telemedicine monitoring solution for both hospital-based and personal users. The focus is on the architecture of the system, the role of wireless sensors in telemedicine, and key telemedicine technologies (such as Bluetooth and ZigBee). Further, a software structure for remotely monitoring patients' physiological state at hospital and at home is suggested. This also involves the choice of suitable hardware for data acquisition and wireless transmission. Finally, other related research is discussed. The proposed solution is compared with similar designs from different angles, depending on the focus of the other work, such as processing performance, connectivity, usability, unit price, data security and decision making.
A Feature-Based Call Graph Distance Measure for Program Similarity Analysis

Linkola, Simo (2016)

A measure of how similar (or distant) two computer programs are has a wide range of possible applications. For example, such measures can be applied to malware analysis or to the analysis of university students' programming exercises. However, as programs may be arbitrarily structured, capturing the similarity of two non-trivial programs is a complex task. By extracting call graphs (graphs of caller-callee relationships of the program's functions, where nodes denote functions and directed edges denote function calls) from the programs, the similarity measurement can be turned into a graph problem. Previously, static call graph distance measures have been largely based on graph matching techniques, e.g. graph edit distance or maximum common subgraph, which are known to be costly. We propose a call graph distance measure based on features that preserve some structural information from the call graph without explicitly matching user-defined functions to each other. We define basic properties of the features and several ways to compute the feature values, and give a basic algorithm for generating the features. We evaluate our features using two small datasets: a dataset of malware variants and a dataset of university students' programming exercises, focusing especially on the former. For our evaluation we use experiments in information retrieval and clustering. We compare our results for both datasets to a baseline, and additionally for the malware dataset to the results obtained with a graph edit distance approximation. In our preliminary results we show that even though the feature generation approach is simpler than the graph edit distance approximation, the generated features can perform at a level similar to it. However, experiments on larger datasets are still required to verify the results.
Affektiivisuuden laskennallinen määritteleminen

Turkia, Mika (Helsingin yliopisto / Helsingfors universitet / University of Helsinki, 2007)

One of the most tangled fields of research is that of defining and modeling affective concepts, i.e. concepts regarding emotions and feelings. The subject can be approached from many disciplines. The main problem is the lack of generally accepted definitions. However, linguists, for example, have recently started to check the consistency of their theories with the help of computer simulations. Definitions of affective concepts are needed for performing similar simulations in the behavioral sciences. In this thesis, preliminary computational definitions of affects for a simple utility-maximizing agent are given. The definitions have been produced by synthesizing ideas from theories in several fields of research. The class of affects is defined as a superclass of emotions and feelings. An affect is defined as a process in which a change in an agent's expected utility causes a bodily change. If the process is currently under the attention of the agent (i.e. the agent is conscious of it), the process is a feeling. If it is not, but can in principle be brought under attention (i.e. it is preconscious), the process is an emotion. Thus, affects do not presuppose consciousness, but emotions and feelings do. Affects directed at unexpected materialized (i.e. past) events are delight and fright. Delight is the consequence of an unexpected positive event and fright is the consequence of an unexpected negative event. Affects directed at expected materialized (i.e. past) events are happiness (expected positive event materialized), disappointment (expected positive event did not materialize), sadness (expected negative event materialized) and relief (expected negative event did not materialize). Affects directed at expected unrealized (i.e. future) events are fear and hope. Some other affects can be defined as directed towards the originators of the events. 
The affect classification has also been implemented as a computer program, the purpose of which is to ensure the coherence of the definitions and to illustrate the capabilities of the model. The exact content of the bodily changes associated with specific affects is not considered relevant from the point of view of the logical structure of affective phenomena. Nor does the utility function need to be defined, since only its dynamics are under examination.
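
The classification described in this abstract can be illustrated with a short sketch. It is not the program written for the thesis, which this listing does not show; the names `Affect`, `classify_affect`, `process_kind` and the status labels are hypothetical. The assignment of hope to positive and fear to negative future events is also an assumption, since the abstract names the two affects without saying which is which.

```python
from enum import Enum


class Affect(Enum):
    DELIGHT = "delight"
    FRIGHT = "fright"
    HAPPINESS = "happiness"
    DISAPPOINTMENT = "disappointment"
    SADNESS = "sadness"
    RELIEF = "relief"
    HOPE = "hope"
    FEAR = "fear"


# Hypothetical status labels: the event happened, failed to happen,
# or is still in the future.
STATUSES = {"materialized", "not_materialized", "pending"}


def classify_affect(expected: bool, positive: bool, status: str) -> Affect:
    """Map an event to an affect following the categories in the abstract."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if not expected:
        # The abstract defines unexpected-event affects only for past events.
        if status != "materialized":
            raise ValueError("unexpected events are only classified once materialized")
        return Affect.DELIGHT if positive else Affect.FRIGHT
    if status == "pending":
        # Assumption: hope for positive, fear for negative future events.
        return Affect.HOPE if positive else Affect.FEAR
    if status == "materialized":
        return Affect.HAPPINESS if positive else Affect.SADNESS
    return Affect.DISAPPOINTMENT if positive else Affect.RELIEF


def process_kind(attended: bool, attendable: bool) -> str:
    """Feeling if conscious, emotion if preconscious, otherwise a plain affect."""
    if attended:
        return "feeling"
    if attendable:
        return "emotion"
    return "affect"


print(classify_affect(expected=True, positive=False, status="not_materialized").value)
print(process_kind(attended=False, attendable=True))
```

The first call prints `relief` (an expected negative event that did not materialize) and the second prints `emotion` (a process that is not attended to but could be).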
A Floor Control Server in a Distributed Conference Service

Koponen, Aila Helena (Helsingin yliopisto / Helsingfors universitet / University of Helsinki, 2008)

The conferencing systems in IP Multimedia (IM) networks are undergoing a restructuring that is to be completed in the near future. One of the changes introduced is the concept of floors and floor control in its current form, with matching entity roles. The Binary Floor Control Protocol (BFCP) is a novelty to be exploited in distributed, tightly coupled conferencing services. The protocol defines the floor control server (FCS), which implements floor control and thereby grants access to shared resources. As the latest trend is to distribute conferencing services, the locations of the different functional units play an important role in developing the standards. The standardization bodies have not yet agreed on the location of the floor control server, and the debate continues over where to place it within the media server that provides the conferencing service. The main objective of the thesis is to evaluate two distinct alternatives with respect to the Mp interface protocol between the respective nodes, as the interface is currently under standardization with regard to floor control. The thesis gives a straightforward introduction to the IMS network, the nodes of interest, including the floor control server, and conferencing. Knowledge of several protocols (BFCP, SDP, SIP and H.248) provides important background for understanding the functional changes introduced in the Mp interface, so these protocols and their place in the full picture are also introduced. The analysis of the impact of the floor control server on the Mp reference point is carried out in relation to the two locations, giving basic flows and a requirements analysis, including a limited implementation proposal for supporting protocol parameters. The overall conclusion of the thesis is that although both choices appear useful, neither location is clearly the more suitable in the light of this work. 
The thesis suggests a solution in which both possibilities are available, to be chosen according to circumstances, realized through consistent standardization. It is evident that if the preliminary assumption of the analysis, that there is only one right place for the floor control server, is kept, more work is needed in related areas to discover the single most appropriate location.
Agile Game Development : A Systematic Literature Review

Ruonala, Henna-Riikka (2017)

A systematic literature review was conducted to examine the usage of agile methods in game development. A total of 23 articles were found which were analysed with the help of concept matrices. The results indicate that agile methods are used to varying degrees in game development. Agile methods lead to improved quality of games through a prototyping, playtesting, and feedback loop. Communication and ability of the team to take responsibility are also enhanced. Challenges arise from multidisciplinary teams, management issues, lack of training in agile methods, and quality of code.
Agile Methodologies in Large Scale Information Systems Project Context : A Literature Review and Reflections

Aintila, Eeva Katri Johanna (2016)

The expected benefits of agile methodologies for project success have encouraged organizations to extend agile approaches to areas for which they were not originally intended, such as large-scale information systems projects. Research on agile methods in large-scale software development projects has existed for a few years and is considered a research area of its own. This study investigates agile methods in large-scale software development and information systems projects; its goal is to improve understanding of the suitability of agile methods and of the conditions under which they would most likely contribute to project success. The goal is specified with three research questions: I) what are the characteristics specific to large-scale software engineering projects or large-scale information systems projects, II) what challenges are caused by these characteristics, and III) how do agile methodologies mitigate these challenges? In this study, recent research papers on the subject are investigated, and the characteristics of large-scale projects and the challenges associated with them are identified. Material on the topic was searched for starting from conference publications and distribution sites related to the subject. The collected information is supplemented with an analysis of project characteristics against the SWEBOK knowledge areas. The resulting challenge categories are mapped against the agile practices promoted by the Agile Alliance to determine the impact of the practices on the challenges. The study is not a systematic literature review. As a result, 6 characteristics specific to large-scale software development and IS projects and 10 challenge categories associated with these characteristics are identified. 
The analysis reveals that agile practices enhance team-level performance and provide direct means of managing challenges associated with the high number of changes and the unpredictability of the software process, both characteristic of a large-scale IS project, but challenges still remain at the cross-team and overall project level. The conclusion is that adaptations of current practices and the development of additional practices are needed when seeking an agile process model that would respond to all the characteristics of a large-scale project and thus increase the likelihood of project success. To contribute to this, four areas for adaptations and additional practices are suggested for scaling agile methodologies to large-scale project contexts: 1) adapting practices related to the distribution, assignment, and follow-up of tasks; 2) aligning practices related to the software development process, ways of working, and common principles across all teams; 3) developing additional practices to facilitate collaboration between teams, to ensure interaction with the cross-functional project dimensions, and to strengthen dependency management and decision making between all project dimensions; and 4) possibly developing and aligning practices to facilitate teams' external communication. The results of the study are expected to be useful to software development and IS project practitioners considering agile method adoption or adaptation in a large-scale project context. ACM Computing Classification System (CCS) 2012: - Social and professional topics~Management of computing and information systems - Software and its engineering~Software creation and management
Aivokäyttöliittymät

Kallonen, Susanna (2013)

Brain-computer interface research is a young, interdisciplinary field that aims to develop interfaces operated by the power of thought, both as assistive and rehabilitation devices for people with medical disorders and for entertainment and practical use by healthy people. Brain-computer interfaces enable a new kind of direct communication channel between the human brain and a computer, one that does not depend on the peripheral nervous system or muscles. This thesis surveys research on brain-computer interfaces and examines their application areas and implementation principles. Brain-computer interfaces can already improve the quality of life of people with severe motor disabilities by giving them a way to communicate with their surroundings. With a brain-computer interface they can type on a virtual computer keyboard by thought alone. Use of the technology for moving limb prostheses, steering a wheelchair, alleviating the symptoms of epilepsy, playing computer games, and numerous other practical applications is currently being studied. A brain-computer interface can be based on an invasive measurement technique, in which brain activity is measured from inside the skull, or on a non-invasive technique, in which the measurement is made from outside the scalp. The thesis finds that working brain-computer interfaces can be built with both invasive and non-invasive techniques. Invasive methods are best suited to applications that require good signal accuracy and whose target group is ill or disabled people. Non-invasive methods suit applications for which lower measurement accuracy suffices or which are also used by healthy people. The thesis concludes that brain-computer interfaces are suitable both for healthy people and for people with various medical disorders. 
In addition, the thesis takes a position, based on the research presented, on what kinds of brain-computer interface applications are worth developing. This result is compared with an interview that explores the thoughts of a person belonging to the target group about brain-computer interfaces, their applications, and the requirements placed on them. The interview yields five new, previously unstudied applications: a suction device for clearing the throat, a speech synthesizer that reads written text aloud, a lift with which a person can raise themselves from bed, a toilet seat that performs washing, and a multifunction device that assists with eating and changing position. Two new requirements are also identified: the need to consider ease of use from the assistants' perspective, and the requirement of flexibility, meaning that a single brain-computer interface should be able to perform many different functions. The interview reinforces the view that end users should be involved in the development of brain-computer interfaces and that user-centredness, which has so far not been the starting point of research, should increasingly be adopted as the perspective.
Aktoripohjainen pelimoottoriarkkitehtuuri

Hietasaari, Antti (2016)

The thesis evaluates the suitability of actor-based parallelization solutions for game engines. It first presents the basic principles of game engines and actor-based concurrency and then the actor-based Stage game engine implementation. Finally, the efficiency and ease of use of the Stage engine are examined in comparison with a game engine that uses traditional lock-based parallelization.
Alimerkkijonot suomen sanojen vektoriesitysten tuottamisessa neuroverkoilla

Hyvärinen, Ada-Maaria (2019)

Vector representations of words are used in many machine learning tasks that process natural language, such as classification, information retrieval, and machine translation. They express words in a form a computer can process. A particularly useful way to represent words as vectors is to place them as points in a continuous word space with a few hundred dimensions. In such a model, similar words lie close together in the space, and differences between word vectors capture word analogy relations, provided the vectors were produced by a neural network designed for that purpose. Simply by inspecting such vectors, one can learn something about a word's meaning and form. Traditionally, word vector training has treated the words of the training data as separate strings. For English this often works well. Finnish, in contrast, is highly inflected, so word forms also carry a lot of information. Some of that information is lost if words are trained as entirely separate units. In addition, the model cannot connect two inflected forms of the same word to each other. FastText models address the problems caused by inflection and derivation by exploiting information about the substrings contained in words. The vector representation model is thus trained not only on words but also on the shorter character strings they contain. For this reason, one would expect a fastText model to work well for highly inflected languages such as Finnish. This thesis set out to determine whether the fastText method works well for Finnish. It also examined which parameters give the best model. Various substring lengths and word vector sizes were tried. The quality of a model can be tested with semantic similarity datasets and with word analogy queries. Semantic similarity tests examine whether words with the same meaning are close together in the vector space. The datasets are based on similarity scores assigned by human annotators. 
Word analogy tests check whether the model can find the word missing from an analogy pair by vector arithmetic. Analogy datasets consist of word pairs that stand in a particular analogy relation to each other. Analogies can concern a word's meaning, such as "man and woman", or its form, such as "positive and comparative". For the thesis, two datasets commonly used for English were translated into Finnish: WS353, which measures semantic similarity, and SSWR, which contains word analogies and whose translation is called SSWR-fi. The translations took into account that many words in the datasets do not translate unambiguously into Finnish. Problematic words were removed from the SSWR-fi dataset, whereas for WS353 a separate shortened dataset, WS277, was created alongside it with the problematic words removed. The thesis found that using substrings is beneficial for processing Finnish. According to the semantic similarity tests, model quality improved thanks to substrings. In the word analogy tests, using substrings improved performance on form queries but worsened performance on meaning queries. This is probably because form queries are based on inflection and derivation, whereas in meaning queries word forms hardly matter.
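As a rough illustration of the two ideas in the abstract above (substrings shared between inflected forms, and analogy queries answered by vector arithmetic), the following Python sketch may help. It is not the thesis's code: the function names, the 3 to 6 character n-gram range, the `<` and `>` boundary markers, and the two-dimensional toy vectors are all assumptions made for the example.

```python
import math


def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word.

    '<' and '>' mark the word boundaries so that prefixes and suffixes
    yield different n-grams than word-internal substrings.
    The default lengths are an assumption for this sketch.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' by finding the word nearest to b - a + c.

    The three query words themselves are excluded from the candidates.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Two inflected forms of the Finnish word "talo" (house) share several 3-grams,
# which is the information a purely word-level model cannot use.
shared = set(char_ngrams("talo", 3, 3)) & set(char_ngrams("talossa", 3, 3))
print(sorted(shared))  # ['<ta', 'alo', 'tal']

# Invented 2-D toy vectors; real models use a few hundred dimensions.
toy = {
    "mies": [1.0, 0.0],        # man
    "nainen": [1.0, 1.0],      # woman
    "kuningas": [3.0, 0.0],    # king
    "kuningatar": [3.0, 1.0],  # queen
    "talo": [0.0, 3.0],        # house
}
print(solve_analogy(toy, "mies", "nainen", "kuningas"))  # kuningatar
```

In an actual subword model a word's vector would be built from the vectors of its n-grams, so the shared n-grams above are what lets two forms of the same word end up near each other.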
Älykkäät oppimisjärjestelmät tietojenkäsittelytieteen opetuksessa (Intelligent learning systems in computer science education)

Heinonen, Kenny (2015)

Materials used in teaching are typically static in content. Books and pictures are examples of static teaching materials. Intelligent learning systems have been developed to complement static teaching material, and they have become common in computer science education. The characteristic features of an intelligent learning system include dynamism and interactivity. Thanks to the interactive mechanisms the system offers, the user can communicate with the system and learn the subject it teaches. The content a learning system offers and the feedback it gives vary according to the user's input. Intelligent learning systems fall into several categories, including visualization and simulation systems, assessment and tutoring systems, programming environments, and learning games. This thesis examines what kinds of intelligent learning environments exist and what they are used for. The systems are surveyed by searching for them in articles published at the SIGCSE conference in 2009–2014. The technical implementation of the surveyed systems is examined in terms of a few general characteristics, such as whether they are web-based. Over the five-year period, the systems have not, generally speaking, evolved. Intelligent learning systems are widely used, but they are not replacing traditional classroom teaching; rather, they are intended mostly to support it.
A Method for Modeling Uncertainty in Semantic Web Taxonomies

Holi, Markus (Helsingin yliopisto / University of Helsinki / Helsingfors universitet, 2004)
