
Browsing by Title


  • Aaltonen, Milla (2019)
    The Jurassic (182 Ma) Karoo flood basalt province shows great variety in geochemistry. The complexity is thought to be inherited from distinct mantle sources. The Luenha River exposure in the northern parts of Mozambique includes primitive picrites possibly representing the still undefined parental magma type for the North Karoo lavas. Previously determined whole-rock data revealed chondritic to very radiogenic 87Sr/86Sr ratios and nearly chondritic eNd values. The diverse 87Sr/86Sr ratios can result from processes such as subsolidus alteration, contamination, magma mixing, or source heterogeneities, which complicates assessment of petrogenetic processes. To make a contribution to this, plagioclase phenocrysts from six Luenha samples were used as tracers of magma chamber processes. In situ studies on plagioclase growth zones were performed using crystal isotope stratigraphy (CIS) methods. Cold-cathode cathodoluminescence (CL) microscopy was used to visually reveal zonation, the electron microprobe (EMPA) was utilized for major element content (core to rim), and laser ablation multicollector inductively coupled plasma mass spectrometry (LA-MC-ICP-MS) was used for in situ (87Sr/86Sr)i ratio measurements. The anorthite content of plagioclase cores (n = 65) is An65–90, and core-to-rim variations alternate between normal, oscillatory, and reverse zoning. In situ isotope examination revealed isotopic disequilibrium in (87Sr/86Sr)i between phenocrysts (cores 0.70511–0.70671, n = 10; rims 0.70539–0.70709, n = 11) and bulk groundmass (0.70660–0.71061, n = 12). Plagioclase cores are always less radiogenic than the whole rock (0.70690–0.71019), but internal variation within and between lava flows exists. Core-to-rim microsampling revealed four different (87Sr/86Sr)i evolution paths reflecting heterogeneous crystallization conditions. An open, complex magma plumbing system with progressing contamination is the likely scenario. The relatively radiogenic plagioclase cores compared with the uncontaminated plume-like sample (87Sr/86Sr 0.70410) indicate that contamination was ongoing prior to plagioclase crystallization and continued until eruption. Phenocryst migration between compositionally and thermally distinct reservoirs at crustal depths could explain the heterogeneous plagioclase compositions (An and (87Sr/86Sr)i) of the Luenha picrites.
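    The contamination argument above rests on two-component mixing of Sr isotopes; the standard mixing relation (textbook form, not taken from the thesis) for a melt A and contaminant B with Sr concentrations C_A, C_B, isotope ratios R_A, R_B, and melt mass fraction f is

      R_{mix} = \frac{f\,C_A R_A + (1 - f)\,C_B R_B}{f\,C_A + (1 - f)\,C_B},

    so plagioclase cores crystallizing at different stages of progressive contamination record different (87Sr/86Sr)_i values along this mixing curve.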
  • Aroalho, Sari (2021)
    Africa has recently increased its share of the global market, and the continent's potential has been recognized globally. The continent has experienced much oppression and many forced changes in its history, and it is currently developing a new identity with its relatively young states and fast-growing population. The African Union (AU) calls on pan-African ideology to bring the African people together in its blueprint and master plan, Agenda 2063, in which cultural heritage is at the core. Culture is also at the core of the creative economy, and the creative economy's share of the global economy is growing. Due to globalization and digitalization, knowledge from other cultures is spreading rapidly, which is the basis of a cultural shift at both local and global levels. This research investigated culture and the creative economy as builders of society in Kenya. Kenya has been very successful in the field of Information and Communications Technology (ICT), the state takes its cultural heritage seriously in its development programs, and the focus is especially on the potential of the youth in the creative economy. Kenya has vast cultural diversity, with 44 officially recognized tribes, and this diversity plays a significant role in the creative economy. According to the United Nations Conference on Trade and Development (UNCTAD, 2020), the creative economy has no single definition, as the concept is constantly evolving. The basic elements of the concept are human creativity, ideas, intellectual property, knowledge and technology. The creative industries include music, film, video, arts and crafts, and the performing arts. These elements are the basis of the creative economy, and in addition they have significant commercial and cultural value. The research was conducted in Kenya during January and February 2021, and the data was collected from two main geographical research areas, the city of Nairobi and Taita-Taveta County. The research areas were chosen for their cultural diversity, their creative economies, and their urban and rural statuses. Nairobi is classified as a creative city where the digital creative economy is booming, and the city attracts people from around East Africa. Taita-Taveta, in contrast, is a rural county near the Kenyan coast, where the creative economy exists mainly in traditional forms, for example crafting and basket making. The research combined elements of ethnographical, hermeneutical and critical approaches, using unstructured and structured interviews and observation, and combining qualitative methods with numerical data. The results show that culture and the creative economy do build society in Kenya. This is seen at every level of society, for example among families, tribes, counties and even the government, and each level influences and controls the way culture and the creative economy build society. The meaning of community arose in the shifts of culture and the creative economy, as communities help with mitigation of and adaptation to new situations. With exponential population growth, the share of the youth is rising, and culture and the creative economy have the potential to provide jobs for the youth in the future. There are challenges for culture and the creative economy in Kenya: first, preserving Kenya's cultural diversity among the youth; second, targeting governmental policies at the right actions and towards the right groups, which would support the sector itself. Due to attitude shifts, the role of the youth is a significant point to consider. Furthermore, there is a vast gap between the government and the community, which causes much harm to the creative economy, as the policies do not support the creative sector. If these significant points are solved, there is vast potential for culture and the creative economy to continue building society in Kenya.
  • Siljander, Ilona (2016)
    The purpose of this thesis is to study the cumulative probability of a false-positive (FP) test result during the Finnish 20-year breast cancer screening program. The study is based on breast cancer screening data provided by the Mass Screening Registry of the Finnish Cancer Registry, covering women aged 50–51 years at the time of their first invitation to mammography screening in 1992–1995. Generalized estimating equations (GEE) are used to estimate the cumulative probability of a FP screening result. The theoretical part presents the corresponding theory together with a review of the theory of generalized linear models (GLM). The cumulative probabilities are calculated from the models of individual examinations using the theory and formulas of conditional probability. The confidence intervals (CI) are calculated by Monte Carlo simulation, relying on the asymptotic properties of the GEE estimates. The estimated cumulative risk of at least one FP during the screening program was 15.84% (95% CI: 15.49–16.18%). Previous FP findings increased the risk of another FP result, with an odds ratio (OR) of 1.91 (95% CI: 1.78–2.04) for one previous FP result and an OR of 3.09 (95% CI: 2.49–3.83) for more than one previous FP result. Irregular screening attendance increased the risk of FP results with an OR of 1.46 (95% CI: 1.37–1.56).
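    A minimal sketch of how the cumulative FP risk follows from per-round probabilities via conditional probability (the per-round values and the perturbation scale below are hypothetical stand-ins for quantities estimated from the fitted GEE model):

      import numpy as np

      # Hypothetical per-screen FP probabilities for the ten biennial screens.
      p_round = np.full(10, 0.017)

      # P(at least one FP) = 1 - prod_i (1 - p_i), by conditional probability.
      p_cum = 1.0 - np.prod(1.0 - p_round)
      print(f"Cumulative FP risk: {p_cum:.2%}")  # ~15.8% with these inputs

      # Monte Carlo CI sketch: perturb the per-round probabilities, standing in
      # for sampling GEE coefficients from their asymptotic normal distribution.
      rng = np.random.default_rng(0)
      draws = rng.normal(p_round, 0.002, size=(10_000, 10)).clip(0.0, 1.0)
      sims = 1.0 - np.prod(1.0 - draws, axis=1)
      lo, hi = np.percentile(sims, [2.5, 97.5])
      print(f"95% CI: {lo:.2%} - {hi:.2%}")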
  • Tuomola, Laura (2021)
    Cumulonimbus (Cb) clouds form a serious threat to aviation, as they can produce severe weather hazards. It is therefore important to detect Cb clouds as well as possible. The Finnish Meteorological Institute (FMI) provides aeronautical meteorological services in Finland, including the METeorological Aerodrome Report (METAR), which describes the weather at an aerodrome and its vicinity. Significant weather is reported in METARs, and therefore Cb clouds must be included in them. At Helsinki-Vantaa, METARs are produced manually by a human observer. Cb detection can at times be difficult, for example when it is dark, and it is expensive to keep human observers working around the clock all year round. Automating Cb detection is therefore a topical matter. FMI is applying an algorithm that uses weather radar observations to detect Cb clouds. This thesis studies how well the algorithm detects Cb clouds compared to manual observations. The dataset contains the summer months (June, July and August) from 2016 to 2020. Various verification scores are calculated to analyse the results; in addition, daytime and night-time differences are calculated, and the different years and months are compared. The results show that the algorithm is not adequate to replace human observers at Helsinki-Vantaa. The algorithm could, however, be improved, for instance by adding satellite observations to increase detection accuracy.
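    A hedged sketch of the kind of verification scores such a comparison typically uses (the function and variable names are illustrative; the thesis does not specify which scores beyond "various"):

      import numpy as np

      def verification_scores(algo_cb, human_cb):
          """Contingency-table scores for algorithm Cb flags vs. manual METARs.

          Both arguments are boolean arrays over matched observation times.
          """
          hits = int(np.sum(algo_cb & human_cb))
          false_alarms = int(np.sum(algo_cb & ~human_cb))
          misses = int(np.sum(~algo_cb & human_cb))
          return {
              "POD": hits / (hits + misses),                # probability of detection
              "FAR": false_alarms / (hits + false_alarms),  # false alarm ratio
              "CSI": hits / (hits + misses + false_alarms), # critical success index
          }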
  • Rasooli Mavini, Zinat (2014)
    Massive improvements in public cloud services provide many opportunities for online users. One of the most valuable services of this virtual place is the infrastructure for storing data in distributed storages. Public cloud storages let different organizations and enterprises use highly available data, in a cost-efficient way, with a lowered maintenance burden. However, utilizing the large-scale capacity of (public) cloud storages is not yet a mature trend among individual customers, businesses, and organizations. From a trust and privacy perspective, cloud storages are still unreliable places for sensitive and confidential information or back-up copies. Hence, some public and private organizations, universities, as well as ordinary citizens avoid uploading their critical files to the cloud. The thesis suggests the idea of customer-oriented data storages as a solution to the shortcomings of public cloud storages. This idea is a new way to customize cloud storages that involves the customer more closely in management aspects, addressing the current distrust in cloud-based storages, and it would encourage different types of customers. Furthermore, the thesis evaluates the feasibility of the proposed customer-oriented cloud storage architecture with scenarios inspired by the Architecture Tradeoff Analysis Method (ATAM) evaluation approach. Results of the evaluative discussion on how the proposed solution boosts trust in cloud storages and provides more control for cloud storage customers are presented.
  • Haatanen, Henri (2022)
    In the modern era, using personalization when reaching out to potential or current customers is essential for businesses to compete in their field. With large customer bases, personalization becomes more difficult, so segmenting entire customer bases into smaller groups helps businesses focus on personalization and targeted business decisions. These groups can be straightforward, such as segments based solely on age, or more complex, taking into account geographic, demographic, behavioral, and psychographic differences among the customers. In the latter case, customer segmentation should be performed with machine learning, which can help find hidden patterns within the data. Often the number of features in the customer data set is so large that some form of dimensionality reduction is needed. That is also the case in this thesis, which includes 12802 unique article tags that should be included in the segmentation. A form of dimensionality reduction called feature hashing is selected for hashing the tags, because it can accommodate new tags introduced in the future. Using hashed features in customer segmentation is a balancing act: with more hashed features, the evaluation metrics may give better results and the hashed features resemble the unhashed article tag data more closely, but with fewer hashed features the clustering process is faster and more memory-efficient, and the resulting clusters are more interpretable to the business. Three clustering algorithms, K-means, DBSCAN, and BIRCH, are tested with eight feature hashing bin sizes each, with promising results for K-means and BIRCH.
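    A minimal sketch of the hashing-then-clustering pipeline described above, using scikit-learn (the tags, bin size, and cluster count are placeholders; the thesis tests eight bin sizes per algorithm):

      from sklearn.feature_extraction import FeatureHasher
      from sklearn.cluster import KMeans

      # Hypothetical customers, each represented by the article tags they read.
      customers = [["politics", "economy"], ["sports", "hockey"], ["economy", "markets"]]

      # Hash the 12802-tag vocabulary into a fixed number of bins; tags introduced
      # later map into the same space without refitting, the property noted above.
      hasher = FeatureHasher(n_features=256, input_type="string")
      X = hasher.transform(customers)

      # Cluster the hashed features; BIRCH or DBSCAN would slot in the same way.
      labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)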
  • Hippeläinen, Sampo (2022)
    One of the problems with the modern widespread use of cloud services pertains to geographical location. Modern services often employ location-dependent content, in some cases even data that should not end up outside a certain geographical region. A cloud service provider may, however, have reasons to move services to other locations. An application running in a cloud environment should have a way to verify the location of both itself and its data. This thesis describes a new solution to this problem, employing a permanently deployed hardware device that provides geolocation data to other computers in the same local network. A protocol suite with which applications can check their geolocation is developed using the methodology of design science research. The protocol suite uses many tried-and-true cryptographic protocols. A secure connection is established between an application server and the geolocation device, during which the authenticity of the device is verified. The location of data is ensured by checking that a storage server indeed has access to the data. Geographical proximity is checked by measuring round-trip times and setting limits for them. The new solution, with its protocol suite and hardware, is shown to solve the problem and fulfill strict requirements, improving on the results presented in earlier work. A prototype is implemented, showing that the protocol suite is feasible both in theory and in practice. Details will, however, require further research.
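    A rough sketch of the round-trip-time proximity check, under the assumption (mine, not the thesis's) that a plain TCP connect is what gets timed; the real protocol suite would run its measurements over the authenticated channel and use the minimum of many samples:

      import socket
      import time

      def rtt_within_limit(host: str, port: int, limit_s: float = 0.005) -> bool:
          """Measure one TCP connect round trip and compare it to a limit."""
          start = time.perf_counter()
          with socket.create_connection((host, port), timeout=1.0):
              rtt = time.perf_counter() - start
          return rtt <= limit_s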
  • Lehto, Susanna (2015)
    The Dagum distribution is a continuous probability distribution named after Camilo Dagum, who introduced it in the 1970s. Its development began when Camilo Dagum, dissatisfied with the existing probability distributions, started developing a model that met his requirements. This work resulted in three distributions, known as Dagum distribution types I–III. Type I is a three-parameter distribution, while types II and III are closely related four-parameter distributions. Regardless of type, the Dagum distribution was developed to describe personal income, and it is therefore usually associated with the economics of income distribution. In addition, the three types of the Dagum distribution can be classified as statistical size distributions, which are often used especially in economics and actuarial mathematics. Chapter 1 is an introduction that outlines the structure of this master's thesis and the reasons why the Dagum distribution was chosen as its topic. Chapter 2 briefly presents the general theory of continuous probability distributions to the extent that it is needed, and introduces notation that is important for understanding Chapter 3. Chapter 3 begins with the personal history of Camilo Dagum, the developer of the distribution, which leads naturally to the reasons that motivated him to search for a better model and eventually resulted in an entirely new distribution, or family of distributions. The Dagum distribution was not conjured out of thin air, however: it rests on Dagum's broad expertise and on the study and testing of several different distributions and models. Although the Dagum distribution and its types form a distribution of their own, it also has close connections to other distributions, and for this reason it is often called the Burr III distribution. Chapter 3 also covers the basic properties of the Dagum distribution, after which attention turns to its usefulness in applications: the distribution proves useful in measuring income inequality, where estimation and inference also play an important role. The chapter ends with a brief treatment of working with the Dagum distribution in statistical software. Although Chapter 3 refers to applications of the Dagum distribution at many points, its practical application is examined more closely only in Chapter 4. The final chapter collects overall thoughts on the Dagum distribution and the challenges of getting acquainted with it: a single master's thesis can only scratch the surface, so there is plenty of work left for others interested in the distribution.
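    For reference, the three-parameter Dagum Type I distribution mentioned above has the standard cumulative distribution function and density (with shape parameters a, p > 0 and scale b > 0, for x > 0):

      F(x) = \left(1 + (x/b)^{-a}\right)^{-p}, \qquad
      f(x) = \frac{a\,p}{b}\,(x/b)^{-a-1}\left(1 + (x/b)^{-a}\right)^{-p-1}.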
  • Fred, Hilla (2022)
    Improving the monitoring of the health and well-being of dairy cows through computer vision based systems is a topic of ongoing research. A reliable and low-cost method for identifying individual cows would enable automatic detection of stress, sickness or injury, and would make daily observation of the animals easier. Neural networks have been used successfully for identifying individual cows, but methods are needed that do not require incessant annotation work to generate training datasets whenever a group changes. Methods for person re-identification and tracking have been researched extensively, with the aim of generalizing beyond the training set, and they have been found suitable also for re-identifying and tracking previously unseen dairy cows in video frames. In this thesis, a metric-learning based re-identification model pre-trained on an existing cow dataset is compared to a similar model trained on new video data recorded at the Luke Maaninka research farm in spring 2021, containing 24 individually labelled cows. The models are evaluated in a tracking context as appearance descriptors in a Kalman filter based tracking algorithm. The test data is video footage from a separate enclosure in Maaninka with a group of 24 previously unseen cows. In addition, a simple procedure is proposed for automatically labeling cow identities in images, based on RFID data collected from cow ear tags and feeding stations together with the known feeding station locations.
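    A small sketch of how a re-identification embedding can act as an appearance descriptor inside a Kalman-filter tracker (illustrative only; the association logic of the thesis's tracking algorithm is not reproduced here):

      import numpy as np

      def appearance_cost(track_embs: np.ndarray, det_emb: np.ndarray) -> float:
          """Cosine distance from a detection's embedding to a track's gallery.

          The minimum distance to the track's stored embeddings can gate or
          weight the motion-based (Kalman) association cost.
          """
          track = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
          det = det_emb / np.linalg.norm(det_emb)
          return float(1.0 - np.max(track @ det))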
  • Sassi, Sebastian (2019)
    When the standard model gauge group SU(3) × SU(2) × U(1) is extended with an extra U(1) symmetry, the resulting Abelian U(1) × U(1) symmetry introduces a new kinetic mixing term into the Lagrangian. Such double U(1) symmetries appear in various extensions of the standard model and have therefore long been of interest in theoretical physics. Recently this kinetic mixing has received attention as a model for dark matter. In this thesis, a systematic review of kinetic mixing and its physical implications is given, some of the dark matter candidates relying on kinetic mixing are considered, and experimental bounds for kinetic mixing dark matter are discussed. In particular, the process of diagonalizing the kinetic and mass terms of the Lagrangian with a suitable basis choice is discussed. A rotational ambiguity arises in the basis choice when both U(1) fields are massless, and it is shown how this can be addressed. BBN bounds for a model with a fermion in the dark sector are also given, based on the most recent value of the effective number of neutrino species, and it is found that a significant portion of the FIMP regime is excluded by this constraint.
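    The kinetic mixing term discussed above has the standard Lagrangian form for two Abelian field strengths F_{\mu\nu} and F'_{\mu\nu} with mixing parameter \epsilon:

      \mathcal{L} \supset -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu} - \tfrac{1}{4} F'_{\mu\nu} F'^{\mu\nu} - \tfrac{\epsilon}{2} F_{\mu\nu} F'^{\mu\nu},

    and diagonalizing the kinetic terms by a (non-orthogonal) field redefinition is the basis-choice step the abstract refers to.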
  • Hakkarainen, Janne (University of Helsinki, 2009)
    Data assimilation is a technique in which observations are combined with dynamical numerical models in order to produce an optimal representation of, for example, the evolving state of the atmosphere. Data assimilation is used, among other things, in operational weather forecasting. This thesis presents different data assimilation methods, which divide broadly into Kalman filters and variational methods. In addition, various tools needed in data assimilation, such as optimization methods, are presented. The behaviour of the different data assimilation methods is illustrated with examples. In this thesis, data assimilation is applied, among other things, to the Lorenz95 model. As a practical data assimilation problem, ozone retrievals from the GOMOS instrument are assimilated using the ROSE chemistry transport model.
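    As an illustration of the analysis step shared by the Kalman-type methods presented in the thesis, the standard update of a background state x_b with observations y is (textbook form, not specific to this work):

      x_a = x_b + K\,(y - H x_b), \qquad K = P_b H^\top \left(H P_b H^\top + R\right)^{-1},

    where H is the observation operator, P_b the background error covariance, and R the observation error covariance.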
  • Iipponen, Juho (2017)
    No model can perfectly describe the behaviour of the complex and chaotic atmosphere. Model predictions must therefore be corrected towards the true state of the atmosphere with the help of observations. In this work, the description of the total mass density of the thermosphere given by a semi-empirical upper-atmosphere model is refined by means of data assimilation. The model state is corrected using an ensemble Kalman filter, which has proven a useful tool in data assimilation systems for the lower atmosphere. Unlike in the troposphere, however, the uncertainty in predicting the state of the thermosphere is largely related to uncertainty in the forcings that drive it. Sudden and hard-to-predict changes in the ionosphere and in solar UV radiation can rapidly alter the state of the upper atmosphere in a way whose time evolution is largely independent of the initial state of the thermospheric system. It is therefore by no means obvious that data assimilation improves the model analysis or forecast. The aim of this work is to study whether a model corrected with observations yields a more accurate analysis of the mass density of the middle and upper thermosphere than a model whose state is changed only by the forcings applied to the upper-atmosphere system. It is also examined whether the analysis has predictive value with respect to mass density measurements made during the following three days. The study period is the year 2003, when the forcings were strong and their changes rapid. The observational data are produced with an algorithm that computes upper-atmosphere density from observed changes in the orbits of low Earth orbit satellites. Although the temporal resolution of the data is rather poor relative to the speed of the forcing-driven changes, it turns out that they can be used to improve the analysis of the upper-atmosphere model. On the other hand, it turns out that the corrected model state cannot be used to predict the evolution of the system, even if the time evolution of the forcings driving the thermosphere were known accurately. This is thought to be because the correction produced by the analysis depends strongly on the changes in the forcing during the period from which the observations for the analysis are collected. The correction is therefore no longer optimal during the following days, when the state of the atmosphere has changed as a result of the forcings. The improvement brought to the analysis by the ensemble Kalman filter, although statistically significant, is not very large. In addition to the uncertainties related to the forcings and the observational data, it is possible that the filter's performance is degraded by spurious correlations in the model background field, or by the very simple covariance inflation method used in this work.
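    The "very simple covariance inflation method" mentioned above is, in many ensemble systems, multiplicative inflation; a standard textbook form (an assumption here, not a statement of what this thesis used) scales the ensemble perturbations about the mean:

      x_i \leftarrow \bar{x} + \lambda\,(x_i - \bar{x}), \qquad \lambda \ge 1,

    which inflates the sample covariance by \lambda^2 and counteracts the filter's tendency to underestimate spread.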
  • Laine, Maisa (2019)
    Data assimilation is an estimation method that combines information from several different sources. Data assimilation methods are particularly useful when indirect observations are combined with a model state. This thesis focuses on sequential data assimilation methods based on the Kalman filter. The Kalman filter is derived from Bayes' formula, and on that basis ensemble methods, which are often computationally lighter approximations of the Kalman filter, are presented. In the thesis, the Ensemble Adjustment Kalman filter is applied to the Yasso model, which describes the decomposition of soil organic carbon. Yasso is used to model long-term soil carbon on six different fields, and the forecasts are improved through data assimilation by combining measurement information with the predictions.
  • Bui, Minh (2021)
    Background. In API requests to a confidential data system, there are always sets of rules that users must follow to retrieve the desired data within their granted permissions. These rules are made to assure the security of the system and limit all possible violations. Objective. This thesis is about detecting violations of these rules in such systems. Any request in which a violation is found is considered to contain an inconsistency, which must be fixed before any data is retrieved. The thesis also looks for all diagnoses of inconsistent requests; these diagnoses support reconstructing the requests so as to remove the inconsistencies. Method. In this thesis, we choose the design science research methodology to work on solutions. In this methodology, the current problem in distributing data from a smart building serves as the main motivation. System design and development are then carried out to demonstrate the practicality of the solutions found, while a testing system is built to confirm their validity. Results. Inconsistency detection is treated as a diagnostic problem, and many algorithms have been developed to solve diagnostic problems over the decades. These algorithms build on DAG algorithms and have been adapted for different purposes. This thesis builds on these algorithms and constraint programming techniques to resolve the issues facing the given confidential data system. Conclusions. A combination of constraint programming techniques and DAG algorithms for diagnostic problems can be used to detect inconsistencies in API requests. Although the performance of applying these algorithms still needs improvement, the combination works effectively and can resolve the research problem.
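    A brute-force sketch of the diagnosis formulation (minimal hitting sets over conflict sets); practical systems use HS-DAG/HS-tree style algorithms as the thesis notes, and the rule names below are hypothetical:

      from itertools import combinations

      def minimal_diagnoses(conflicts, universe):
          """Enumerate minimal hitting sets of the conflict sets (= diagnoses)."""
          diagnoses = []
          for size in range(len(universe) + 1):
              for candidate in map(set, combinations(sorted(universe), size)):
                  if any(candidate >= d for d in diagnoses):
                      continue  # a subset is already a diagnosis, so not minimal
                  if all(candidate & c for c in conflicts):
                      diagnoses.append(candidate)
          return diagnoses

      # Two conflicts among rules r1..r3 yield diagnoses {r2} and {r1, r3}.
      print(minimal_diagnoses([{"r1", "r2"}, {"r2", "r3"}], {"r1", "r2", "r3"}))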
  • Hinkka, Atte (2018)
    In this thesis we use statistical n-gram language models and the perplexity measure for language typology tasks. We interpret the perplexity of a language model as a distance measure when the model is applied to a phonetic transcript of a language the model wasn't originally trained on. We use these distance measures for detecting language families, detecting closely related languages, and for reproducing language family trees. We also study the sample sizes required to train the language models and estimate how large the corpora need to be for the successful use of these methods. We find that trigram language models trained on automatically transcribed phonetic transcripts, together with the perplexity measure, can be used both for detecting language families and for detecting closely related languages.
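    A self-contained sketch of the distance measure: train a character-trigram model with add-one smoothing on one transcript and measure perplexity on another (the toy strings stand in for phonetic transcripts):

      import math
      from collections import Counter

      def trigram_perplexity(train: str, test: str) -> float:
          """Perplexity of an add-one-smoothed character-trigram model on test."""
          tri = Counter(train[i:i + 3] for i in range(len(train) - 2))
          bi = Counter(train[i:i + 2] for i in range(len(train) - 1))
          vocab = len(set(train)) or 1
          n = len(test) - 2
          log_prob = sum(
              math.log((tri[test[i:i + 3]] + 1) / (bi[test[i:i + 2]] + vocab))
              for i in range(n)
          )
          return math.exp(-log_prob / n)

      # Lower perplexity = closer language; compare against models of other languages.
      print(trigram_perplexity("hello world hello", "hello word"))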
  • Ray, Debarshi (2012)
    Pervasive longitudinal studies in people's intimate surroundings involve gathering data about how people behave in their various places of presence. It is hard to be fully pervasive, as doing so has traditionally required sophisticated instrumentation that may be difficult to acquire and prohibitively expensive. Moreover, setting up such an experiment is laborious. We present a system, in the form of its requirements, design and implementation, that is primarily aimed at collecting data from people's homes. It aims to be as pervasive as possible, and can collect data about a family in the form of audio and video feeds from microphones and cameras, network logs, and home appliance (e.g., TV) usage patterns. The data is then transported over the Internet to a server placed in the close proximity of the researcher, while being protected from unauthorised access. Instead of instrumenting the test subjects' existing devices, we build our own integrated appliance, which is placed inside their houses and has all the necessary features for data collection and transportation. We build the system using cheap off-the-shelf commodity hardware and free and open source software, and evaluate different hardware and software configurations to see how well they can be integrated and how performant and reliable they are in real-life scenarios. Finally, we demonstrate a few simple techniques that can be used to analyze the data to gain insights into the behaviour of the participants.
  • Tulilaulu, Aurora (2017)
    In my master's thesis I present data-driven automatic composition, or data musicalization. Data musicalization is about making variables found in data audible in automatically composed music. The intention is that the music works like a visualization for the ears, illustrating selected attributes of the data. In the thesis I survey different ways in which sonification and automatic or computer-assisted composition have been done before, and what kinds of applications they have. I go through the most commonly used ways of generating music, such as the most typical stochastic methods, grammars, and methods based on machine learning. I also briefly describe sonification, i.e. mapping data directly to an audio signal without a musical element, and I comment on the strengths and weaknesses of the different methods. I also briefly discuss how far automated composition and its credibility in the eyes of human evaluators have come at their furthest, using a few acclaimed composing programs as examples. I discuss two different musicalization programs that I have built. The first generates pieces that condense data collected from one night of the user's sleep into a piece lasting four to eight minutes. The second makes music in real time based on adjustable parameters, so it can be connected to another program that analyses data and changes the parameters. In the example discussed, the music is produced from a chat log, and for instance the tone and speed of the conversation affect the music. I go through the principles by which my programs generate music, and I justify the design decisions using the basics of music theory and composition. I explain the principles by which the underlying data is, or can be made, audible in the music, that is, how musicalization differs from ordinary machine composition and from sonification, and how it settles on the boundary of these two existing fields of research. Finally, I present the results of user experiments in which users were asked to evaluate how well the musicalization of chat logs works, and, based on these results and the current state of the field, I consider possible applications of musicalization and possible future research on the topic.
  • Jurinec, Fran (2023)
    This thesis explores the applicability of open-source tools to addressing the challenges of data-driven fusion research. The issue is approached through a survey of the fusion data ecosystem and an exploration of possible data architectures, which were used to derive the goals and requirements of a proof-of-concept data platform. This platform, developed using open-source software, namely InvenioRDM and Apache Airflow, enabled transforming existing machine learning (ML) workloads into reusable data-generating workflows, and the cataloging of the resulting clean ML datasets. Through the survey of the fusion data ecosystem, a set of challenges and goals was established for the development of a fusion data platform. It was identified that many of the challenges for data-driven research stem from a heterogeneous and geographically scattered source data layer combined with a monolithic approach to ML research. These challenges could be alleviated through improved ML infrastructure, for which two approaches were identified: a query-based approach, which offers more data retrieval flexibility but requires improvements in querying functionality and source data access speeds, and a persisted dataset approach, which uses a centralized workflow to collect and clean data but requires additional storage resources. Additionally, by cataloging metadata in a central location it would be possible to combine data discovery across heterogeneous sources, combining the benefits of the various infrastructure developments. Building on these goals and the metadata-driven platform architecture, a proof-of-concept data platform was implemented and examined through a case study. The implementation used InvenioRDM as a metadata catalog to index ML-ready datasets and provide a dashboard for discovering them, and Apache Airflow as a workflow orchestration platform to manage the data collection workflows. The case study, grounded in real-world fusion ML research, showcased the platform's ability to convert existing ML workloads into reusable data-generating workflows and to publish clean ML datasets without introducing significant complexity into the research workflows.
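    A minimal sketch of the workflow side as a three-task Apache Airflow DAG (Airflow 2.4+ style API; the task names and callables are invented, and the thesis's actual pipelines and InvenioRDM API calls are not reproduced here):

      from datetime import datetime

      from airflow import DAG
      from airflow.operators.python import PythonOperator

      def collect(): ...          # pull shot data from heterogeneous sources
      def clean(): ...            # produce the clean, ML-ready dataset
      def publish_metadata(): ... # register the dataset in the metadata catalog

      with DAG(dag_id="ml_dataset_pipeline", start_date=datetime(2023, 1, 1),
               schedule=None, catchup=False) as dag:
          t1 = PythonOperator(task_id="collect", python_callable=collect)
          t2 = PythonOperator(task_id="clean", python_callable=clean)
          t3 = PythonOperator(task_id="publish", python_callable=publish_metadata)
          t1 >> t2 >> t3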
  • Ahonen, Heikki (2020)
    The research group dLearn.Helsinki has created software for defining the work life competence skills of a person working as part of a group. The software is a research tool for developing these skills, and its users can be of any age, from school children to employees in a company. As the users can be of different age groups, the data privacy of the different groups has to be considered from different aspects. Children are more vulnerable than adults and may not understand all the risks imposed towards them. Thus, in the European Union, the General Data Protection Regulation (GDPR) determines that the privacy and data of children are more strongly protected, and this has to be taken into account when designing software which uses such data. For dLearn.Helsinki this caused changes not only in the data handling of children, but also of other users. To tackle this problem, existing and future use cases needed to be planned and possibly implemented. Another solution was to implement different versions of the software in which the organizations would be separate. One option would be determining organizational differences within the existing SaaS solution; the other would be creating on-premise versions, where organizations would be locked in according to the customer type. This thesis introduces these use cases, as well as installation options for both SaaS and on-premise versions. With these, broader views of data privacy and the different approaches are investigated, and it can be concluded that no matter the approach, the data privacy of children will always prove a challenge.
  • Koskimaa, Kuutti (2020)
    AA Sakatti Mining Oy is researching the possibility of conducting mining operations in the Sakatti ore deposit, located partially under the protected Viiankiaapa mire. In order to understand the waters in the mining development site, the interactions of surface waters, shallow aquifers, and deep bedrock groundwaters must be understood. To estimate these interactions, hydrogeochemical characterization was used together with four tracer methods: tritium/helium, dichlorodifluoromethane and sulfur hexafluoride, stable isotopes of hydrogen and oxygen, and carbon-14. Most of the shallow groundwater samples are similar to natural precipitation and groundwater in their chemical composition, being of the calcium bicarbonate type. B-11-17HYD013 was an exception, containing much more Cl and SO4. The samples from the deep borehole 17MOS8193 all show a composition very typical for this type of borehole, on the line between the saline sodium sulphate and sodium chloride water types. The samples from 12MOS8102, as well as the river water samples and the Rytikuru spring sample, lie between these two end members. The hydrogen and oxygen isotope values divided the samples into two distinct groups: those that show an evaporation signal in the source water, and those that do not. The most likely source of the evaporated signal in the groundwaters is the surface water pools in the Viiankiaapa mire, which have infiltrated into the groundwater and followed the known groundwater flow gradient into the observation wells near the River Kitinen. Tritium showed no inclusion of recently recharged water in the deep 17MOS8193, and dated most of the shallow wells with screens below the bedrock surface to have been recharged in the 1970s and 1980s. B-10-17HYD017 had an older apparent age from 1955, and B-14-17HYD006 was curiously dated to have been recharged in 2018. 14C gave an apparent age of over 30 000 a for the deep 17MOS8193; the slight 14C content could be caused by slight contamination during sampling, meaning the age is a minimum. The sample M-4-12MOS8102 got an apparent age of ~3 500 a, which could in turn be an overestimate due to ancient carbon dissolving from the local bedrock fractures. CFC-12 showed apparent recharge dates from 1963 to 1975 in the shallow wells and no recently recharged water in the deep 17MOS8193, and so was generally in line with the 14C and tritium results, although some contamination had occurred. SF6 concentrations exceeded the concentrations possible in light of the other results, most likely due to underground generation, and the method was dismissed. By trace element composition, all samples from the deep 17MOS8193 are distinct from the other samples and showed slight dilution in the concentrations of most elements over the span of the test pumping. The other samples are more mixed and difficult to interpret, but some trends and connections are visible, such as higher contents in wells with screens below the bedrock surface than in those with screens above it, and the exceptionally high contents of many elements in B-13-17HYD004. Overall, the study benefited from the large array of methods, showing no interaction between the deep bedrock groundwaters and the shallow groundwaters or surface waters. The evaporated signal from the Viiankiaapa was clearly visible in the samples close to the River Kitinen.
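    The "apparent ages" above come from radioactive-decay dating; the generic relation (standard form, not specific to this study) for a tracer with half-life t_{1/2}, initial activity A_0 and measured activity A is

      t = \frac{t_{1/2}}{\ln 2}\,\ln\!\left(\frac{A_0}{A}\right),

    applied with the appropriate half-life for tritium (about 12.3 a) and for 14C (about 5730 a).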