Browsing by Subject "big data"

  • Perola, Eero (2023)
    Driving speeds regardless of vehicle type are a part of almost everyone’s daily lives. The subject has been widely studied and many algorithms for determining optimal routes exist. A novel data source for this type of research is GPS-collected Floating Car Data. As positioning-enabled devices have become increasingly abundant, the collection of huge amounts of data with locations, speeds and directions has become vastly more common. In this master’s thesis, I examine a big data set of car speeds within the Helsinki area from three different viewpoints. First, I examine the driving patterns described by the distribution of data on different kinds of roads and time periods. Second, I focus on one variable, intersection density, and determine the effect it has on the change in speed and whether it is possible to conduct statistical analysis for the data. Last, I analyze the steps needed in order to fully utilize the variables of the data within the road network system. The results indicate that while there are clear differences in changing speed within road classes, the differences are not as clearly described by road class as they are by speed limit. Also, time of day has a clear effect where times of congestion are distinguishable. While among all road classes the mean driven speed is below the speed limit, on larger roads the mode is above the speed limit. I prove that it is possible to find numerous variables that depict speed change through novel Floating Car Data. Focusing on intersection density, the result is that at highest, within the Helsinki area, intersection density represents around eight per cent of change in speed compared to speed limit. As a final result, a method to viably use linear Floating Car Data to research intersection density and its effects is developed. As an intermediate step and a side result, a workflow of modifying road network layers into segments between intersections is produced.
  • Karhu, Teemu (2020)
    Finland is regarded primarily as a nature tourism destination. However, the importance of nature's attractiveness varies between studies as well as between nationalities and individuals. Attraction-specific tourism demand has been studied through interview studies, among other methods, but traditional research methods have not been able to determine how the demand for and supply of attractions meet spatially. New research methods based on big data enable an entirely new kind of research. Log data generated by the use of mobile devices form a data source from which mobile device users can be traced in both time and space. Mobile devices are a potential data source for tourism research, particularly for revealing tourists' routes and preferences. Tourism experiences give people pleasure and a feeling of satisfaction. In tourism, experience is seen as a creator of value. According to the theory of value co-creation, the value of a commodity is the use value the customer derives from it. Value creation is influenced by the customer's motivation, which in tourism is comparable to a person's personal needs and manifests as interest in a destination. Choosing a destination based on one's own interests facilitates value creation. How do tourists' actual routes and attraction factors meet? Can route choices show that people travel according to their own interests? The study analyses the travel routes used by foreign tourists in Finland in relation to tourism attraction factors. The classification of attraction factors is based on the study of the regional structure of Finnish tourism. Visit Finland's tourist segmentation brings out tourists' interests. The tourists' routes are based on mobile network data from DNA Oyj. Based on the analysis, tourists' routes coincide poorly with the destinations of greatest natural attractiveness, mainly because tourism is city-centred. 
The match between routes and the other attraction classes is better than for natural attractiveness. Based on the results, it is worth considering whether tourism marketing succeeds in communicating and targeting its message correctly, and whether the message is understood correctly. A weak match between personal wishes and the travel that actually took place indicates weak value creation and thereby a low likelihood of recommending Finland as a destination or of travelling to Finland again.
  • Massinen, Samuli (2019)
    The Greater Region of Luxembourg is the largest cross-border labor market in the European Union with the greatest number of cross-border workers in the area. European integration, the Schengen Area, and socio-economic divergences have been the main factors facilitating human cross-border movements in the area and thus the birth and expansion of the borderland community. Despite the freedom of movement, country borders have not been erased and socio-economic divergences have not been levelled. In addition, the spatial extent of the daily movements is not well known. Thus, it is important to study cross-border dynamics and try to separate daily movements from infrequent mobility patterns. Thus far, cross-border mobility studies have mainly leaned on national registers and census data. These datasets have mostly been too scarce in trying to understand the complexities of cross-border mobility. Many studies have only focused on aggregate-level movement patterns, and the viewpoint of individuals has been missing. Hence, there has been a growing need for individual-level data to be applied in cross-border mobility research. In this study, a person-based approach is employed using geotagged Twitter Big Data to study spatio-temporal cross-border mobility patterns in the Greater Region of Luxembourg. The aim is to examine how to implement social media in cross-border research as well as how to separate daily cross-border movers from infrequent border crossers and consequently move beyond aggregate-level inspections. As this is one of the first studies of its kind, a heuristic programmatic approach is used. To the writer’s knowledge, social media data sources have not been applied previously to distinguish different cross-border mobility types. 
All scripts developed in this study are openly available on the Digital Geography Lab’s GitHub pages (https://github.com/DigitalGeographyLab/cross-border-mobilitytwitter) to promote open science and to introduce new quantitative method tools for cross-border mobility research. The results show that social media can be implemented in cross-border mobility research, and social media Big Data can provide a relatively good proxy for daily cross-border mobility of people on a regional level. Aggregate-level cross-border mobility patterns and activity location densities correspond closely with previous studies, and outcomes from temporal variation inspections indicate a valid cross-border mover type identification; Twitter users classified as daily cross-border movers seem to be more mobile on weekdays, whereas infrequent border crossers are more mobile on weekends. Daily cross-border mobility patterns also provided new information about the spatial extent of the movements. In addition, the heuristic approach resulted in high accuracy in home detection; the “unique weeks” algorithm introduced in this study produced an accuracy of 88.6 % with respect to the ground truth. Although the results are promising on a regional level, they should be considered in relation to population densities and Twitter use activity, attributes that both vary spatio-temporally and thus can cause bias. Further studies and method development are also needed to draw global conclusions about cross-border mobility; other geographical areas and study settings could result in varied outcomes. In addition, some solutions with data and methods should be considered with a critical stance due to scarcity of valid references. Yet, this study has identified that the coverage of geotagged Twitter data is dependent on data acquisition processes and that Twitter can provide valuable information for cross-border mobility research. 
In future studies, multi-level data acquisition processes are recommended together with a person-based approach that combines spatio-temporal and content analysis methodologies.
  • Laaksonen, Iivari (2022)
    Multi-local living is a complex social phenomenon that is tightly connected to human mobility. In previous research, the phenomenon has been mainly researched with official statistics that fail to capture the dynamic nature of people’s mobilities and dwelling. This thesis approaches multi-locality in Finland and in the county South Savo from the perspective of second homes with novel data sources like mobile phone data and electricity consumption data. These spatially and temporally accurate big data sources can be used to ensure sufficient coverage of population and geographic area. I approach multi-local living by analyzing the spatiotemporal changes in people’s presence with mobile phone data, and by examining how the changes relate to second homes in different areas separately for workdays and weekends. This is examined both for the whole country and by comparing different counties. In the thesis, mobile phone data is utilized as the ground truth to assess the performance of household occupancy detection methods for electricity consumption, and to examine how electricity consumption data captures the spatiotemporal dynamics of second home users in South Savo. The results indicate that people are generally more mobile during the summer, and the seasonal growth in people’s presence correlates strongly with second homes. This shows a prominent seasonal effect for multi-local living in Finland. Additionally, it is shown that the results vary spatially as there is variation in the results both between counties and within South Savo. The best performing second home occupancy detection method is revealed by correlation analyses between mobile phone data and electricity consumption data. Moreover, it is shown that electricity data correlates better with mobile phone data during the summer, and that the data captures the monthly dynamics of second home users well. This further highlights the seasonal effect of multi-local living. 
The thesis provides valuable insight into how the seasonal variation of population in different areas is connected to multi-local living in Finland. Furthermore, it is shown that novel data sources can capture the changes in people’s presence at multiple spatial levels with high temporal accuracy, and that they can be utilized to study multi-local living.
  • Zubair, Maria (2022)
    The growing popularity of the Internet of Things (IoT) has massively increased the volume of data available for analysis. This data can be used to get detailed and precise insights about users, products, and organizations. Traditionally, organizations collect and process this data separately, which is a slow process and requires significant resources. Over the past decade, data sharing has become a popular trend, where several organizations have engaged in sharing their collected data with other organizations and processing it together for analysis. Digital marketplaces are developed to facilitate this data sharing. These marketplaces connect producers and consumers of data while ensuring that the data can be shared inside and outside the organization seamlessly and securely. This is achieved by implementing a fine-grained and efficient data access control method that restricts access to the data to authorized parties only. The data generated by IoT devices is voluminous, continuous, and heterogeneous. Therefore, traditional access control methods are no longer suitable for managing access to this data in a digital marketplace. IoT data requires an access control model that can handle large volumes of streaming data and gives IoT device owners full control over, and transparency into, access to their data. In this thesis, we have designed and implemented a novel access control mechanism for a data distribution system developed by Nokia Bell Labs. We have outlined the requirements for designing an access control system to manage data access for data shared across multiple heterogeneous organizations. We have evaluated the proposed system to assess the feasibility and performance of the system in various scenarios. The thesis also discusses the strengths and limitations of the proposed system and highlights future research perspectives in this domain. 
We expect this thesis to be helpful for researchers studying IoT data processing, access control methods for streaming (big) data, and digital marketplaces.
  • Hästbacka, Matti (2023)
    The direct economic impacts of the global tourism industry account for 4 % of global GDP and 8 % of global greenhouse gas emissions. The industry is in transformation caused by climate change, political instability and rapid technological development. In addition, the relationship between biodiversity conservation and tourism as well as the growing popularity are considered megatrends impacting the sector. Traditional mass tourism destinations, such as the Canary Islands, may start seeing new kinds of visitors, if traveling to exotic destinations becomes difficult as a result of these transformations. Understanding transformations affecting tourism requires information about tourists’ mobilities, interests and preferences. However, traditional data collection methods may not necessarily be suited for studying quickly changing tourism. The need for information about visitations to natural and protected areas is especially high, as traditional tourism indicators, such as flights and accommodation statistics, do not tell where the tourists spend time. Social media data may enable the production of new kinds of knowledge and the study of nature-based tourism in a new way. In this thesis, I intend to assess the role of nature in tourism in the Canary Islands, Spain, using data from the photo-sharing platform Flickr. First, I compare the spatiotemporal patterns of Flickr data against official data about tourism flows to confirm the feasibility of Flickr as a data source in the Canary Islands context. I then try to understand the importance of nature visitations and differences in nature visitation patterns between visitors from different countries. Finally, I turn to analyse the contents of the images to see what kinds of nature-related topics are important for each group, making use of deep learning and cluster detection algorithms. I verify the results of my empirical analysis with data collected through interviewing experts familiar with Canary Islands tourism. 
The results of my research show that Flickr reflects Canary Islands tourism patterns moderately well, and that it can be used to produce information about differences in nature visitation patterns. Protected areas are shown to be important and central for Canary Islands tourism, but differences in interest toward these areas between groups are notable. Results of the content analyses show that while differences between groups exist, both nature-related content and photos of humans are important in content posted from PAs. Verification data collected through expert interviews shows that the observed differences between groups correspond to the experts’ perceptions about differences between different groups. The findings of my thesis demonstrate the importance of nature and protected areas in Canary Islands tourism and confirm earlier knowledge about the use of Flickr in studying nature visitations. The results may inform future research in the Canary Islands. More broadly, they provide information about the feasibility and limitations of the use of social media data for nature-based tourism research.