Browsing by Issue Date
Now showing items 21-40 of 4261
-
(2024)Sobolev functions generalize the concept of differentiability beyond the classical setting. The spaces of Sobolev functions are fundamental in mathematics and physics, particularly in the study of partial differential equations and functional analysis. This thesis provides an overview of the construction of an extension operator on the space of Sobolev functions on a locally uniform domain. The primary reference is Luke Rogers' work "A Degree-Independent Sobolev Extension Operator". Locally uniform domains satisfy certain geometric properties; for example, they do not have overly thin cusps. However, locally uniform domains can possess highly non-rectifiable boundaries. For instance, the interior of the Koch snowflake is a locally uniform domain with a non-rectifiable boundary. First we will divide the interior points of the complement of our locally uniform domain into dyadic cubes and use a collection of the cubes having certain geometric properties. This collection is called the Whitney decomposition of the locally uniform domain. To extend a Sobolev function to a small cube in the Whitney decomposition, one approach is to use polynomial approximations to the function on a nearby piece of the domain. We will use a polynomial reproducing kernel in order to obtain a degree-independent extension operator. This involves defining the polynomial reproducing kernel on sets of the domain that we call here twisting cones. These sets are not exactly cones, but have some similarity to cones. Although a significant part of Rogers' work deals extensively with proving the existence of a kernel with the desired properties, our focus will remain on the construction of the extension operator, so we will discuss the polynomial reproducing kernel only briefly. The extension operator for small Whitney cubes will be defined as a convolution of the function with the kernel. For large Whitney cubes it is enough to set the extension to 0. Finally, the extension operator will be the smooth sum of the operators defined for each cube. Ultimately, since the domain is locally uniform, the boundary has measure zero and no special definition of the extension is required there. However, it is necessary to verify that the extension "matches" the function correctly at the boundary, essentially that their (k-1)-th derivatives are Lipschitz there. This concludes the construction of a degree-independent extension operator for Sobolev functions on a locally uniform domain.
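For reference, the defining properties of such an extension operator can be stated in standard notation (this is the textbook formulation, quoted only to fix notation; W^{k,p} denotes the usual Sobolev space and C is a generic constant, with no claim here about how it depends on the parameters):

```latex
% Defining properties of a Sobolev extension operator E on a domain
% \Omega \subset \mathbb{R}^n (standard formulation, stated for reference).
E \colon W^{k,p}(\Omega) \to W^{k,p}(\mathbb{R}^n), \qquad
(Eu)\big|_{\Omega} = u, \qquad
\|Eu\|_{W^{k,p}(\mathbb{R}^n)} \le C\,\|u\|_{W^{k,p}(\Omega)}.
```

Here "degree-independent" refers to a single operator E serving every order k, rather than a separate operator being constructed for each k.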
-
(2024)The MOOC Center of the University of Helsinki maintains a learning management system, primarily used in the online courses offered by the Department of Computer Science. The learning management system is being used in an increasing number of courses, leading to a need for additional exercise types. In order to satisfy this need, we plan to use additional teams of developers to create these exercise types. However, we would like to minimize any negative effects that the new exercise types may have on the overall system, specifically regarding stability and security. In this work, we propose a plugin system for creating new exercise types and implement it in a production system used by real students. The system's plugins are deployed as separate services and use sandboxed IFrames for their user interfaces. Communication with the plugins occurs through HTTP requests and message passing. The designed plugin system fulfilled its aims and worked in its production deployment. Notably, it was concluded that it is challenging for plugins to disrupt the host system. This plugin system serves as an example that it is possible to create a plugin system where the plugins are isolated from the host system.
-
(2024)Fog has a significant impact on society by making it difficult for the transportation and aviation industries to operate as planned due to reduced visibility. Studies have estimated that 32% of marine accidents worldwide, and 40% in the Atlantic Ocean, took place during dense sea fog. Forecasting fog accurately, and thus allowing society to function, would therefore help mitigate the financial losses associated with possible accidents and delays. However, forecasting such a complex phenomenon as fog with numerical weather prediction (NWP) models remains difficult for the modelling community. An NWP model typically operates at a resolution of kilometres, while the multiple processes associated with fog (turbulence, cloud droplet microphysics, thermal inversion) have a smaller spatial scale than that. Consequently, some processes need to be simplified and parametrised, which increases uncertainty, or more computational power needs to be allocated to them. One of these NWP models is HARMONIE-AROME, which the Finnish Meteorological Institute develops in collaboration with its European partner institutes. To improve the associated accuracy, a new, more complex and computationally expensive option for processing aerosols in HARMONIE-AROME is presented. This near-real-time (NRT) aerosol option integrates aerosol concentrations from the Copernicus Atmosphere Monitoring Service's (CAMS) NRT forecast into HARMONIE-AROME. The statistical performance of the model's sea fog forecast in the Baltic Sea was studied in a case study using marine observations. The quantitative metric studied was the proportion score. As a result, a forecast using the NRT option showed a slight deterioration in visibility (0.52 versus 0.59), a neutral improvement in cloud base height (0.52 versus 0.51), and a slight deterioration in 2-metre relative humidity (0.73 versus 0.76) forecasts with respect to the reference option. Furthermore, the score in general remained weak against observations for visibility and cloud base height. In addition, based on qualitative analysis, the spatial coverage of the forecasted sea fog in both experiments was similar to that observed by the NWCSAF Cloud Type product. In total, the new aerosol option showed neutral or slightly worse model predictability. However, no strong conclusions should be drawn from this single experiment sample, and more evaluations should be carried out.
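The "proportion score" reported above is presumably the standard proportion-correct verification metric computed from a 2×2 contingency table of forecast versus observed events; a minimal sketch of that textbook metric (the function name and the example numbers are illustrative, not from the thesis):

```python
# Minimal sketch: proportion correct (PC) from a 2x2 contingency table.
# Assumes the abstract's "proportion score" is the standard proportion-correct
# metric; the names and numbers below are illustrative placeholders.
def proportion_correct(hits: int, false_alarms: int,
                       misses: int, correct_negatives: int) -> float:
    """Fraction of event / no-event forecasts that matched the observations."""
    total = hits + false_alarms + misses + correct_negatives
    return (hits + correct_negatives) / total

# Example: a toy verification sample of 100 forecast-observation pairs.
print(proportion_correct(hits=20, false_alarms=15, misses=10, correct_negatives=55))  # 0.75
```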
-
(2024)Because atmospheric CO2 has a long lifetime and its variations are relatively small compared with its background values, measuring its concentration precisely is of great significance. In high-latitude regions, permafrost and boreal forests serve as large carbon reservoirs. Capturing the carbon concentration there helps us understand the process of climate change and provides accurate data for carbon flux models. However, measurements there face significant challenges: sparse observation coverage and low-quality data remain major problems to be solved. In this thesis, we look into these problems using satellite-based OCO-2 XCO2 retrievals in high-latitude regions. XCO2 data acquired above 45°N were used to compare the version updates, to validate the results against ground-based TCCON site data, and to develop a colocation method for boreal areas that tries to tackle the issue caused by slant solar radiation. The comparison of the version 10 and version 9 datasets shows improvements in version 10 in data volume and precision. Yet the changes are not as significant for sites near polar areas. It also reveals that the current advances mainly focus on reducing systematic errors. In the validation against TCCON data, XCO2 from OCO-2 displays lower seasonal fluctuations. The quality filters are shown to be too tight for boreal sites in filtering out lower values, which provides information for new approaches when adjusting the filters. The global distribution of averaged XCO2 reveals that the standard deviation is higher for nadir-mode land observations in mountain areas; this might be lowered with an improved surface pressure correction method. An averaging kernel correction is applied when comparing with TCCON to standardize the sensitivity profiles. It enhances the accuracy of the results and also stresses the significance of the integration scheme. A new colocation method was implemented for better locating TCCON observations at high latitudes but did not return good results; further adjustments to the algorithm and tests in more areas are needed.
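For illustration only: a simple distance-and-time colocation filter of the kind commonly used to pair satellite soundings with a ground station. This is a generic sketch, not the colocation method developed in the thesis; the 100 km / ±1 h thresholds, field names, and the station coordinates are placeholder assumptions.

```python
# Generic colocation sketch: keep OCO-2 soundings within a distance and
# time window of a TCCON station. Not the thesis's method; thresholds,
# dict field names, and the haversine helper are illustrative assumptions.
from math import radians, sin, cos, asin, sqrt
from datetime import datetime, timedelta

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def colocate(soundings, station_lat, station_lon, station_time,
             max_km=100.0, max_dt=timedelta(hours=1)):
    """Return soundings (dicts with lat, lon, time keys) near the station."""
    return [s for s in soundings
            if haversine_km(s["lat"], s["lon"], station_lat, station_lon) <= max_km
            and abs(s["time"] - station_time) <= max_dt]

# Example usage with one toy sounding near a boreal station (coordinates illustrative).
toy = [{"lat": 67.4, "lon": 26.6, "time": datetime(2020, 6, 1, 10, 30)}]
print(colocate(toy, 67.37, 26.63, datetime(2020, 6, 1, 10, 0)))
```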
-
(2024)There are two primary types of quantum computers: quantum annealers and circuit-model computers. Quantum annealers are specifically designed to tackle particular problems, as opposed to circuit-model computers, which can be viewed as universal quantum computers. Substantial efforts are underway to develop quantum-based algorithms for various classical computational problems. The objective of this thesis is to implement algorithms for solving graph problems on quantum annealers and to analyse these implementations. The aim is to contribute to the ongoing development of algorithms tailored for this type of machine. Three distinct types of graph problems were selected: all-pairs shortest path, graph isomorphism, and community detection. These problems were chosen to represent varying levels of computational complexity. The algorithms were tested using the D-Wave quantum annealer Advantage system 4.1, equipped with 5760 qubits. D-Wave provides a cloud platform called Leap and a Python library, Ocean tools, through which quantum algorithms can be designed and run using local simulators or real quantum computers in the cloud. Formulating graph problems to be solved on quantum annealers was relatively straightforward, as the literature already contains implementations of these problems. However, running these algorithms on existing quantum annealer machines proved to be challenging. Even though quantum annealers currently boast thousands of qubits, the algorithms performed satisfactorily only on small graphs. The bottleneck was not the number of qubits but rather the limitations imposed by topology and noise. D-Wave also provides hybrid solvers that utilise both the Quantum Processing Unit (QPU) and CPU to solve problems, which proved to be much more reliable than using a pure quantum solver.
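As a flavour of how such problems are posed for an annealer, below is a minimal sketch of two-community detection written as a QUBO and solved locally with Ocean's dimod reference solver. The toy graph and the brute-force ExactSolver are illustrative assumptions, not the thesis's experiments; on real hardware the same BQM would be submitted through DWaveSampler and an embedding composite from dwave-system.

```python
# Sketch: modularity-based split of a toy graph into two communities,
# expressed as a QUBO and solved with dimod's brute-force ExactSolver.
# Illustrative only -- not the thesis's code.
import itertools
import networkx as nx
import dimod

G = nx.barbell_graph(3, 0)            # two triangles joined by one edge
A = nx.to_numpy_array(G)
k = A.sum(axis=1)                     # node degrees
m = G.number_of_edges()
n = G.number_of_nodes()

# Modularity matrix B_ij = A_ij - k_i k_j / (2m). With binary labels x_i,
# maximising modularity for a two-way split reduces (up to constants) to
# minimising -sum_ij B_ij x_i x_j, which is a QUBO.
Q = {(i, j): -(A[i, j] - k[i] * k[j] / (2 * m))
     for i, j in itertools.product(range(n), repeat=2)}

bqm = dimod.BinaryQuadraticModel.from_qubo(Q)
sampleset = dimod.ExactSolver().sample(bqm)   # exhaustive; feasible only for tiny graphs
best = sampleset.first.sample
print([v for v, s in best.items() if s == 0], [v for v, s in best.items() if s == 1])
```

Flipping all labels gives the same energy, so either labelling of the two triangles may be returned; the ground state corresponds to the modularity-maximising two-way split.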
-
(2024)The Zero Trust security model reforms network security thinking by starting from the assumption that no zone of a network is inherently secure. Consequently, communication connections even inside trusted networks must be examined critically and brought under discretionary, least-privilege access control. The concept of microsegmentation must be understood in relation to network segmentation, i.e. dividing the network into zones. A microsegment is a zone so microscopic that it is only the size of a single host. With microsegmentation, every network connection, including one between two hosts in the same local area network, crosses a zone boundary and is subject to access control. This thesis investigates the following question: if, in accordance with the Zero Trust model, only known connections are allowed from a virtual computer classroom by using a microsegmenting firewall, how much does this reduce local network traffic, and is anything essential blocked at the same time? The theoretical part introduces the fundamentals of data communications, the Defense in Depth and Zero Trust concepts, the operating principles of firewalls, computer virtualization, and ways of implementing data center networks. The empirical part analyses the internal local network traffic of the University of Helsinki's remote desktop environment, which can be thought of as a virtual, remotely accessible computer classroom. The lateral connections originating from the remote desktop environment are analysed quantitatively by comparing traffic volumes in the unicast, multicast, and broadcast classes, and qualitatively by examining the software behind these connections. At the same time, answers are sought to the questions: what is the purpose of these connections, and are they necessary in a virtual computer classroom? The filtering of unnecessary connections is also considered from the perspective of energy savings.
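As an illustration of the quantitative part, here is a minimal sketch that classifies captured Ethernet frames into unicast, multicast, and broadcast by destination MAC address. This is a generic scapy example under the assumption of an Ethernet capture file; "capture.pcap" is a placeholder and this is not the analysis pipeline used in the thesis.

```python
# Minimal sketch: count unicast/multicast/broadcast frames in a capture.
# Generic example, not the thesis's tooling; "capture.pcap" is a placeholder.
from collections import Counter
from scapy.all import rdpcap
from scapy.layers.l2 import Ether

def frame_class(dst_mac: str) -> str:
    if dst_mac.lower() == "ff:ff:ff:ff:ff:ff":
        return "broadcast"
    # IEEE 802: the least significant bit of the first octet marks group (multicast) addresses.
    if int(dst_mac.split(":")[0], 16) & 1:
        return "multicast"
    return "unicast"

counts = Counter(
    frame_class(pkt[Ether].dst)
    for pkt in rdpcap("capture.pcap")
    if Ether in pkt
)
print(counts)
```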
-
(2024)The purpose of this study was to find out which learning materials and teaching tools upper secondary school mathematics teachers use. In addition, the study examined how they use them and what they think of them. The study first presents the differences between learning materials and teaching tools and how they can be used in mathematics teaching. The study was carried out as a survey directed at teachers. The upper secondary schools participating in the survey were selected by systematic random sampling. Among other things, the study found that the digital textbook is more common than the physical textbook in teaching, but the physical textbook is used more than the digital one in lesson planning. The most heavily used teaching tools are the video projector and the computer, but the most widespread teaching tool is the document camera. The rarest teaching tool, in turn, is the chalkboard. The study also found that, as a rule, teachers do not use materials or teaching tools that they are not satisfied with. The survey was answered by 57 teachers from different upper secondary schools around Finland. This study therefore does not give an exhaustive picture of the situation in all upper secondary schools, but it does provide indicative information about which learning materials and teaching tools are used in upper secondary education.
-
(2024)The matriculation examination is a significant and widely influential institution in the Finnish education system. With the digitalization that took place in 2018, the matriculation examination underwent many changes whose effects have not yet been studied very extensively. This thesis continues the topic of an earlier teacher-as-researcher-of-their-own-work thesis and seeks an understanding of the effects of digitalization on the physics matriculation examination. In this thesis, the physics matriculation examinations are examined through the \emph{revised Bloom's taxonomy}. The taxonomy is introduced, and its earlier use in educational research, particularly in the study of final assessment, is reviewed. Once the taxonomy is understood, it is applied to the analysis of the physics matriculation examinations. Using the revised Bloom's taxonomy, a total of twelve physics matriculation examinations from 2015-2018 and 2021-2023 are analysed. Six of the examinations represent the end of the paper-based era, and six are the most recent digital matriculation examinations. The exam questions are analysed and classified into the taxonomy table of the revised Bloom's taxonomy, paying particular attention to the score distribution of each question, which is known from the descriptions of the characteristics of a good answer. This provides information, separately for the paper-based and the digital physics matriculation examinations, about the cognitive skills measured by the questions and the types of knowledge they require. Based on the analysis, it is found that the physics matriculation examinations measure a wide range of levels of cognitive skill and knowledge, with a clear emphasis on questions measuring understanding and applying. In addition, it is found that conceptual knowledge and procedural knowledge carry particular weight. Comparing the results of the paper-based and digital examinations shows that, with digitalization, the share of exam points requiring understanding has decreased by 11.5 percentage points. Other, smaller changes are also observed, but they are not statistically significant. Finally, the thesis assesses the reasons for the changes, in particular the effect of the broader range of tools offered by digitalization on the design of exam questions. On the other hand, shortcomings are identified in the nature of the taxonomy and in the study's experimental arrangement, as a result of which the reliability of the results becomes questionable.
-
(2024)Monoenergetic neutron reference fields are used in neutron metrology for the calibration of different neutron detectors, including dose rate meters. The International Organization for Standardization (ISO) has composed guidelines and requirements for the production of narrow-energy-spread neutron fields using a particle accelerator. The objective of this thesis was to investigate a target material that could be used to produce a monoenergetic neutron field by irradiating it with protons. A broader energy distribution was deemed satisfactory for the initial phase of the station's development, as significant modifications to the beamline would be necessary to acquire more precise beam current values and to achieve proton energies closer to the reaction threshold energy. The target material was chosen to be lithium fluoride (LiF) based on a literature review and Monte Carlo simulations. The simulations were executed with a proton energy of 2.5 MeV, which is close to the threshold energy of the ⁷Li(p,n)⁷Be reaction, and with the fixed energy of 10 MeV of the IBA cyclotron used to conduct the experiment. The simulations were executed with the MCNP6 code, and the results were compared to those obtained from equivalent Geant4 simulations. The simulations suggested two wide peaks, around 3 MeV and 0.6 MeV, at the proton energy of 10 MeV. The irradiation experiment included two phases, one of which entailed the use of a shadow cone to estimate the number of scattered neutrons in the neutron yield. The maximum neutron fluence of (2.62 ± 0.78) ∙ 10⁹ s⁻¹ was measured at a pop-up probe current of (8.3 ± 0.8) µA. Gamma spectrometry was utilized after the experiment to further evaluate the number of ⁷Li(p,n)⁷Be reactions that took place in the target by calculating the number of ⁷Be nuclei in the LiF plate. Altogether, lithium fluoride exhibits promising characteristics as a target material for accelerator-based monoenergetic neutron production, although its application demands further consideration regarding, for instance, the reduction of the proton energy and the aiming and measurement of the proton beam. These results contribute to the future development of a neutron irradiation station at the University of Helsinki.
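For context, the threshold behaviour referred to above follows from the standard two-body kinematics of an endothermic reaction; in the non-relativistic approximation (a textbook relation quoted here for reference, with the commonly cited values for ⁷Li(p,n)⁷Be, not results from this thesis):

```latex
% Non-relativistic threshold energy of an endothermic reaction A(a,b)B
% (textbook kinematics; the numerical values are the commonly quoted ones
% for 7Li(p,n)7Be, not results from this thesis).
E_{\mathrm{th}} \approx -Q \, \frac{m_a + m_A}{m_A},
\qquad Q\bigl(^{7}\mathrm{Li}(p,n)^{7}\mathrm{Be}\bigr) \approx -1.644\ \mathrm{MeV},
\qquad E_{\mathrm{th}} \approx 1.880\ \mathrm{MeV}.
```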
-
(2024)Statistician C. R. Rao made many contributions to multivariate analysis over the span of his career. Some of his earliest contributions continue to be used and built upon almost eighty years later, while his more recent contributions spur new avenues of research. This thesis discusses these contributions, how they helped shape multivariate analysis as we see it today, and what we may learn from reviewing his works. Topics include his extension of linear discriminant analysis, Rao’s perimeter test, Rao’s U statistic, his asymptotic expansion of Wilks’ Λ statistic, canonical factor analysis, functional principal component analysis, redundancy analysis, canonical coordinates, and correspondence analysis. The examination of his works shows that interdisciplinary collaboration and the utilization of real datasets were crucial in almost all of Rao’s impactful contributions.
-
(2024)The Venusian atmosphere has everything it needs to be an exciting natural sulfur laboratory. In addition to relatively high concentrations of sulfur dioxide, suitable conditions in the atmosphere make both thermo- and photochemical reactions possible, allowing for complex chemical reactions and the formation of new sulfur-containing compounds. These compounds could explain or contribute to the enigmatic 320-400 nm absorption feature in the atmosphere. One of the proposed absorbers is polysulfur compounds. While some experimentally obtained UV-VIS spectra have been published, studying the different polysulfur species individually is extremely difficult due to the reactive nature of sulfur. In this thesis, UV-VIS spectra for the polysulfur species S2 to S8 were simulated using the nuclear ensemble approach to determine whether they fit the absorption profile. In total, 38 polysulfur species were considered. All were optimized at the wB97X-D/aug-cc-pV(T+d)Z level of theory, with the S2, S3, and S4 structures also being optimized at the CCSD(T)/aug-cc-pV(T+d)Z level of theory. For 13 structures, UV-VIS spectra were simulated using a nuclear ensemble of 2000 geometries, with vertical excitations calculated at the EOM-CCSD/def2-TZVPD or wB97X-D/def2-TZVPD levels of theory. The simulated UV-VIS spectra for the smaller species were in quite good agreement with experimental ones. Two molecules were identified with substantial absorption cross sections in the range of the unknown absorber: the open-chain isomer of S3 (3.78×10^-17 cm^2 at 370 nm) and the trigonal isomer of S4 (4.76×10^-17 cm^2 at 360 nm). However, the mixing ratios of these species in the Venusian atmosphere are also needed to make a more conclusive statement. Other polysulfur compounds have insignificant absorption cross sections in the 320-400 nm range and can therefore be excluded. The calculated absorption cross sections can be used to calculate photolysis rates, which can be added directly to atmospheric models of Venus. In addition, this work will help future space missions to Venus, for example by focusing their search for the unknown absorber.
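The photolysis-rate step mentioned at the end is a standard integral over wavelength, J = ∫ σ(λ) φ(λ) F(λ) dλ; a minimal numerical sketch is given below. The cross-section array, actinic flux, and unit quantum yield are placeholder assumptions, not values from the thesis.

```python
# Minimal sketch of a photolysis rate J = integral of sigma * phi * flux
# over wavelength. Cross sections, actinic flux, and the unit quantum
# yield below are placeholders, not the thesis's data.
import numpy as np

wavelength_nm = np.linspace(320.0, 400.0, 81)             # 1 nm grid
sigma_cm2 = np.full_like(wavelength_nm, 3.0e-17)          # absorption cross section
quantum_yield = np.ones_like(wavelength_nm)               # assume phi = 1
actinic_flux = np.full_like(wavelength_nm, 1.0e14)        # photons cm^-2 s^-1 nm^-1

integrand = sigma_cm2 * quantum_yield * actinic_flux
# Trapezoidal rule over the wavelength grid; result has units of s^-1.
J = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(wavelength_nm))
print(f"J = {J:.3e} s^-1")
```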
-
(2024)Quantum computers utilize qubits to store and process quantum information. In superconducting quantum computers, qubits are implemented as quantum superconducting resonant circuits. The circuits are operated using only two energy states, which form the computational basis of the qubit. To suppress leakage to non-computational states, superconducting qubits are designed to be anharmonic oscillators, which is achieved using one or more Josephson junctions, a nonlinear superconducting element. One of the main challenges in developing quantum computers is minimizing the decoherence caused by environmental noise. Decoherence is characterized by two coherence times: T1 for depolarization processes and T2 for dephasing. This thesis reviews and investigates the decoherence properties of superconducting qubits. The main goal of the thesis is to analyze the tradeoff between anharmonicity and dephasing in the unimon qubit. The recently developed unimon incorporates a single Josephson junction shunted by a linear inductor and a capacitor. The unimon is tunable by an external magnetic flux, and at the half-flux-quantum bias the Josephson energy is partially canceled by the inductive energy, allowing the unimon to have relatively high anharmonicity while remaining fully protected against low-frequency charge noise. In addition, at the sweet spot with respect to the magnetic flux, the unimon becomes immune to first-order perturbations in the flux. The sweet spot, however, is relatively narrow, making the unimon susceptible to dephasing through quadratic coupling to flux noise. In the first chapter of this thesis, we present a comprehensive look into the basic theory of superconducting qubits, starting with two-state quantum systems, followed by superconductivity and superconducting circuit elements, and finally combining these two by introducing circuit quantum electrodynamics (cQED), a framework for building superconducting qubits. We follow with a theoretical discussion of decoherence in two-state quantum systems, described by the Bloch-Redfield formalism. We continue the discussion by estimating decoherence using perturbation theory, with special care put into the dephasing due to low-frequency 1/f noise. Finally, we review the theoretical model of the unimon, which is used in the numerical analysis. As the main result of this thesis, we suggest a design parameter regime for the unimon which gives the best ratio between anharmonicity and T2.
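For reference, the Bloch-Redfield picture mentioned above relates the two coherence times through the pure-dephasing rate (a standard relation, not a result specific to the unimon):

```latex
% Standard Bloch-Redfield relation between the coherence times:
% depolarization (T_1), pure dephasing (T_\varphi), and total dephasing (T_2).
\frac{1}{T_2} \;=\; \frac{1}{2 T_1} \;+\; \frac{1}{T_\varphi}.
```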
-
(2024)The traditional method for identifying sulfate soils has been the incubation method, which typically takes 9-19 weeks. However, in collaboration, the Finnish Environment Institute (SYKE), Geological Survey of Finland (GTK), and Åbo Akademi developed a faster hydrogen peroxide oxidation method for identifying sulfate soils and assessing acidity potential. This method allows for sulfate soil identification and acidity potential estimation in just a few hours. The hydrogen peroxide oxidation method was used to identify sulfate soils in the Helsinki region and to evaluate the method itself. The study areas included the Sunnuntaipalsta-field area in Malmi, the area associated with the relocation of Gasgrid’s gas pipeline in Pihlajamäki, and the Hermanninranta-Kyläsaari area. Sulfate concentrations determined by the oxidation method were compared with concentrations obtained through water extraction at the Helsinki geophysical, environmental and mineralogical laboratories (Hellabs) of the University of Helsinki's Department of Geology and Geophysics, and acid extraction at ALS Finland Ltd. In Malmi, the method worked well and reliably, indicating naturally acidified soil with relatively low sulfur concentrations. Deeper layers revealed potential acidic sulfate soil materials. In Pihlajamäki, the method was effective, identifying clear potential acidic sulfate soils even with samples consisting of clay fillings. Challenges arose in the Hermanninranta-Kyläsaari area due to contaminated fill soils with high pH values and various hydrocarbons. The lower layers of the samples were rich in organic matter (LOI > 10%), causing the hydrogen peroxide oxidation method to overestimate sulfate concentrations, resulting in deviations with both acid and water extraction results. Based on the results, the hydrogen peroxide oxidation method performs most reliably when loss on ignition (LOI) is < 10% and the pH change (ΔpH) after oxidation is less than 5 units. The method could be a valuable addition to soil investigations conducted by the City of Helsinki's construction services public enterprise, Stara, in their Street and ground laboratory. The method is effective and enables the rapid identification of potential acidic sulfate soils.
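A minimal sketch of the reliability screen implied by the results above, flagging when the peroxide method's estimate falls inside the stated reliable range (the thresholds simply encode the LOI < 10% and ΔpH < 5 conditions quoted here; function, field, and sample names are illustrative):

```python
# Minimal sketch: flag whether a hydrogen peroxide oxidation result is within
# the reliability conditions reported above (LOI < 10 %, delta-pH < 5 units).
# Function, field, and sample names are illustrative, not from the thesis.
def oxidation_result_reliable(loi_percent: float, delta_ph: float) -> bool:
    """True if loss on ignition and pH change fall inside the reliable range."""
    return loi_percent < 10.0 and delta_ph < 5.0

samples = [
    {"id": "sample-A", "loi_percent": 4.2, "delta_ph": 3.1},
    {"id": "sample-B", "loi_percent": 14.8, "delta_ph": 5.6},  # organic-rich fill
]
for s in samples:
    verdict = "reliable" if oxidation_result_reliable(s["loi_percent"], s["delta_ph"]) else "verify with extraction"
    print(s["id"], verdict)
```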
-
(2024)In this master’s thesis, linear zwitterionic poly(ethylene imine) methyl-carboxylates (l-PEI-MCs) were synthesized through a four-step synthesis. The synthesis started with the polymerization of 2-ethyl-2-oxazoline (EtOx) monomers into poly(2-ethyl-2-oxazoline) (PEtOx) homopolymers with polymerization degrees of 50 and 100. Living cationic ring-opening polymerization (LCROP) enabled good control over the molecular weights. Subsequently, the side chains of the PEtOxs were cleaved off by acidic hydrolysis. This resulted in linear poly(ethylene imine)s (l-PEIs) bearing a secondary amine group in the repeating units of the polymer chain. These amine units were then functionalized with methyl-carboxylate moieties by first introducing tert-butyl ester functionalities to the l-PEI chains, and subsequently cleaving off the tert-butyl groups. The final polymer is a polyzwitterion, featuring both an anionic carboxylate and a cationic tertiary amine group within a single repeating unit. The polymers produced in each step were characterized via 1H-NMR and FT-IR spectroscopy, and their thermal properties were analyzed by differential scanning calorimetry (DSC). The molecular weights and dispersities (Ð) of the PEtOx polymers were additionally estimated by gel permeation chromatography (GPC). Via 1H-NMR, the degree of polymerization of the PEtOxs and the degree of hydrolysis of the l-PEIs were determined. FT-IR gave further insight into the structures of the polymers, successfully confirming the ester functionality of the modified l-PEI. The disappearance of the tert-butyl proton signal in the 1H-NMR spectrum after deprotection verified the successful removal of the tert-butyl groups, resulting in the final product with methyl-carboxylate functionalities. By DSC, different thermal transitions, i.e., glass transition (Tg), melting (Tm) and crystallization (Tc), were observed, and the effects of molar mass and polymer modifications on these transitions were investigated. The state-of-the-art section reviews the literature on the synthesis and properties of poly(2-oxazoline)s (POx), poly(ethylene imine)s (PEIs), and polyzwitterions. The theory behind the living cationic ring-opening polymerization of 2-oxazolines and the acidic hydrolysis of POxs is described. Different post-polymerization modification strategies to functionalize PEIs are discussed. In addition, possible applications for each of these polymer classes are briefly outlined.
-
(2024)Sums of log-normally distributed random variables arise in numerous settings in the fields of finance and insurance mathematics, typically to model the value of a portfolio of assets over time. In particular, the use of the log-normal distribution in the popular Black-Scholes model allows future asset prices to exhibit heavy tails whilst still possessing finite moments, making the log-normal distribution an attractive assumption. Despite this, the distribution function of the sum of log-normal random variables cannot be expressed analytically, and has therefore been studied extensively through Monte Carlo methods and asymptotic techniques. The asymptotic behavior of log-normal sums is of especial interest to risk managers who wish to assess how a particular asset or portfolio behaves under market stress. This motivates the study of the asymptotic behavior of the left tail of a log-normal sum, particularly when the components are dependent. In this thesis, we characterize the asymptotic behavior of the left and right tail of a sum of dependent log-normal random variables under the assumption of a Gaussian copula. In the left tail, we derive exact asymptotic expressions for both the distribution function and the density of a log-normal sum. The asymptotic behavior turns out to be closely related to Markowitz mean-variance portfolio theory, which is used to derive the subset of components that contribute to the tail asymptotics of the sum. The asymptotic formulas are then used to derive expressions for expectations conditioned on log-normal sums. These formulas have direct applications in insurance and finance, particularly for the purposes of stress testing. However, we call into question the practical validity of the assumptions required for our asymptotic results, which limits their real-world applicability.
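To fix notation for the setting described above (standard definitions, stated here only for reference; μ, σ, and the correlation matrix Σ are generic symbols, not the thesis's notation):

```latex
% Setting: a sum of dependent log-normal terms. Under the Gaussian copula
% assumption the underlying normals are jointly Gaussian with correlation
% matrix \Sigma, and the left tail as x -> 0 is the object of interest.
X_i = e^{\mu_i + \sigma_i Z_i}, \qquad
(Z_1,\dots,Z_n) \sim \mathcal{N}(0,\Sigma), \qquad
S_n = \sum_{i=1}^{n} X_i, \qquad
\text{left tail: } \; P(S_n \le x), \; x \downarrow 0 .
```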
-
(2024)This thesis discusses short-term parking pricing in the context of Finnish shopping centre parking halls. The focus is on one shopping centre located in Helsinki where parking fees are high and there is a constant need to raise prices. Therefore, it is important to have a strategy that maximises parking hall income without compromising the customers' interest. If the prices are too high, customers will choose to park elsewhere or reduce their parking in private parking halls. There is a lot of competition, with off-street parking competing against on-street parking and access parking, not to mention other parking halls. The main goal of this thesis is to raise problems with parking pricing and to discuss how to find the most beneficial pricing method. To achieve this, this thesis project conducted an analysis of data from one Finnish shopping centre parking hall. The data were analysed to discover the average behaviour of the parkers and how the raised parking fees affect both the parker numbers and the income of the parking hall. In addition, several pricing strategies from the literature and real-life examples were discussed and evaluated, and later combined with the analysis results. The results showed that there are some similarities with the results from the literature, but there were some surprising outcomes too. It seems that higher average hourly prices are correlated with longer stays, yet the parkers who tend to park longer have more inelastic parking habits than those who park for shorter durations. The calculated price elasticity of demand values show that, compared to other parking halls, parking is on average more elastic in the analysed parking hall. This further emphasises the importance of milder price raises, at least for the shorter parking durations. Moreover, there are noticeable but explainable characteristics in parker behaviour. Most of the parkers prefer to park for under one hour to take advantage of the first parking hour being free. This leads to profit losses in both the shopping centre and parking hall income. Therefore, a dynamic pricing strategy is suggested as one pricing option, since it adjusts the prices automatically based on occupancy rates. Although there are some challenges with this particular method, in the long run it could turn out to be the most beneficial for both the parking hall owners and the parkers. To conclude, choosing a suitable pricing strategy and model for a parking hall is crucial, and the decisions should be based on findings from data.
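Since price elasticity of demand is central to the analysis above, here is a short sketch of the midpoint (arc) elasticity calculation, a textbook formula; the example prices and parking-event counts are made up, not the thesis's data.

```python
# Arc (midpoint) price elasticity of demand: a textbook formula,
# illustrated with made-up numbers rather than the thesis's data.
def arc_elasticity(q1: float, q2: float, p1: float, p2: float) -> float:
    """Percentage change in quantity over percentage change in price, midpoint method."""
    dq = (q2 - q1) / ((q1 + q2) / 2)
    dp = (p2 - p1) / ((p1 + p2) / 2)
    return dq / dp

# Example: hourly price raised from 3.0 to 3.6 EUR, parking events drop from 1000 to 900.
e = arc_elasticity(q1=1000, q2=900, p1=3.0, p2=3.6)
print(f"elasticity = {e:.2f}")   # |e| < 1 -> inelastic demand, |e| > 1 -> elastic demand
```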
-
(2024)Machine Learning (ML) has experienced significant growth, fuelled by the surge in big data. Organizations leverage ML techniques to take advantage of the data. So far, the focus has predominantly been on increasing value by developing ML algorithms. Another option would be to optimize resource consumption to reach cost optimality. This thesis contributes to cost optimality by identifying and testing frameworks that enable organizations to make informed decisions about cost-effective cloud infrastructure while designing and developing ML workflows. The two frameworks we introduce to model cost optimality are "Cost Optimal Query Processing in the Cloud" for data pipelines and "PALEO" for ML model training pipelines. The latter focuses on estimating the time needed to train a neural network, while the former is more generic in assessing a cost-optimal cloud setup for query processing. Through the literature review, we show that it is critical to consider both the data and the ML training aspects when designing a cost-optimal ML workflow. Our results indicate that the frameworks provide accurate estimates of the cost-optimal hardware configuration in the cloud for an ML workflow. There are deviations when we dive into the details: our chosen version of the Cost Optimal Model does not consider the impact of larger memory. Also, the frameworks do not provide accurate execution time estimates: PALEO estimates that our accelerated EC2 instance executes the training workload in half the time it actually took. However, the purpose of the study was not to provide accurate execution or cost estimates; rather, we aimed to see whether the frameworks identify the cost-optimal cloud infrastructure setup among the five EC2 instances that we chose to execute our three different workloads.
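A minimal sketch of the kind of back-of-the-envelope comparison such frameworks formalize: estimate runtime from workload size and device throughput, then rank instances by estimated cost. This is a generic illustration with made-up prices, throughputs, and instance names, not the PALEO or Cost Optimal model equations and not real EC2 pricing.

```python
# Generic cost-optimality sketch: rank cloud instances by estimated cost
# of a training workload. Prices, throughputs, and instance names are
# made-up placeholders, not PALEO's model or real EC2 pricing.
WORKLOAD_FLOP = 5.0e17        # assumed total training work

instances = {                  # (sustained TFLOP/s, price USD/hour) -- illustrative
    "cpu.large": (0.5, 0.40),
    "gpu.small": (20.0, 1.20),
    "gpu.big": (80.0, 4.10),
}

def estimate(name):
    """Return (estimated hours, estimated cost in USD) for one instance type."""
    tflops, price = instances[name]
    hours = WORKLOAD_FLOP / (tflops * 1e12) / 3600.0
    return hours, hours * price

for name in sorted(instances, key=lambda n: estimate(n)[1]):
    hours, cost = estimate(name)
    print(f"{name:10s} ~{hours:7.1f} h ~${cost:7.2f}")
```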
-
(2024)In this thesis, a Retrieval-Augmented Generation (RAG) based Question Answering (QA) system is implemented. The RAG framework is composed of three components: a data storage, a retriever and a generator. To evaluate the performance of the system, a QA dataset is created from Prime Minister Orpo's Government Programme. The QA pairs are created by a human and also generated using transformer-based language models. Experiments are conducted on the created QA dataset to evaluate the performance of the different options for implementing the retriever (both traditional algorithmic methods and transformer-based language models) and the generator (transformer-based language models) components. The language model options used in the generator component are the same as those used for generating QA pairs for the QA dataset. Mean reciprocal rank (MRR) and semantic answer similarity (SAS) are used to measure the performance of the retriever and generator components, respectively. The SAS metric turns out to be useful for providing an aggregated view of the performance of the QA system, but it is not an optimal evaluation metric for every scenario identified in the results of the experiments. Inference costs of the system are also analysed, as commercial language models are included in the evaluation. Analysis of the created QA dataset shows that the language models generate questions that tend to reveal information from the underlying paragraphs, or the questions do not provide enough context, making them difficult for the QA system to answer. The human-created questions are diverse and thus more difficult to answer than the language-model-generated questions. The QA pair source affects the results: the language models used in the generator component receive on average high-scoring answers to QA pairs which they had themselves generated. In order to create a high-quality QA dataset for QA system evaluation, human effort is needed for creating the QA pairs, but prompt engineering could also provide a way to generate more usable QA pairs. Evaluation approaches for the generator component need further research in order to find alternatives that would provide an unbiased view of the performance of the QA system.
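For reference, the retriever metric mentioned above, mean reciprocal rank, has a simple definition; a minimal sketch follows (a generic implementation, not the thesis's evaluation code).

```python
# Minimal sketch: mean reciprocal rank (MRR) over a set of queries.
# For each query, the score is 1 / rank of the first relevant retrieved
# passage, or 0 if nothing relevant was retrieved. Generic code.
def mean_reciprocal_rank(ranked_results, relevant):
    """ranked_results: list of result-id lists, one per query; relevant: list of sets."""
    total = 0.0
    for results, rel in zip(ranked_results, relevant):
        for rank, doc_id in enumerate(results, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_results)

# Example: three queries; first relevant hits at ranks 1 and 3, none for the third.
print(mean_reciprocal_rank([["a", "b"], ["c", "d", "e"], ["f"]],
                           [{"a"}, {"e"}, {"z"}]))   # (1 + 1/3 + 0) / 3 ≈ 0.444
```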
-
(2024)Compression methods are widely used in modern computing. With the amount of data stored and transferred by database systems constantly increasing, the implementation of compression methods in database systems has been studied from different angles during the past four decades. This thesis studies the scientific methods used in relational database compression research. The goal of the thesis is to evaluate the methods employed and to gain an understanding of how research into the subject should be conducted. A literature review is conducted: 14 papers are identified for review, and their methodology is described and analysed. The reviewed papers are used to answer four research questions and are classified according to insights gained during the review process. There are similarities between the methods of the different papers that can be described and used as a starting point for conducting research in the field of database compression.
-
(2024)Buildings consume approximately 40% of global energy; hence, understanding and analyzing the energy consumption patterns of buildings is essential for bringing desirable insights to building management stakeholders for better decision-making and energy efficiency. Based on a specific use case of a Finnish building management company, this thesis presents the challenge of optimizing energy consumption forecasting and building management by addressing the shortcomings of current individual building-level forecasting approaches and the dynamic nature of building energy use. The research investigates the plausibility of a system of building clusters by studying the representative cluster profiles and dynamic cluster changes. We focus on a dataset comprising hourly energy consumption time series from a variety of Finnish university buildings, employing these as subjects to implement a novel stream clustering approach called ClipStream. ClipStream is an attribute-based stream clustering algorithm that performs continuous online clustering of time series data batches and involves iterative data abstraction, clustering, and change detection phases. This thesis shows that it was plausible to build clusters of buildings based on energy consumption time series. 23 buildings were successfully clustered into 3-5 clusters during each two-week window of the period of investigation. The study's findings revealed distinct and evolving energy consumption clusters of buildings and characterized 7 predominant cluster profiles, which reflected significant seasonal variations and operational changes over time. Qualitative analyses of the clusters primarily confirmed the noticeable shifts in energy consumption patterns from 2019 to 2022, underscoring the potential of our approach to enhance forecasting efficiency and management effectiveness. These findings could be further extended to inform energy policy, building management practices, and broader sustainability efforts. This suggests that improved energy efficiency can be achieved through the application of machine learning techniques such as cluster analysis.
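As a rough illustration of clustering buildings within a two-week batch of hourly data, the sketch below uses plain k-means on average daily profiles as a stand-in; it is not the ClipStream algorithm (whose data abstraction and change detection phases are not reproduced here), and the data are synthetic.

```python
# Rough illustration: cluster buildings by their average daily load profile
# within one two-week batch, using k-means as a stand-in for ClipStream.
# Synthetic data; not the thesis's dataset or algorithm.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_buildings, hours = 23, 24 * 14                      # one two-week window
consumption = rng.gamma(shape=2.0, scale=10.0, size=(n_buildings, hours))

# Feature per building: mean consumption for each hour of the day (24 values).
profiles = consumption.reshape(n_buildings, 14, 24).mean(axis=1)

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(profiles)
for c in range(4):
    members = np.where(labels == c)[0]
    print(f"cluster {c}: buildings {members.tolist()}")
```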