Browsing by Subject "Principal component analysis"

  • Malila, Saara (2024)
    The presence of 1/f type noise in a variety of natural processes and in human cognition is a well-established fact, and there are many methods for analysing it. Fractal analysis of time series data has long been limited by the inaccuracy of its results for small, finite datasets. The development of artificial intelligence and machine learning algorithms in recent years has also opened the door to modeling and forecasting phenomena of which we do not yet have a complete understanding. In this thesis, principal component analysis is used to detect 1/f noise patterns in human-played drum beats that are typical of a style of playing. In the future, this type of analysis could be used to construct drum machines that mimic the fluctuations in timing associated with a certain characteristic of human-played music, such as genre, era, or musician. This study examines the link between 1/f-noisy patterns of fluctuations in timing and the technical skill level of the musician. Samples of isolated drum tracks are collected and split into two groups representing either a low or a high level of technical skill. Time series vectors are then constructed by hand to depict the actual timing of the human-played beats. Difference vectors are created for analysis by using the least-squares method to find the corresponding "perfect" beat and subtracting it from the collected data. The resulting data illustrate the deviation of the actual playing from the beat given by a metronome. A principal component analysis algorithm is then run on the power spectra of the difference vectors to detect points of correlation within different subsets of the data, with the focus on the two groups mentioned earlier. Finally, we attempt to fit a 1/f noise model to the principal component scores of the power spectra. The results of the study support our hypothesis, but their interpretation on this scale appears subjective. 
We find that the principal component of the power spectra of the more skilled musicians' samples can be approximated by the function $S=1/f^{\alpha}$ with $\alpha\in(0,2)$, which is indicative of fractal noise. Although the less skilled group's samples as a whole do not appear to contain 1/f-noisy fluctuations, its subsets quite consistently do. The opposite is true for the first-mentioned dataset. All in all, we find that a much larger dataset is required to construct a reliable model of human error in recorded music, but with the small amount of data in this study we show that we can indeed detect and isolate the defining rhythmic characteristics of a certain style of playing drums.
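
The steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's own implementation: the function names, the synthetic onset data, and the choice of NumPy are all hypothetical, and the abstract does not specify how the principal components are mapped to the 1/f model, so the fit below is applied to a group's mean spectrum instead.

```python
import numpy as np

def timing_deviation(onsets):
    """Subtract the least-squares 'perfect' beat (a straight line in beat index)."""
    idx = np.arange(len(onsets))
    slope, intercept = np.polyfit(idx, onsets, 1)
    return onsets - (slope * idx + intercept)

def power_spectrum(x):
    """One-sided periodogram with the zero-frequency bin dropped."""
    spec = np.abs(np.fft.rfft(x - x.mean())) ** 2
    return spec[1:]

def first_principal_component(X):
    """First principal axis of the rows of X, via SVD of the centred data."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]

def fit_alpha(freqs, spectrum):
    """Estimate alpha in S = c / f^alpha by a straight-line fit in log-log space."""
    slope, _ = np.polyfit(np.log(freqs), np.log(spectrum), 1)
    return -slope

# Hypothetical data: six tracks of 128 beats, 0.5 s apart, with white-noise jitter
# (so no 1/f structure is expected here; real onset times would replace this).
rng = np.random.default_rng(0)
n_beats = 128
tracks = [0.5 * np.arange(n_beats) + 0.01 * rng.standard_normal(n_beats)
          for _ in range(6)]

spectra = np.array([power_spectrum(timing_deviation(t)) for t in tracks])
freqs = np.fft.rfftfreq(n_beats)[1:]   # cycles per beat, zero frequency dropped

pc1 = first_principal_component(spectra)
alpha = fit_alpha(freqs, spectra.mean(axis=0))
print(pc1.shape, round(alpha, 2))
```

An estimated exponent inside $(0,2)$ would correspond to the fractal-noise range mentioned in the abstract; a log-log line fit is only one of several ways to estimate it and is sensitive to short series.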
  • Flinck, Jens (2023)
    This thesis focuses on statistical topics that proved important during a research project involving quality control in chemical forensics. This includes general observations about the goals and challenges a statistician may face when working together with a researcher. The research project involved analyzing a dataset with high dimensionality compared to the sample size in order to determine whether parts of the dataset can be considered distinct from the rest. Principal component analysis and Hotelling's T^2 statistic were used to answer this research question. For this reason, the thesis introduces the ideas behind both procedures as well as the general idea behind multivariate analysis of variance. Principal component analysis is a procedure used to reduce the dimension of a sample. Hotelling's T^2 statistic, on the other hand, is a method for conducting multivariate hypothesis testing for a dataset consisting of one or two samples. One way of detecting outliers in a sample transformed with principal component analysis involves the use of Hotelling's T^2 statistic. However, using both procedures together breaks the theory behind Hotelling's T^2 statistic. As a result, the information obtained is considered more of a guideline than a hard rule for the purposes of outlier detection. To determine how the different attributes of the transformed sample influence the number of outliers detected according to Hotelling's T^2 statistic, the thesis includes a simulation experiment. The simulation experiment involves generating a large number of datasets. Each observation in a dataset contains the number of outliers according to Hotelling's T^2 statistic in a sample that is generated from a specific multivariate normal distribution and transformed with principal component analysis. 
The attributes that are used to create the transformed samples vary between the datasets, and in some datasets the samples are instead generated from two different multivariate normal distributions. The datasets are observed and compared against each other to find out how the specific attributes affect the frequencies of different numbers of outliers in a dataset, and to see how much the datasets differ when a part of the sample is generated from a different multivariate normal distribution. The results of the experiment indicate that the only attributes that directly influence the number of outliers are the sample size and the number of principal components used in the principal component analysis. The mean number of outliers divided by the sample size is smaller than the significance level used for the outlier detection and approaches the significance level when the sample size increases, implying that the procedure is consistent and conservative. In addition, when some part of the sample is generated from a different multivariate normal distribution than the rest, the frequency of outliers can potentially increase significantly. This indicates that the number of outliers according to Hotelling's T^2 statistic in a sample transformed with principal component analysis can potentially be used to confirm that some part of the sample is distinct from the rest.
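
The kind of simulation described in this abstract can be sketched as below. This is an illustrative sketch only: the function names, the sample dimensions, and in particular the F-based control limit are assumptions, since the abstract does not state which threshold the thesis uses. SciPy's `scipy.stats.f` is used for the F quantile.

```python
import numpy as np
from scipy.stats import f as f_dist

def pca_scores(X, k):
    """Scores on the first k principal components and their sample variances."""
    Xc = X - X.mean(axis=0)
    u, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = u[:, :k] * s[:k]
    variances = s[:k] ** 2 / (X.shape[0] - 1)
    return scores, variances

def hotelling_t2(scores, variances):
    """Per-observation T^2: squared scores scaled by component variances, summed."""
    return np.sum(scores ** 2 / variances, axis=1)

def t2_limit(n, k, alpha=0.05):
    """F-based limit (an assumed, commonly used choice; not taken from the thesis)."""
    return k * (n - 1) * (n + 1) / (n * (n - k)) * f_dist.ppf(1 - alpha, k, n - k)

def count_outliers(X, k, alpha=0.05):
    """Number of observations whose T^2 exceeds the limit."""
    scores, variances = pca_scores(X, k)
    t2 = hotelling_t2(scores, variances)
    return int(np.sum(t2 > t2_limit(X.shape[0], k, alpha)))

# Hypothetical setting: n = 50 observations, p = 200 variables, k = 3 components,
# every sample drawn from a single standard multivariate normal distribution.
rng = np.random.default_rng(1)
n, p, k, alpha = 50, 200, 3, 0.05
counts = [count_outliers(rng.standard_normal((n, p)), k, alpha) for _ in range(200)]
print("mean outlier fraction:", np.mean(counts) / n, "vs. alpha =", alpha)
```

To mimic the second part of the experiment, some rows of each sample could be drawn from a normal distribution with a shifted mean and the resulting outlier counts compared with those above.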