Correcting the Effects of Missing Data in Helsinki Psychotherapy Study using Multiple Imputation

dc.date.accessioned 2015-11-26T06:17:54Z und
dc.date.accessioned 2017-10-24T12:21:52Z
dc.date.available 2015-11-26T06:17:54Z und
dc.date.available 2017-10-24T12:21:52Z
dc.date.issued 2015-11-26T06:17:54Z
dc.identifier.uri http://radr.hulib.helsinki.fi/handle/10138.1/5162 und
dc.identifier.uri http://hdl.handle.net/10138.1/5162
dc.title Correcting the Effects of Missing Data in Helsinki Psychotherapy Study using Multiple Imputation en
ethesis.department.URI http://data.hulib.helsinki.fi/id/61364eb4-647a-40e2-8539-11c5c0af8dc2
ethesis.department Institutionen för matematik och statistik sv
ethesis.department Department of Mathematics and Statistics en
ethesis.department Matematiikan ja tilastotieteen laitos fi
ethesis.faculty Matematisk-naturvetenskapliga fakulteten sv
ethesis.faculty Matemaattis-luonnontieteellinen tiedekunta fi
ethesis.faculty Faculty of Science en
ethesis.faculty.URI http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca
ethesis.university.URI http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97
ethesis.university Helsingfors universitet sv
ethesis.university University of Helsinki en
ethesis.university Helsingin yliopisto fi
dct.creator Yi, Xinxin
dct.issued 2015
dct.language.ISO639-2 eng
dct.abstract Problem: The Helsinki Psychotherapy Study (HPS) is a quasi-experimental clinical trial designed to compare the effects of different treatments (i.e. psychotherapy and psychoanalysis) on patients with mood and anxiety disorders. During its 5-year follow-up from 2000 to 2005, repeated measurements were carried out at 0, 12, 24, 36, 48 and 60 months. However, some individuals missed certain data collection points or dropped out of the study permanently, which led to missing values. Missing values hinder the application of further statistical methods and violate the intention-to-treat (ITT) principle in longitudinal clinical trials (LCT). Method: Multiple imputation (MI) has many claimed advantages in handling missing values. This research compares different MI methods, i.e. Markov chain Monte Carlo (MCMC), Bayesian linear regression (BLR), predictive mean matching (PMM), regression tree (RT) and random forest (RF), in their treatment of the HPS missing data. The statistical software used is the SAS PROC MI procedure (version 9.3) and the R MICE package (version 2.9). Results: MI performs better than ad hoc methods such as listwise deletion in detecting potential relationships and in reducing potential bias in parameter estimates when the missing completely at random (MCAR) assumption is not satisfied. PMM, RT and RF perform better than BLR and MCMC in generating imputed values inside the range of the observed data. The machine learning methods, i.e. RT and RF, are preferable to the regression methods such as PMM and BLR, since the imputed data have distribution curves and other features (e.g. median, interquartile range, skewness) quite similar to those of the observed data. Implications: We suggest using MI methods in place of ad hoc methods for the treatment of missing data, if the additional effort and time are not a problem.
According to our data, the machine learning methods such as RT and RF are preferable to the relatively arbitrary user-specified regression methods such as PMM and BLR, but further research is required to confirm this indication. R is more flexible than SAS, since RT and RF can be applied in R. en
dct.language en
ethesis.language.URI http://data.hulib.helsinki.fi/id/languages/eng
ethesis.language English en
ethesis.language englanti fi
ethesis.language engelska sv
ethesis.thesistype pro gradu-avhandlingar sv
ethesis.thesistype pro gradu -tutkielmat fi
ethesis.thesistype master's thesis en
ethesis.thesistype.URI http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis
ethesis.degreeprogram Bayesian Statistics and Decision Analysis en
dct.identifier.urn URN:NBN:fi-fe2017112251653
dc.type.dcmitype Text

Files in this item

Files Size Format View
Master Thesis Yi Xinxin 014302752.pdf 15.78Mb PDF
