
Linear Models with Regularization


dc.date.accessioned 2012-10-03T18:49:27Z und
dc.date.accessioned 2017-10-24T12:22:28Z
dc.date.available 2012-10-03T18:49:27Z und
dc.date.available 2017-10-24T12:22:28Z
dc.date.issued 2012-10-03T18:49:27Z
dc.identifier.uri http://radr.hulib.helsinki.fi/handle/10138.1/1959 und
dc.identifier.uri http://hdl.handle.net/10138.1/1959
dc.title Linear Models with Regularization en
ethesis.discipline Statistics en
ethesis.discipline Tilastotiede fi
ethesis.discipline Statistik sv
ethesis.discipline.URI http://data.hulib.helsinki.fi/id/670ef0b6-2f9e-4e98-91af-a292298fb670
ethesis.department.URI http://data.hulib.helsinki.fi/id/61364eb4-647a-40e2-8539-11c5c0af8dc2
ethesis.department Institutionen för matematik och statistik sv
ethesis.department Department of Mathematics and Statistics en
ethesis.department Matematiikan ja tilastotieteen laitos fi
ethesis.faculty Matematisk-naturvetenskapliga fakulteten sv
ethesis.faculty Matemaattis-luonnontieteellinen tiedekunta fi
ethesis.faculty Faculty of Science en
ethesis.faculty.URI http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca
ethesis.university.URI http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97
ethesis.university Helsingfors universitet sv
ethesis.university University of Helsinki en
ethesis.university Helsingin yliopisto fi
dct.creator Huang, Zhiyong
dct.issued 2012
dct.language.ISO639-2 eng
dct.abstract In this master's thesis we present two important classes of regularized linear models: regularized least squares regression (LS) and regularized least absolute deviation (LAD) regression. The use of regularized regression for variable selection was pioneered by Tibshirani (1996), and his LASSO rapidly became a popular and competitive method for variable selection. The properties of LASSO have been intensively studied and different algorithms for solving it have been developed. While LASSO's success was widely acclaimed, its limitations were also noticed, and a number of alternative methods have been proposed in subsequent research. Among these methods, adaptive LASSO (Zou, 2006) and SCAD (Fan and Li, 2001) attempt to improve the efficiency of LASSO; LAD LASSO (Wang et al., 2007) is designed for non-Gaussian error distributions; ridge, the elastic net (Zou and Hastie, 2005) and the Bridge (Frank and Friedman, 1993) adopt penalties other than the L1 norm; and fused LASSO (Tibshirani et al., 2005) and grouped LASSO (Yuan and Lin, 2006) take extra constraints on the data into account. We discuss LASSO at length in the thesis. Its properties under orthogonal design, singular design and p > n design are examined, its asymptotic performance is investigated, and its limitations are carefully illustrated. Two other commonly used regularization methods for LS, ridge and the elastic net, are discussed as well. Regularized LAD is the other focus of the thesis. As a robust method, LAD, which fits the conditional median rather than the conditional mean of the response, has a bounded influence function and a high conditional breakdown point. It is therefore natural to use regularized LAD for variable selection in the presence of long-tailed errors or outliers in the response. Compared with LASSO, LAD LASSO performs robust estimation and variable selection simultaneously. We conduct a simulation study and examine two real examples of the performance of these regularized linear models. Our results show that no single estimator dominates the others in all cases: the sparsity of the true model, the distribution of the noise, the noise-to-signal ratio, the sample size and the correlation of the predictors all matter. When the noise has a normal distribution, LASSO, adaptive LASSO and the elastic net often outperform the others in prediction accuracy; adaptive LASSO is the best at variable selection, and the elastic net tends to yield less sparsity than LASSO. When the noise follows a Laplace distribution, LAD LASSO is competitive with LASSO but less efficient than adaptive LASSO. For noise with an extremely long-tailed distribution, such as the Cauchy distribution, LAD LASSO dominates the others in both prediction accuracy and variable selection. en
dct.language en
ethesis.language.URI http://data.hulib.helsinki.fi/id/languages/eng
ethesis.language English en
ethesis.language englanti fi
ethesis.language engelska sv
ethesis.thesistype pro gradu-avhandlingar sv
ethesis.thesistype pro gradu -tutkielmat fi
ethesis.thesistype master's thesis en
ethesis.thesistype.URI http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis
ethesis.degreeprogram Bayesian Statistics and Decision Analysis en
dct.identifier.urn URN:NBN:fi-fe2017112251712
dc.type.dcmitype Text
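
The abstract above contrasts LASSO (squared-error loss with an L1 penalty) and LAD LASSO (absolute-error loss with an L1 penalty). The following Python sketch is only a rough illustration of that contrast, not code from the thesis: it assumes scikit-learn 1.0 or later, uses QuantileRegressor at the median quantile as a stand-in for an LAD LASSO fit, and the penalty level alpha = 0.1 and the simulated design are arbitrary choices.

    import numpy as np
    from sklearn.linear_model import Lasso, QuantileRegressor

    rng = np.random.default_rng(0)
    n, p = 200, 10
    X = rng.normal(size=(n, p))
    beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0])
    # Long-tailed (Laplace) errors, the setting where LAD-type fits are expected to help
    y = X @ beta + rng.laplace(scale=1.0, size=n)

    # LASSO: squared-error loss plus an L1 penalty on the coefficients
    lasso = Lasso(alpha=0.1).fit(X, y)

    # LAD-LASSO-style fit: absolute-error (median regression) loss plus an L1 penalty
    lad_lasso = QuantileRegressor(quantile=0.5, alpha=0.1, solver="highs").fit(X, y)

    print("LASSO coefficients:    ", np.round(lasso.coef_, 2))
    print("LAD LASSO coefficients:", np.round(lad_lasso.coef_, 2))

In both fits the L1 penalty shrinks small coefficients toward exactly zero, which is what makes these estimators usable for variable selection; with heavier-tailed noise than Laplace (for example Cauchy), the absolute-error fit is the one expected to remain stable, in line with the pattern the abstract reports.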

Files in this item

Files Size Format
Linear_Models_with_Regularization_1002_Zhiyong.pdf 612.1Kb PDF

