Browsing by Author "Qaiser, Beenish"
Now showing items 1-1 of 1

Qaiser, Beenish (2015)
The main focus of this thesis is on the use of Newton-based optimization methods for the optimization of an objective function that is used in the estimation of unnormalized statistical models. For these models, the probability density function (pdf) is known only up to a multiplicative normalizing factor. A properly normalized pdf is essential in maximum likelihood estimation (MLE) for density estimation. An unnormalized model can be converted into a normalized one by dividing it by its integral (or sum), known as the partition function. The partition function can be computed analytically or approximated using numerical integration methods. Here, we assume that the partition function is not available in closed form, which makes MLE unsuitable for density estimation. We instead use a method known as noise-contrastive estimation (NCE) for density estimation of unnormalized models. This method does not rely on numerical integration to approximate the partition function; it estimates the normalizing constant along with the other unknown quantities of the model. The estimation is based on the optimization of a well-defined objective function, also known as the cross-entropy error function. The optimization is unconstrained, and hence powerful methods designed for unconstrained optimization can be used. Currently, a first-order optimization method known as the nonlinear conjugate gradient (CG) method is used. However, this method has been shown to converge slowly on large datasets. It is possible to use only a fraction of the input samples (data and noise) to reduce the computation time of the algorithm, a technique known as sample average approximation (SAA). However, the accuracy of the estimates is compromised when random subsets of the input samples are used to improve the computational performance of the nonlinear CG method.
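The NCE estimation described above can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: the unnormalized Gaussian model, the standard-normal noise distribution, the sample size, and the parameter names are all assumptions chosen for the sketch. The model carries a learned log-normalizer `c`, and the cross-entropy objective is minimized with SciPy's nonlinear CG method.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
T = 20000
x = rng.normal(1.0, 1.0, T)   # observed data, drawn from N(1, 1) for this example
y = rng.normal(0.0, 1.0, T)   # noise samples from the known noise pdf N(0, 1)

def log_noise(u):
    # log-pdf of the known noise distribution N(0, 1)
    return -0.5 * u**2 - 0.5 * np.log(2 * np.pi)

def J(theta):
    """NCE cross-entropy objective for an unnormalized Gaussian model.

    theta = (mu, log_lam, c): mean, log-precision, and the learned
    log-normalizing constant c, estimated jointly as in NCE.
    """
    mu, log_lam, c = theta
    lam = np.exp(log_lam)
    log_model = lambda u: -0.5 * lam * (u - mu)**2 + c  # unnormalized log-pdf + c
    Gx = log_model(x) - log_noise(x)   # log-ratio at data points
    Gy = log_model(y) - log_noise(y)   # log-ratio at noise points
    # -mean[log sigmoid(Gx)] - mean[log(1 - sigmoid(Gy))], in a stable form
    return np.mean(np.logaddexp(0.0, -Gx)) + np.mean(np.logaddexp(0.0, Gy))

# nonlinear conjugate gradient, the first-order method discussed in the text
res = minimize(J, np.zeros(3), method='CG')
mu_hat, lam_hat, c_hat = res.x[0], np.exp(res.x[1]), res.x[2]
```

For a normalized N(1, 1) density the true log-normalizer is -0.5 ln(2π) ≈ -0.919, so a successful run recovers `c_hat` close to that value along with the mean and precision.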
There exists a trade-off between the statistical accuracy of the estimates and the computational performance of the algorithm. We propose to use second-order Newton-based optimization methods, namely the line search Newton-CG and trust region Newton-CG methods. These methods produce better search directions than the nonlinear CG method because they employ both the gradient and the Hessian of the objective function. However, the Newton method requires the Hessian to be positive definite in order to make progress; we therefore use the Gauss-Newton approximation to the Hessian to avoid directions of negative curvature in case they occur. Furthermore, every iteration of the Newton method is computationally intensive, as it requires computation of the Hessian and its inverse. We integrate the Newton-CG methods with the SAA framework to provide an efficient solution: the gradient is computed using the whole set of input samples, whereas the Hessian is computed using random subsets. As a result, we are able to reduce the computation times of the Newton-CG algorithms without losing the statistical accuracy of the estimates. It is shown that the trust region strategy performs better computationally than the line search strategy. The Newton-CG methods converge faster and do not compromise the accuracy of the estimates even when random subsets consisting of only 10% of the input samples are used during the optimization. This is a considerable improvement over the currently employed nonlinear CG method.
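The SAA/Newton-CG combination above can be sketched as follows: the full sample set is used for the gradient, while a fixed random 10% subset supplies the (Gauss-Newton-style) Hessian-vector products fed to a trust-region Newton-CG solver. This is an illustrative sketch, not the thesis's code: the logistic-regression-style objective (structurally similar to the NCE cross-entropy), the data sizes, and the use of SciPy's `trust-ncg` method are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d = 5000, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def loss(w):
    # binary cross-entropy (same structural form as the NCE objective)
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)

def grad(w):
    # gradient computed on the WHOLE sample set, as proposed
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / n

# SAA idea: a fixed random 10% subset is used for curvature information only
idx = rng.choice(n, n // 10, replace=False)
Xs = X[idx]

def hessp(w, v):
    # Hessian-vector product on the subset; X' diag(p(1-p)) X is positive
    # semidefinite, mirroring the Gauss-Newton approximation's role
    p = 1.0 / (1.0 + np.exp(-(Xs @ w)))
    return Xs.T @ ((p * (1.0 - p)) * (Xs @ v)) / len(idx)

# trust-region Newton-CG: full gradient, subsampled Hessian-vector products
res = minimize(loss, np.zeros(d), jac=grad, hessp=hessp, method='trust-ncg')
```

Because each CG inner iteration touches only the subset, the per-iteration cost of forming curvature information drops roughly tenfold, while the exact full-sample gradient preserves the accuracy of the final estimate.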