Often it would be useful to be able to calculate the 'shape' of a set of points even though it is not clear what is formally meant by the word 'shape.' The definitions of shapes presented in this thesis are generalizations of the convex hull of a set of points P which is the smallest convex set that contains all points of P.
The k-hull of a point set P is a generalization of the convex hull. It is the complement of the union of all such halfplanes that contain at most k points of P. One application of the k-hull is measuring the data depth of an arbitrary point of the plane with respect to P. Another generalization of the convex hull is the α-hull of a point set P. It is the complement of the union of all α-disks (disks of radius α) that contain no points of the set P in their interior. The value of α controls the detail of the α-hull: 0-hull is P and ∞-hull is the convex hull of P. The α-shape is the 'straight-line' version of the α-hull. The α-hull and the α-shape reconstruct the shape in a more intuitive manner than the convex hull, recognizing that the set might have multiple distinct data clusters. However, α-hull and α-shape are very prone to outlier data that is often present in real-life datasets. A single outlier can drastically change the output shape. The k-order α-hull is a generalization of both the k-hull and the α-hull and as such it is a link between statistical data depth and shape reconstruction. The k-order α-hull of a point set P is the complement of the union of all such α-disks that contain at most k points of P. The k-order α-shape is the α-shape of those points of P that are not included in any of the α-disks. The k-order α-hull and the k-order α-shape can ignore a certain amount of the outlier data which the α-hull and the α-shape cannot. The detail of the shape can be controlled with the parameter α and the amount of outliers ignored with the parameter k. For example, the 0-order α-hull is the α-hull and the k-order ∞-hull is the k-hull.
One possible application of the k-order α-shape is a visual representation of spatial data. Multiple k-order α-shapes can be drawn on a map so that shapes that are deeper in the dataset (larger values of k) are drawn with more intensive colors. Example datasets for geospatial visualization in this thesis include motor vehicle collisions in New York and unplanned stops of public transportation vehicles in Helsinki.
Another application presented in this thesis is noise reduction from seismic time series using k-order α-disks. For each time tick, two disks of radius α are put above and below the data points. The upper disk is pulled downwards and the lower disk upwards until they contain exactly k data points inside. The algorithm's output for each time tick is the average of the centres of these two α-disks. With a good choice of parameters α and k, the algorithm successfully restores a good estimate of the original noiseless time series.