Feature extraction techniques eliminate redundant features, reducing the dimensionality of the dataset, and can also discover hidden “latent variables”. Linear feature extraction techniques generally amount to optimal projections onto a chosen number of features, optimal in the sense that they minimize the mean squared error (MSE) of reconstructing the original data. Nonlinear feature extraction techniques can unravel the variables behind complex datasets and can be used for “manifold learning”, a process that resembles the concept of abstraction in thought. Feature extraction techniques are generally considered unsupervised, since they work on the data alone and need no labels.
Linear: I have a dataset with an age field and a date of birth field and I'm looking to reduce the redundancy.
Nonlinear: I have a set of teapot images that are rotated at different angles. I'd like to compare teapot images as a whole by their angle of rotation instead of comparing each pixel.
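As a rough illustration of the linear case above, the sketch below applies PCA (computed through the SVD of the centered data) to a made-up table with a birth-year column and an age column. The values, the reference year, and all variable names are assumptions for illustration, and the date of birth is simplified to a year so that age is an exact linear function of it.

```python
import numpy as np

# Hypothetical records: the reference year and birth years are made up.
REFERENCE_YEAR = 2024
birth_year = np.array([1950, 1962, 1975, 1983, 1990, 2001], dtype=float)
age = REFERENCE_YEAR - birth_year            # fully redundant with birth_year
X = np.column_stack([birth_year, age])       # shape (6 records, 2 features)

# PCA via SVD: center the data, then decompose it.
mean = X.mean(axis=0)
Xc = X - mean
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Share of the total variance carried by each principal component.
explained = S**2 / np.sum(S**2)

# Keep only the first component: one number per record instead of two.
scores = Xc @ Vt[0]

# Project back to the original two fields to check what was lost.
X_rec = np.outer(scores, Vt[0]) + mean

print("explained variance ratio:", explained)
print("max reconstruction error:", np.abs(X_rec - X).max())
```

Because the two columns are perfectly linearly related, the first component carries essentially all of the variance, and the single score per record is enough to reconstruct both original fields.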
Principal Component Analysis, Singular Value Decomposition, Latent Semantic Indexing, Independent Component Analysis, Kernel PCA, Locally Linear Embedding, Isomap, Maximum Variance Unfolding.
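Of the nonlinear methods listed, Isomap is fairly short to sketch: build a nearest-neighbour graph, take shortest-path lengths in that graph as approximate distances along the manifold, and embed those distances with classical multidimensional scaling. The code below is a minimal version of that idea, not a reference implementation. The "images" are a toy stand-in for the rotated teapot pictures: eight hypothetical pixel intensities that vary smoothly with the rotation angle. The function name `isomap_1d`, the neighbour count, and the data are all assumptions.

```python
import numpy as np

def isomap_1d(X, n_neighbors=2):
    """Minimal Isomap sketch returning one coordinate per sample."""
    n = X.shape[0]
    # Pairwise Euclidean distances between samples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Neighbourhood graph: keep edges to the nearest neighbours only.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    for i in range(n):
        graph[i, nearest[i]] = d[i, nearest[i]]
    graph = np.minimum(graph, graph.T)  # make the graph undirected

    # Floyd-Warshall: shortest-path lengths approximate geodesic distances.
    for k in range(n):
        graph = np.minimum(graph, graph[:, [k]] + graph[[k], :])

    # Classical MDS on the geodesic distances: double-center the squared
    # distances and keep the leading eigenvector.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (graph ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    return eigvecs[:, -1] * np.sqrt(eigvals[-1])

# Toy "images": 60 rotation angles, 8 hypothetical pixels per image.
angles = np.linspace(0.0, 1.5 * np.pi, 60)
phases = 2.0 * np.pi * np.arange(8) / 8
X = np.cos(angles[:, None] - phases[None, :])

coords = isomap_1d(X)
corr = abs(np.corrcoef(coords, angles)[0, 1])
print("correlation between recovered coordinate and angle:", corr)
```

On this toy data the single recovered coordinate tracks the rotation angle almost linearly (up to sign), so images can be compared by that one number instead of pixel by pixel. The sketch assumes the neighbourhood graph is connected; on real images the neighbour count would need tuning.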
A common measure of quality is the proportion of the variance captured by the retained principal components, weighed against the resulting dimensionality. For example, a "good" reduction may be a reduction of a 50-dimensional dataset to 3 dimensions that capture 90% of the variance of the original, since the reduction is large and the loss of information is minimal. This proportion is commonly expressed in terms of eigenvalues, since the variance along each principal component equals the corresponding eigenvalue of the data's covariance matrix, and it may be displayed graphically in a plot called an eigenspectrum.
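The calculation behind this measure can be sketched in a few lines: sort the eigenvalues, accumulate them, and divide by their total. The eigenvalues below are invented for illustration, and the function names are hypothetical.

```python
from itertools import accumulate

def cumulative_variance_ratio(eigenvalues):
    """Fraction of total variance captured by the top 1, 2, ... components."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [running / total for running in accumulate(ordered)]

def components_needed(eigenvalues, target=0.90):
    """Smallest number of components whose cumulative ratio reaches target."""
    for k, ratio in enumerate(cumulative_variance_ratio(eigenvalues), start=1):
        if ratio >= target:
            return k
    return len(eigenvalues)

# Hypothetical eigenspectrum of a covariance matrix (made-up values).
eigenvalues = [6.0, 2.0, 1.25, 0.5, 0.25]

ratios = cumulative_variance_ratio(eigenvalues)
k = components_needed(eigenvalues, target=0.90)
print("cumulative variance captured:", ratios)
print("components needed for 90%:", k)
```

Plotting the sorted eigenvalues, or the cumulative ratios, against the component index gives the eigenspectrum described above.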