Dimensionality Reduction Algorithms
Feature reduction, or dimensionality reduction, is a crucial step in machine learning when a dataset contains many variables (features) for each record, especially when the number of variables is large relative to the number of records. Feature extraction transforms the raw data into new features; that is, the original feature space is mapped onto a new, reduced feature space: a dataset with fewer variables, which may be generated as combinations of the original variables. The reduced space can be created by selecting a subset of the original dimensions, by constructing new dimensions, or by selecting from both new and original dimensions. Reduction can be performed with several methods, such as filtering on variance or correlation, random forests (ranking features by importance), Principal Component Analysis (PCA), and backward feature elimination or forward feature construction (guided by minimising the model training error).
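The difference between selecting original variables and constructing new ones can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes NumPy is available, the data are synthetic, and the helper names `variance_filter` and `pca_transform` are hypothetical names chosen for this example.

```python
import numpy as np


def variance_filter(X, threshold):
    """Feature selection: keep the columns whose variance exceeds `threshold`."""
    keep = X.var(axis=0) > threshold
    return X[:, keep], keep


def pca_transform(X, n_components):
    """Feature extraction: project centred data onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_ratio = S**2 / np.sum(S**2)
    return X_centered @ Vt[:n_components].T, explained_ratio[:n_components]


# Synthetic data (an assumption for illustration): 10 observed variables that are
# noisy mixtures of 2 hidden factors, plus one constant, uninformative column.
rng = np.random.default_rng(0)
n_samples = 50
latent = rng.normal(size=(n_samples, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(n_samples, 10))
X = np.hstack([X, np.full((n_samples, 1), 3.0)])

X_selected, kept = variance_filter(X, threshold=1e-8)   # drops the constant column
X_reduced, ratio = pca_transform(X_selected, n_components=2)

print(X.shape, X_selected.shape, X_reduced.shape)
print("variance explained by 2 components:", ratio.sum())
```

The variance filter returns a subset of the original columns, so the remaining variables keep their meaning. PCA instead returns new variables that are linear combinations of the originals, which usually gives a more compact representation but is harder to interpret.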
Applications
Take PCA as an example:
- Visualize microarray gene expression data ref.
- Discover biomarkers from high-throughput biological data ref.
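As a rough sketch of the visualization use case, the snippet below projects a matrix with far more variables than samples onto its first two principal components. The matrix is a synthetic stand-in, not real microarray data: the sizes, the two sample groups, and the size of the group effect are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np

rng = np.random.default_rng(42)
n_per_group, n_genes = 10, 500

# Synthetic stand-in for an expression matrix (samples x genes); not real data.
expression = rng.normal(size=(2 * n_per_group, n_genes))
expression[n_per_group:, :50] += 3.0   # hypothetical group effect on 50 genes
labels = np.array([0] * n_per_group + [1] * n_per_group)

# PCA via SVD of the column-centred matrix.
centered = expression - expression.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U[:, :2] * S[:2]   # coordinates of each sample on PC1 and PC2

# Each row of `scores` is one sample reduced from 500 values to 2,
# ready to be drawn as a point in a scatter plot.
pc1 = scores[:, 0]
for group in (0, 1):
    print("group", group, "mean PC1 score:", pc1[labels == group].mean())
```

The two columns of `scores` can be passed to any plotting library as x and y coordinates. The sign of a principal component is arbitrary, so which group ends up on the positive side of PC1 carries no meaning.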
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.