Feature selection and extraction:
The goal of this process is to identify and remove as much irrelevant and redundant information as possible, and also to engineer new features. This reduces the dimensionality of the data, improves its discriminative ability, and allows faster training.
Some feature selection and engineering techniques:
Dimensionality reduction algorithms: These algorithms map the data onto a small number of components (features) that preserve as much of the variability between observations as possible. PCA, for example, keeps the directions of greatest variance. Other algorithms in this family are t-SNE, UMAP and autoencoders.
Persistent homology: A method based on algebraic topology. Persistent homology can discover features that capture the global geometry of the dataset, which can improve a model's generalization.
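As a rough illustration of dimensionality reduction, the sketch below applies PCA, one of the algorithms named above, to a small synthetic dataset. The function name `pca_reduce`, the synthetic data, and the choice of computing PCA through NumPy's singular value decomposition are assumptions made for this example only, not part of the original text.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components.

    Hypothetical helper for illustration; returns the projected data and
    the fraction of total variance kept by each retained component.
    """
    # PCA requires centered data: subtract the mean of each feature.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered from largest to smallest variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # The squared singular values are proportional to the variance per direction.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, ratio

# Synthetic data (assumed for the example): three features, where the third
# is a nearly redundant copy of the first.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])

X_reduced, ratio = pca_reduce(X, n_components=2)
print(X_reduced.shape)  # (200, 2)
print(ratio.sum())      # fraction of the total variance kept by two components
```

Because the third feature carries almost no information beyond the first, two components retain nearly all of the variance here. On real data the retained fraction depends on how correlated the features are, so it is worth inspecting before choosing the number of components.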
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.